Bug #2462

RCP: 0.7.9, XML import modules broken on </form> closing tag

Added by Serge Heiden about 1 year ago. Updated about 1 year ago.

Status: New
Start date: 10/16/2018
Priority: Normal
Due date:
Assignee: -
% Done: 80%
Category: Import
Spent time: -
Target version: TXM 0.8.0

Description

A </form> closing tag in the source breaks the importer's <txm:form> handling: after it, word properties are shifted onto the following word.

Example:

<form>
...
    <orth>
        <hi rend="b">
            <w id="w_EdmondBéquetArtDramatique_703">
                <txm:form>ABANDON</txm:form>
                <txm:ana resp="none" type="#n">703</txm:ana>
                <txm:ana resp="#txm" type="#frpos">NAM</txm:ana>
                <txm:ana resp="#txm" type="#frlemma">ABANDON</txm:ana>
            </w>
        </hi>
    </orth>

    <w id="w_EdmondBéquetArtDramatique_704">
        <txm:form>.</txm:form>
        <txm:ana resp="none" type="#n">704</txm:ana>
        <txm:ana resp="#txm" type="#frpos">SENT</txm:ana>
        <txm:ana resp="#txm" type="#frlemma">.</txm:ana>
    </w>

</form>

<def>
    <w id="w_EdmondBéquetArtDramatique_705">
        <txm:form>Action</txm:form>
        <txm:ana resp="none" type="#n">705</txm:ana>
        <txm:ana resp="#txm" type="#frpos">SENT</txm:ana>
        <txm:ana resp="#txm" type="#frlemma">.</txm:ana>
    </w>
    <w id="w_EdmondBéquetArtDramatique_706">
        <txm:form>de</txm:form>
        <txm:ana resp="none" type="#n">706</txm:ana>
        <txm:ana resp="#txm" type="#frpos">NOM</txm:ana>
        <txm:ana resp="#txm" type="#frlemma">action</txm:ana>
    </w>
...
</def>

Result:
  • the 'Action' form gets the '.' lemma;
  • the 'de' form gets the 'action' lemma;
  • ...

Solution 1

Check that the </form> tag is inside a <w> element before treating it as the end of a word form.
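
A minimal sketch of this check, in Python rather than in the importer's own code: the handler class, the local() helper and the sample document are hypothetical, and the handler deliberately ignores prefixes to mimic a prefix-blind importer. It only reacts to a form element while a <w> is open.

```python
import xml.sax


class WordFormHandler(xml.sax.ContentHandler):
    """Collect word forms, ignoring any form element found outside <w>."""

    def __init__(self):
        super().__init__()
        self.w_depth = 0      # > 0 while inside a <w> element
        self.in_form = False  # True while inside a form element within <w>
        self.buffer = []
        self.forms = []

    @staticmethod
    def local(name):
        # Hypothetical helper: drop the prefix, so "txm:form" and "form" collide.
        return name.split(":")[-1]

    def startElement(self, name, attrs):
        if self.local(name) == "w":
            self.w_depth += 1
        elif self.local(name) == "form" and self.w_depth > 0:
            self.in_form = True
            self.buffer = []

    def characters(self, content):
        if self.in_form:
            self.buffer.append(content)

    def endElement(self, name):
        if self.local(name) == "w":
            self.w_depth -= 1
        elif self.local(name) == "form" and self.w_depth > 0 and self.in_form:
            # Only a form closed inside <w> ends a word form.
            self.forms.append("".join(self.buffer))
            self.in_form = False


# Reduced, hypothetical version of the example above.
SAMPLE = """<entry>
<form>
  <orth><w id="w_1"><txm:form>ABANDON</txm:form></w></orth>
  <w id="w_2"><txm:form>.</txm:form></w>
</form>
<def>
  <w id="w_3"><txm:form>Action</txm:form></w>
</def>
</entry>"""


def extract_forms(xml_text):
    handler = WordFormHandler()
    xml.sax.parseString(xml_text.encode("utf-8"), handler)
    return handler.forms
```

With this guard the outer <form> ... </form> pair is skipped, so extract_forms(SAMPLE) keeps one form per <w> in document order.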

Solution 2

Use XML namespaces to distinguish the source's own <form> element from <txm:form>.

If necessary, build a whole namespace environment for XML sources that don't declare namespaces explicitly or formally.
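
A minimal sketch of namespace-based matching, again in Python and not taken from the importer: the sample document, both function names and the URI bound to the txm: prefix are assumptions made for illustration. Once the names are expanded to (namespace, local name), the two kinds of form element can no longer be confused.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
TXM_NS = "http://textometrie.org/1.0"  # assumed URI for the txm: prefix

# Hypothetical source that declares both namespaces explicitly.
SAMPLE = f"""<entry xmlns="{TEI_NS}" xmlns:txm="{TXM_NS}">
  <form>
    <w id="w_1"><txm:form>ABANDON</txm:form></w>
  </form>
  <def>
    <w id="w_2"><txm:form>Action</txm:form></w>
  </def>
</entry>"""


def word_forms(xml_text):
    """Return the text of every form element in the TXM namespace."""
    root = ET.fromstring(xml_text)
    # ElementTree expands names to "{uri}local", so txm:form and TEI form differ.
    return [el.text for el in root.iter(f"{{{TXM_NS}}}form")]


def count_source_forms(xml_text):
    """Count form elements that belong to the source (TEI) namespace."""
    root = ET.fromstring(xml_text)
    return len(list(root.iter(f"{{{TEI_NS}}}form")))
```

For sources that declare no namespaces, the same matching would only work after the import step binds the unprefixed elements to a namespace of its own choosing, which is the "namespace environment" mentioned above.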

History

#1 Updated by Matthieu Decorde about 1 year ago

  • % Done changed from 0 to 80
