Bug #2462
RCP: 0.7.9, XML import modules broken on </form> closing tag
Status: | New | Start date: | 10/16/2018 |
---|---|---|---|
Priority: | Normal | Due date: | |
Assignee: | - | % Done: | 80% |
Category: | Import | Spent time: | - |
Target version: | TXM 0.8.0 | | |
Description
A </form> closing tag in the source breaks the handling of <txm:form> elements: word properties get shifted from one word to the next.
Example:
<form>
  ...
  <orth>
    <hi rend="b">
      <w id="w_EdmondBéquetArtDramatique_703">
        <txm:form>ABANDON</txm:form>
        <txm:ana resp="none" type="#n">703</txm:ana>
        <txm:ana resp="#txm" type="#frpos">NAM</txm:ana>
        <txm:ana resp="#txm" type="#frlemma">ABANDON</txm:ana>
      </w>
    </hi>
  </orth>
  <w id="w_EdmondBéquetArtDramatique_704">
    <txm:form>.</txm:form>
    <txm:ana resp="none" type="#n">704</txm:ana>
    <txm:ana resp="#txm" type="#frpos">SENT</txm:ana>
    <txm:ana resp="#txm" type="#frlemma">.</txm:ana>
  </w>
</form>
<def>
  <w id="w_EdmondBéquetArtDramatique_705">
    <txm:form>Action</txm:form>
    <txm:ana resp="none" type="#n">705</txm:ana>
    <txm:ana resp="#txm" type="#frpos">SENT</txm:ana>
    <txm:ana resp="#txm" type="#frlemma">.</txm:ana>
  </w>
  <w id="w_EdmondBéquetArtDramatique_706">
    <txm:form>de</txm:form>
    <txm:ana resp="none" type="#n">706</txm:ana>
    <txm:ana resp="#txm" type="#frpos">NOM</txm:ana>
    <txm:ana resp="#txm" type="#frlemma">action</txm:ana>
  </w>
  ...
</def>
->
- the 'Action' form gets the '.' lemma;
- the 'de' form gets the 'action' lemma.
- ...
Solution 1
Only treat a </form> closing tag as the end of a word form when it occurs inside a <w> element.
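As an illustration only, the check could look like the following sketch. It is not the TXM importer code: the handler class, the prefix-stripping helper, the sample document and the `urn:example:txm` URI are all assumptions made for the example. The handler strips prefixes (so `txm:form` and `form` share the local name `form`) and reacts to a `form` tag only while it is inside a `<w>` element.

```python
import xml.sax


class WordFormHandler(xml.sax.ContentHandler):
    """Collect word forms, reacting to 'form' tags only inside <w>."""

    def __init__(self):
        super().__init__()
        self.in_w = False     # True between <w> and </w>
        self.in_form = False  # True while reading a word form inside <w>
        self.buffer = []
        self.forms = []

    @staticmethod
    def local(name):
        # Hypothetical prefix stripping: 'txm:form' -> 'form'
        return name.rsplit(":", 1)[-1]

    def startElement(self, name, attrs):
        local = self.local(name)
        if local == "w":
            self.in_w = True
        elif local == "form" and self.in_w:
            self.in_form = True
            self.buffer = []

    def characters(self, content):
        if self.in_form:
            self.buffer.append(content)

    def endElement(self, name):
        local = self.local(name)
        if local == "w":
            self.in_w = False
        elif local == "form" and self.in_w:
            # Only a 'form' closed inside <w> ends a word form.
            self.forms.append("".join(self.buffer))
            self.in_form = False
        # A </form> outside <w> is ignored.


# Made-up sample; the namespace URI is a placeholder.
SAMPLE = (
    '<entry xmlns:txm="urn:example:txm">'
    "<form><w><txm:form>ABANDON</txm:form></w>"
    "<w><txm:form>.</txm:form></w></form>"
    "<def><w><txm:form>Action</txm:form></w></def>"
    "</entry>"
)

handler = WordFormHandler()
xml.sax.parseString(SAMPLE.encode("utf-8"), handler)
forms = handler.forms
```

With this guard the outer `<form>` ... `</form>` pair never touches the word-form state, so each collected form stays aligned with its own `<w>`.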
Solution 2
Use XML namespaces to disambiguate elements that share a local name, such as <txm:form> and the <form> of the source.
If necessary, build a complete namespace environment for XML sources that do not declare their namespaces explicitly.
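A minimal sketch of the namespace-based approach, assuming the source declares its namespaces. The two URIs and the sample document are placeholders invented for the example, not the real TXM or source namespaces. With a namespace-aware parser, the two `form` elements get different qualified names and cannot be confused.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URIs, assumed for this sketch only.
TXM_NS = "urn:example:txm"
SRC_NS = "urn:example:source"

SAMPLE = f"""
<entry xmlns="{SRC_NS}" xmlns:txm="{TXM_NS}">
  <form>
    <w id="w_1"><txm:form>ABANDON</txm:form></w>
    <w id="w_2"><txm:form>.</txm:form></w>
  </form>
  <def>
    <w id="w_3"><txm:form>Action</txm:form></w>
  </def>
</entry>
"""

root = ET.fromstring(SAMPLE)

# ElementTree exposes qualified names as '{uri}local', which keeps
# the TXM word form apart from the source's own <form> element.
txm_forms = [el.text for el in root.iter(f"{{{TXM_NS}}}form")]
source_forms = list(root.iter(f"{{{SRC_NS}}}form"))
```

Sources that do not declare their namespaces would first need the declarations added, which this sketch does not cover.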
History
#1 Updated by Matthieu Decorde about 5 years ago
- % Done changed from 0 to 80