Bug #1488
TBX: 0.7.7, allow XML special characters entities in word or structure properties in all XML based import modules
| Status: | New | Start date: | 09/10/2015 |
|---|---|---|---|
| Priority: | Normal | Due date: | |
| Assignee: | - | % Done: | 70% |
| Category: | Import | Spent time: | - |
| Target version: | TXM 0.7.8 | | |
Description
Currently, the Compiler translates any XML entity found in XML attribute values (such as '&lt;', '&quot;' or '&amp;') into its native character ('<', '"' or '&'), which breaks the CWB syntax.
Note: those entities are built from the sources by previous steps of the import process. If the entities are already present in the sources, they pass to the CWB format without translation and the import is OK.
Solution
- a) keep the entities in the CWB output
- b) change CWB syntax to allow < (change
- MD: the "-x" option is already set
- MD: all TXM 0.7.8 import modules now write "&amp;"s and "&lt;"s
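Option a) can be illustrated with a minimal sketch. This is not the TXM Compiler code: the helper names `to_cwb_value` and `structure_tag` are hypothetical, and the tag layout is only an assumption about what an XML-like structure line in the CWB input looks like. The sketch assumes the value it receives holds native characters, so an already encoded value would be encoded twice.

```python
from xml.sax.saxutils import escape

# escape() already handles '&', '<' and '>'; the double quote is added
# here because the value is written inside a double-quoted attribute.
_EXTRA_ENTITIES = {'"': "&quot;"}


def to_cwb_value(native: str) -> str:
    """Re-encode XML special characters as entities (hypothetical helper).

    Assumes `native` contains raw characters, not entities.
    """
    return escape(native, _EXTRA_ENTITIES)


def structure_tag(name: str, attrs: dict) -> str:
    """Build an XML-like opening tag with entity-encoded attribute values.

    The tag layout is an illustrative assumption, not the TXM format.
    """
    parts = " ".join(f'{key}="{to_cwb_value(value)}"' for key, value in attrs.items())
    return f"<{name} {parts}>" if parts else f"<{name}>"


if __name__ == "__main__":
    print(structure_tag("text", {"title": 'A < B & "C"'}))
```

Because `escape()` replaces '&' first, the '&' characters of the entities it generates are not encoded a second time within a single call.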
History
#1 Updated by Matthieu Decorde over 5 years ago
- Target version changed from TXM 0.7.8 to TXM 0.8.0a (split/restructuration)
#2 Updated by Serge Heiden over 5 years ago
- Description updated (diff)
#3 Updated by Serge Heiden over 5 years ago
- Target version changed from TXM 0.8.0a (split/restructuration) to TXM 0.7.8
#4 Updated by Serge Heiden over 5 years ago
- Subject changed from TBX: 0.7.7, allow < character in word or structure properties in all XML based import modules to TBX: 0.7.7, allow XML special characters entities in word or structure properties in all XML based import modules
- Description updated (diff)
#5 Updated by Matthieu Decorde about 5 years ago
- Description updated (diff)
- % Done changed from 0 to 80
#6 Updated by Serge Heiden about 5 years ago
What about '"' ?
- it is an XML reserved character
- it is sensible to encode XML element attributes
#7 Updated by Serge Heiden almost 5 years ago
- % Done changed from 80 to 70