Bug #1488

Updated by Matthieu Decorde over 3 years ago

Currently, the Compiler translates any XML entities found in XML attribute values, such as '&lt;', '&quot;' or '&amp;', into the corresponding literal characters ('<', '"' or '&'), which breaks the CWB syntax.

Note: these entities are produced from the sources by earlier steps of the import process. If the entities are already present in the sources, they pass through to the CWB format untranslated and the import succeeds.

h3. Solution

* a) keep the entities in the CWB output
* b) change CWB syntax to allow < (change cwb-encode)
** *MD*: the "-x" option is already set
** *MD*: all TXM 0.7.8 import modules now write "&amp;" and "&lt;" entities
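
Option a) could look roughly like the sketch below. It is only an illustration, not the actual TXM Compiler code: the function names @escape_attribute_value@ and @structural_tag@ are hypothetical, and the tag layout is an assumption about how a structural tag line is written before it is handed to cwb-encode. It also assumes the attribute values have already been decoded to literal characters; values that still contain entities would be escaped twice.

```python
from xml.sax.saxutils import escape


def escape_attribute_value(value):
    """Re-escape the characters that would break an XML-style tag line.

    Hypothetical helper: assumes `value` holds literal characters,
    not entities, so nothing gets escaped twice.
    """
    # escape() replaces &, < and > by default; the extra mapping
    # adds the double quote, which delimits the attribute value.
    return escape(value, {'"': "&quot;"})


def structural_tag(name, attributes):
    """Build an opening tag line with escaped attribute values.

    The exact line layout is an assumption made for this sketch.
    """
    rendered = " ".join(
        '{}="{}"'.format(key, escape_attribute_value(val))
        for key, val in attributes.items()
    )
    return "<{} {}>".format(name, rendered)


if __name__ == "__main__":
    print(structural_tag("text", {"title": 'a < b & "c"'}))
```

With this approach the entities stay in the CWB output, so a value containing '<', '"' or '&' can no longer close the tag or the attribute early.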
