Bug #3243
XML import modules: force conversion of non-ASCII characters in element and attribute names before the CQP format is generated
Status: | New | Start date: | 22/04/2022 |
---|---|---|---|
Priority: | Normal | Due date: | |
Assignee: | - | % Done: | 0% |
Category: | Import | Spent time: | - |
Target version: | TXM 0.8.4 | | |
Description
Currently, if an XML element name contains an accented character, CQP fails when the corpus Properties command reads the values of the corresponding structure attribute.
For example, source:
<répondant>
We get:
** Échec de lecture des valeurs de la propriété de structure répondant_n : org.txm.searchengine.cqp.clientExceptions.CqiClientException: org.txm.searchengine.cqp.serverException.CqiClErrorInternal:
Stacktrace:
[1] org.txm.searchengine.cqp.corpus.StructuralUnitProperty.getValues StructuralUnitProperty.java, 139
[2] org.txm.properties.core.functions.CorpusPropertiesComputer.stepStructuralUnits CorpusPropertiesComputer.java, 199
[3] org.txm.properties.core.functions.CorpusPropertiesComputer._compute CorpusPropertiesComputer.java, 464
Solution
Force normalization of XML element and attribute names with a Unicode character conversion method, so that only ASCII names reach CQP.
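One possible conversion method is sketched below, as an illustration only. It assumes that stripping diacritics is acceptable for the corpus, and it relies on the standard `java.text.Normalizer` class. The class name `XmlNameNormalizer` and the method `toAsciiName` are hypothetical and are not part of the TXM code base. The choice of `_` as a fallback for characters that have no ASCII base letter is also an assumption.

```java
import java.text.Normalizer;

// Hypothetical helper, not part of TXM: converts an XML element or
// attribute name to an ASCII-only name before it is handed to CQP.
public final class XmlNameNormalizer {

    private XmlNameNormalizer() {
    }

    public static String toAsciiName(String name) {
        // NFD decomposition splits an accented letter into its base letter
        // followed by combining marks, e.g. U+00E9 becomes "e" + U+0301.
        String decomposed = Normalizer.normalize(name, Normalizer.Form.NFD);
        // Remove the combining marks and keep the base letters.
        String stripped = decomposed.replaceAll("\\p{M}+", "");
        // Assumption: any character that is still non-ASCII (for example
        // U+0153, which has no decomposition) is replaced with "_".
        return stripped.replaceAll("[^\\x00-\\x7F]", "_");
    }

    public static void main(String[] args) {
        // "r\u00e9pondant" is the element name from the example above.
        System.out.println(toAsciiName("r\u00e9pondant")); // prints "repondant"
    }
}
```

Two distinct source names can map to the same ASCII name with this approach (for example, an accented and an unaccented spelling of the same word), so an import module using it would probably need to detect and report such collisions.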