Feature #2042
Updated by Serge Heiden more than 8 years ago
Currently, in the "Import a Glozz corpus..." "Analec corpus import" command, the tokenization of the TXT+CSV import module seems sufficient for Analec-annotated raw texts. For XML-TRS files, or parts of files, however, the tokenization of the XML-TRS import module does not work.
h3. Solution
Add a command to align Analec annotations (anchored on character positions) with the Analec annotations of a TXM corpus (anchored on word positions).
The algorithm must deal with missing or added characters in the base text character flow.
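As an illustration only, the alignment could be sketched as follows. This is not TXM or Analec code: the function names @build_char_map@ and @align_annotation@ are hypothetical, and the sketch assumes the Analec base text is available as a string and the TXM corpus as a list of word forms. It uses Python's standard @difflib.SequenceMatcher@ as a stand-in for whatever alignment method is eventually chosen; characters present on only one side are simply left unmatched.

```python
from difflib import SequenceMatcher


def build_char_map(source_text, words):
    """Map each character offset of source_text to a word index, or None.

    source_text: base text that the character-anchored annotations refer to.
    words: word forms of the tokenized corpus, in corpus order.
    """
    # Concatenate the word forms and remember which word owns each character.
    owner = []
    for index, word in enumerate(words):
        owner.extend([index] * len(word))
    target = "".join(words)

    char_to_word = [None] * len(source_text)
    # autojunk=False keeps frequent characters from being ignored in long texts.
    matcher = SequenceMatcher(None, source_text, target, autojunk=False)
    for a, b, size in matcher.get_matching_blocks():
        for k in range(size):
            char_to_word[a + k] = owner[b + k]
    # Characters missing from the corpus (whitespace, dropped markup) stay None.
    return char_to_word


def align_annotation(char_to_word, start, end):
    """Convert a character span [start, end) to an inclusive (first, last)
    word span, or None when no character of the span could be aligned."""
    hits = [w for w in char_to_word[start:end] if w is not None]
    if not hits:
        return None
    return min(hits), max(hits)


if __name__ == "__main__":
    # "[laughs]" exists in the base text but not in the tokenized corpus.
    text = "The cat [laughs] sat on the mat."
    words = ["The", "cat", "sat", "on", "the", "mat", "."]
    mapping = build_char_map(text, words)
    start = text.index("the mat")
    print(align_annotation(mapping, start, start + len("the mat")))
```

Matching on the longest common blocks is a heuristic: it copes with inserted or deleted runs of characters, but it may misalign short annotations in highly repetitive text, so a real implementation would probably need to report unaligned or doubtful spans to the user.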