Feature #2042

Updated by Serge Heiden over 3 years ago

Currently, the tokenization of the TXT+CSV import module seems sufficient for Analec-annotated raw texts. For XML-TRS files, or parts of such files, however, the tokenization of the XML-TRS import module doesn't work.

h3. Solution

Add a command to align Analec annotations (character positions) to the Analec annotations of a TXM corpus (word positions).

The algorithm must deal with missing or added characters in the base text character flow.
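
A minimal sketch of one possible approach, not the TXM implementation: the Analec base text is diffed against the corpus text rebuilt from its words, and each annotated character is mapped to the word that contains its counterpart. The function names @build_char_map@ and @align_span@ are hypothetical, the corpus text is assumed to be the words joined by single spaces, and Python's @difflib.SequenceMatcher@ stands in for whatever alignment algorithm is eventually chosen.

```python
from bisect import bisect_right
from difflib import SequenceMatcher


def build_char_map(source_text, corpus_text):
    """Map each source character offset to a corpus offset (None if unmatched)."""
    mapping = [None] * len(source_text)
    matcher = SequenceMatcher(None, source_text, corpus_text, autojunk=False)
    for block in matcher.get_matching_blocks():
        for k in range(block.size):
            mapping[block.a + k] = block.b + k
    return mapping


def align_span(start, end, source_text, words):
    """Convert a [start, end) character span of source_text into an
    inclusive (first_word, last_word) span, or None if nothing matches."""
    # Assumption: the corpus text is the words joined by single spaces.
    word_starts = []
    pos = 0
    for word in words:
        word_starts.append(pos)
        pos += len(word) + 1
    corpus_text = " ".join(words)

    mapping = build_char_map(source_text, corpus_text)
    hits = set()
    for i in range(start, end):
        j = mapping[i]
        if j is None:
            continue  # character missing from the corpus text
        w = bisect_right(word_starts, j) - 1
        if j < word_starts[w] + len(words[w]):  # ignore separator spaces
            hits.add(w)
    return (min(hits), max(hits)) if hits else None


# The source has an extra space; the corpus separates punctuation.
source = "Le chat  dort, vraiment."
words = ["Le", "chat", "dort", ",", "vraiment", "."]
print(align_span(3, 7, source, words))   # span over "chat"
print(align_span(9, 23, source, words))  # span over "dort, vraiment"
```

Characters that exist only in the source are skipped, and characters that exist only in the corpus are never targeted, so a span still resolves to words as long as at least one of its characters is matched. A span with no matched character returns @None@ and would have to be reported to the user.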