Feature #2042

RCP: X.X, Import annotations from Glozz corpus command

Added by Matthieu Decorde over 2 years ago. Updated over 1 year ago.

Status: New
Start date: 02/16/2017
Priority: Normal
Due date:
Assignee: -
% Done: 0%
Category: Annotation
Spent time: -
Target version: TXM Democrat 4.0

Description

Currently, in the "Import a Glozz corpus..." command, the tokenization done by the TXT+CSV import module seems sufficient for raw texts annotated with Analec. But for XML-TRS source files, or parts of such files, the tokenization done by the XML-TRS import module doesn't work.

Solution

Add an "Import annotations from Glozz..." command that aligns the Analec annotations (character positions) with the Analec annotations of a TXM corpus (word positions).

The algorithm must deal with missing or added characters in the base text character flow.
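
A minimal sketch of one possible approach, not the TXM implementation: align the Glozz base text with the text rebuilt from the corpus words using Python's standard `difflib.SequenceMatcher`, then convert each character span into a span of word indexes. The function names (`build_char_map`, `word_spans`, `align_annotation`) and the assumption that the corpus text can be rebuilt by joining its words with single spaces are hypothetical and chosen only for illustration.

```python
from difflib import SequenceMatcher


def build_char_map(source_text, corpus_text):
    """Map each source character offset to a corpus offset, or None if unmatched."""
    char_map = [None] * len(source_text)
    # autojunk=False keeps frequent characters (e.g. spaces) usable as anchors.
    matcher = SequenceMatcher(None, source_text, corpus_text, autojunk=False)
    for a, b, size in matcher.get_matching_blocks():
        for k in range(size):
            char_map[a + k] = b + k
    return char_map


def word_spans(words):
    """Return the corpus text (words joined by spaces) and each word's [start, end) span."""
    spans = []
    pos = 0
    for word in words:
        spans.append((pos, pos + len(word)))
        pos += len(word) + 1  # +1 for the joining space (an assumption of this sketch)
    return " ".join(words), spans


def align_annotation(start, end, char_map, spans):
    """Convert a [start, end) character annotation into (first_word, last_word) indexes.

    Returns None when no character of the annotation could be aligned.
    """
    mapped = [char_map[i] for i in range(start, end) if char_map[i] is not None]
    if not mapped:
        return None
    lo, hi = mapped[0], mapped[-1]
    # Keep every word whose span overlaps the aligned character range.
    covered = [i for i, (s, e) in enumerate(spans) if s <= hi and e > lo]
    if not covered:
        return None
    return covered[0], covered[-1]


if __name__ == "__main__":
    # The source has an extra space that the corpus text does not have.
    source = "The  cat sat on the mat."
    corpus_text, spans = word_spans(["The", "cat", "sat", "on", "the", "mat", "."])
    char_map = build_char_map(source, corpus_text)
    start = source.index("cat")
    print(align_annotation(start, start + len("cat"), char_map, spans))
```

Characters that exist only on one side are simply left unmapped, so an annotation survives as long as at least one of its characters can be aligned; annotations that fall entirely inside removed text return `None` and would need to be reported to the user.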

History

#1 Updated by Serge Heiden over 2 years ago

  • Description updated (diff)

#2 Updated by Serge Heiden over 2 years ago

  • Description updated (diff)

#3 Updated by Serge Heiden over 2 years ago

  • Description updated (diff)

#4 Updated by Serge Heiden over 2 years ago

  • Description updated (diff)

#5 Updated by Serge Heiden over 2 years ago

  • Description updated (diff)

#6 Updated by Serge Heiden over 2 years ago

  • Target version changed from TXM Democrat 3.0 to TXM Democrat 4.0

#7 Updated by Matthieu Decorde almost 2 years ago

  • % Done changed from 0 to 80

#8 Updated by Matthieu Decorde almost 2 years ago

  • % Done changed from 80 to 0

#9 Updated by Matthieu Decorde over 1 year ago

  • Subject changed from RCP: X.X, align Analec annotation with an imported corpus to RCP: X.X, Import annotations from Glozz corpus command
  • Description updated (diff)
