Feature #3190

Import, CoNLL-U corpus

Added by Matthieu Decorde about 1 month ago. Updated about 1 month ago.

Status: New
Start date: 12/07/2021
Priority: Normal
Due date:
Assignee: -
% Done: 80%
Category: Import
Spent time: -
Target version: TXM Profiterole 2.0

Description

Add a new import module: conllu

The import creates a CQP corpus with the UD (Universal Dependencies) words and their properties. It also:

+ creates CQP structures from the CoNLL-U comment lines

+ is based on the XTZ import, with text order, metadata, XSLs, ...

+ manages word contractions

+ creates the pre-computed head-* and deps-* CoNLL-U properties

+ creates a TIGERSearch representation
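To make the points above concrete, here is a minimal Python sketch of how a CoNLL-U sentence could be read. It is not the TXM implementation. It assumes that "word contractions" refers to UD multiword tokens (ID ranges such as 3-4), that comment lines of the form "# key = value" supply the structure attributes, and that a head-* property copies a field of the syntactic head onto the word. The names parse_sentence, project_heads, head-form and head-upos are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: not TXM's code. Function names and the
# "head-form" / "head-upos" property names are assumptions.

CONLLU_FIELDS = ["id", "form", "lemma", "upos", "xpos",
                 "feats", "head", "deprel", "deps", "misc"]

SAMPLE = """\
# sent_id = s1
# text = Il va au marché.
1\tIl\til\tPRON\t_\t_\t2\tnsubj\t_\t_
2\tva\taller\tVERB\t_\t_\t0\troot\t_\t_
3-4\tau\t_\t_\t_\t_\t_\t_\t_\t_
3\tà\tà\tADP\t_\t_\t5\tcase\t_\t_
4\tle\tle\tDET\t_\t_\t5\tdet\t_\t_
5\tmarché\tmarché\tNOUN\t_\t_\t2\tobl\t_\tSpaceAfter=No
6\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_
"""


def parse_sentence(block):
    """Split one CoNLL-U sentence into comment metadata, words, contractions."""
    meta, words, contractions = {}, [], []
    for line in block.splitlines():
        if not line.strip():
            continue
        if line.startswith("#"):
            # "# key = value" comments become candidate structure attributes.
            key, sep, value = line[1:].partition("=")
            if sep:
                meta[key.strip()] = value.strip()
            continue
        token = dict(zip(CONLLU_FIELDS, line.split("\t")))
        if "-" in token["id"]:
            # Multiword token (contraction), e.g. "3-4 au" covers words 3 and 4.
            first, last = token["id"].split("-")
            contractions.append((int(first), int(last), token["form"]))
        elif "." in token["id"]:
            continue  # empty nodes are ignored in this sketch
        else:
            words.append(token)
    return meta, words, contractions


def project_heads(words, props=("form", "upos")):
    """Copy selected properties of each word's syntactic head onto the word."""
    by_id = {w["id"]: w for w in words}
    for w in words:
        head = by_id.get(w["head"])  # None for the root, whose head is "0"
        for p in props:
            w["head-" + p] = head[p] if head else "_"
    return words


meta, words, contractions = parse_sentence(SAMPLE)
words = project_heads(words)
print(meta)
print(contractions)
print([(w["form"], w["head-form"]) for w in words])
```

In this sketch the contraction is kept as a separate (first, last, form) record, so the surface form "au" can be restored without losing the two syntactic words. How the real import stores it is not described in this ticket.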

See https://groupes.renater.fr/wiki/txm-info/public/chantier_profiterole/chantier_finalisation_extension_syntactic_annotation and https://groupes.renater.fr/wiki/txm-info/public/import/conllu

History

#1 Updated by Matthieu Decorde about 1 month ago

  • % Done changed from 0 to 50

The head-* and deps-* projection is not done yet.

#2 Updated by Matthieu Decorde about 1 month ago

  • Description updated

#3 Updated by Matthieu Decorde about 1 month ago

  • % Done changed from 50 to 80
