Feature #3190
Import, CoNLL-U corpus
Status: | New | Start date: | 12/07/2021
---|---|---|---
Priority: | Normal | Due date: |
Assignee: | - | % Done: | 60%
Category: | Import | Spent time: | -
Target version: | TXM Profiterole 2.0 | |
Description
Add a new import module: conllu.
The import creates a CQP corpus containing the UD words and their properties. It also:
+ creates CQP structures from the CoNLL-U comment lines
+ builds on the XTZ import (text order, metadata, XSLs, ...)
+ handles word contractions
+ creates the pre-computed head-* and deps-* CoNLL-U properties
+ creates a TIGERSearch representation
See https://groupes.renater.fr/wiki/txm-info/public/chantier_profiterole/chantier_finalisation_extension_syntactic_annotation and https://groupes.renater.fr/wiki/txm-info/public/import/conllu
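To make the comment-line and contraction handling concrete, here is a minimal Python sketch. It is not the TXM implementation (the import module itself is not shown in this ticket). The function name `parse_conllu`, the `contraction` key and the sample sentence are assumptions made for illustration. The only facts relied on are from the CoNLL-U format: comment lines start with `#`, a multiword token has a range ID such as `3-4`, and each token line has 10 tab-separated columns.

```python
FIELDS = ["id", "form", "lemma", "upos", "xpos",
          "feats", "head", "deprel", "deps", "misc"]


def parse_conllu(text):
    """Parse CoNLL-U text into a list of sentences.

    Each sentence is a dict with 'comments' (from '# key = value' lines)
    and 'tokens' (one dict per syntactic word). The surface form of a
    multiword token is copied to each word it covers under the
    hypothetical key 'contraction' ('_' when the word is not contracted).
    """
    sentences = []
    comments, tokens, pending = {}, [], None  # pending = (last_id, surface_form)

    def flush():
        nonlocal comments, tokens, pending
        if tokens:
            sentences.append({"comments": comments, "tokens": tokens})
        comments, tokens, pending = {}, [], None

    for line in text.splitlines():
        if not line.strip():
            flush()  # a blank line ends the sentence
        elif line.startswith("#"):
            key, sep, value = line[1:].partition("=")
            if sep:  # keep only 'key = value' comments
                comments[key.strip()] = value.strip()
        else:
            cols = line.split("\t")
            if len(cols) != 10:
                continue  # skip malformed lines
            tok_id = cols[0]
            if "-" in tok_id:  # multiword token line, e.g. '3-4 du'
                pending = (int(tok_id.split("-")[1]), cols[1])
                continue
            if "." in tok_id:  # empty node, e.g. '8.1': ignored in this sketch
                continue
            token = dict(zip(FIELDS, cols))
            token["contraction"] = pending[1] if pending else "_"
            tokens.append(token)
            if pending and int(tok_id) >= pending[0]:
                pending = None  # last word covered by the contraction
    flush()  # the file may not end with a blank line
    return sentences


SAMPLE = "\n".join([
    "# sent_id = s1",
    "# text = Il parle du chat",
    "1\tIl\til\tPRON\t_\t_\t2\tnsubj\t_\t_",
    "2\tparle\tparler\tVERB\t_\t_\t0\troot\t_\t_",
    "3-4\tdu\t_\t_\t_\t_\t_\t_\t_\t_",
    "3\tde\tde\tADP\t_\t_\t5\tcase\t_\t_",
    "4\tle\tle\tDET\t_\t_\t5\tdet\t_\t_",
    "5\tchat\tchat\tNOUN\t_\t_\t2\tobl\t_\t_",
])
```

With this sample, `parse_conllu(SAMPLE)` returns one sentence whose `comments` could feed structure attributes (here `sent_id` and `text`), and whose words `de` and `le` both carry the contraction `du`.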
History
#1 Updated by Matthieu Decorde almost 2 years ago
- % Done changed from 0 to 50
The head and deps projection is not done yet.
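As an illustration of what a head projection could look like, the following Python sketch copies selected properties of each word's syntactic head onto the word itself. This is one possible reading of the head-* properties, not the confirmed TXM behaviour. The function name `project_head_properties`, the `head-<property>` key names and the `_` placeholder for the root are assumptions. The deps-* properties are not covered.

```python
def project_head_properties(tokens, props=("form", "lemma", "upos")):
    """Add 'head-<prop>' keys to each token of one sentence.

    'tokens' is a list of dicts with at least 'id', 'head' and the
    properties listed in 'props'. The root (head '0') and any word whose
    head cannot be found receive the placeholder '_'.
    """
    by_id = {tok["id"]: tok for tok in tokens}
    for tok in tokens:
        head = by_id.get(tok["head"])  # None for the root or a dangling head
        for prop in props:
            tok["head-" + prop] = head[prop] if head else "_"
    return tokens


example = [
    {"id": "1", "form": "Il", "lemma": "il", "upos": "PRON", "head": "2"},
    {"id": "2", "form": "parle", "lemma": "parler", "upos": "VERB", "head": "0"},
]
project_head_properties(example)
```

After the call, the first word carries `head-lemma` = `parler` and `head-upos` = `VERB`, while the root carries `_` for every projected property.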
#2 Updated by Matthieu Decorde almost 2 years ago
- Description updated
#3 Updated by Matthieu Decorde almost 2 years ago
- % Done changed from 50 to 80
#4 Updated by Matthieu Decorde about 1 year ago
- % Done changed from 80 to 60
Split the conllu files before all other processing steps.
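The ticket does not say how the files are to be split. As one hedged possibility, the Python sketch below cuts a CoNLL-U file into one chunk per document at `# newdoc` comment lines, which is the CoNLL-U convention for marking the start of a document. The function name `split_by_newdoc` and the choice of `# newdoc` as the split point are assumptions, not something stated in this issue.

```python
def split_by_newdoc(text):
    """Split CoNLL-U text into chunks, one per '# newdoc' block.

    Content before the first '# newdoc' line (or a file without any)
    is returned as a single chunk. Each chunk ends with a blank line,
    as CoNLL-U requires after the last sentence.
    """
    chunks, current = [], []

    def close():
        # Emit the current chunk only if it holds something other than blanks.
        if any(line.strip() for line in current):
            chunks.append("\n".join(current).strip("\n") + "\n\n")

    for line in text.splitlines():
        if line.startswith("# newdoc"):
            close()
            current = []
        current.append(line)
    close()
    return chunks
```

Each returned chunk could then be written to its own file before the rest of the import runs.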