Feature #1503

TBX: X.X, use one instance of TreeTagger to tag all corpus texts

Added by Matthieu Decorde almost 4 years ago. Updated 5 months ago.

Status: New
Start date: 09/17/2015
Priority: Normal
Due date:
Assignee: -
% Done: 0%
Category: Import
Spent time: -
Target version: TXM X.X

Description

Currently, TreeTagger is launched once for each XML-TXM file to be tagged.

This costs time during the import process.

Solution

Concatenate the TreeTagger input sources into one stream per model file.
Question: are text boundaries handled correctly?
Use https://reckart.github.io/tt4j?
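
A minimal sketch of the concatenation idea, to make the text-boundary question concrete. It is not TXM code: the function names (concat_texts, split_tagged, fake_tagger) and the <text id="..."> marker are assumptions made for illustration, and the tagger is replaced by a stand-in that passes marker lines through unchanged, which is the behaviour one would expect from running TreeTagger with its -sgml option. Whether tagging context carries over a boundary is not settled by this sketch.

```python
from typing import Dict, Iterable, List

OPEN_PREFIX = '<text id="'   # hypothetical boundary marker, not a TXM convention
CLOSE_MARK = '</text>'


def concat_texts(texts: Dict[str, List[str]]) -> List[str]:
    """Build one token-per-line stream, wrapping each text in a boundary marker."""
    stream: List[str] = []
    for text_id, tokens in texts.items():
        stream.append(f'{OPEN_PREFIX}{text_id}">')
        stream.extend(tokens)
        stream.append(CLOSE_MARK)
    return stream


def split_tagged(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Split the tagger output back into one list of tagged lines per text."""
    result: Dict[str, List[str]] = {}
    current = None
    for line in lines:
        if line.startswith(OPEN_PREFIX) and line.endswith('">'):
            # Opening marker: start collecting lines for this text id.
            current = line[len(OPEN_PREFIX):-2]
            result[current] = []
        elif line == CLOSE_MARK:
            current = None
        elif current is not None:
            result[current].append(line)
    return result


def fake_tagger(lines: Iterable[str]) -> List[str]:
    """Stand-in for a single tagger run: markers pass through, tokens get a dummy tag."""
    return [
        line if line.startswith('<') and line.endswith('>') else f'{line}\tX'
        for line in lines
    ]


texts = {'a': ['Hello', 'world'], 'b': ['Bonjour']}
tagged = split_tagged(fake_tagger(concat_texts(texts)))
```

With one stream per model file, texts would first be grouped by the model they need, and each group would go through concat_texts and a single tagger run. The sketch assumes text ids contain no double quote and that no token is identical to a marker line.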

History

#1 Updated by Matthieu Decorde almost 4 years ago

  • Description updated (diff)

#2 Updated by Sebastien Jacquot about 1 year ago

  • Target version changed from TXM 0.8.0a (split/restructuration) to TXM 0.8.0

#3 Updated by Matthieu Decorde 5 months ago

  • Target version changed from TXM 0.8.0 to TXM X.X
