Feature #200

export, corpus : "light" vs "complete" export option

Added by Benedicte Pincemin over 6 years ago. Updated 10 months ago.

Status:New Start date:07/04/2013
Priority:High Due date:
Assignee:- % Done:


Category:- Spent time: -
Target version:TXM 0.8.1


Make it possible to reduce the binary corpus to the minimum necessary size (for example, by deleting directories such as interp, ptokenized, tokenized, treetagger and wtc). This parameter could be set through the parameters interface (tools/parameters/txm/user/export). Could the default value be "light"? (In any case, the advantage of choosing the "complete" option should be explained.)
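As an illustration of the request, here is a minimal sketch of a "light" vs "complete" export done as a plain directory copy. It is not TXM code: the function name `export_corpus`, the `mode` argument and the constant `LIGHT_EXCLUDES` are hypothetical, and it assumes the directories named above sit at the top level of the binary corpus directory and are safe to leave out.

```python
import shutil
from pathlib import Path

# Directory names quoted from the ticket. That they are top-level and
# safe to omit is an assumption, not something verified against TXM.
LIGHT_EXCLUDES = ("interp", "ptokenized", "tokenized", "treetagger", "wtc")


def _ignore_top_level(corpus_dir, names_to_skip):
    """Build a copytree 'ignore' callback that only filters the corpus root."""
    root = Path(corpus_dir).resolve()

    def ignore(directory, names):
        # Filter entries only when copytree is visiting the root directory,
        # so a nested directory that happens to share a name is kept.
        if Path(directory).resolve() == root:
            return [name for name in names if name in names_to_skip]
        return []

    return ignore


def export_corpus(corpus_dir, export_dir, mode="light"):
    """Copy a corpus directory; "light" mode skips LIGHT_EXCLUDES (hypothetical helper)."""
    if mode not in ("light", "complete"):
        raise ValueError(f"unknown export mode: {mode!r}")
    ignore = _ignore_top_level(corpus_dir, LIGHT_EXCLUDES) if mode == "light" else None
    shutil.copytree(corpus_dir, export_dir, ignore=ignore)
    return Path(export_dir)
```

Because the sketch copies rather than deletes, the source corpus is left untouched, which corresponds to reducing the corpus "just for the export".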


#1 Updated by Alexey Lavrentev over 6 years ago

When choosing the "light" option, the user should be asked whether they want to reduce the binary corpus on their own system or only in the export. Reducing it on the system may help save a lot of disk space.
A new function, "Optimize binary corpora", should probably be added to the Settings menu as well.
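To show the user what an in-place reduction would actually save before anything is deleted, an "Optimize binary corpora" function could start with a dry run. The sketch below is only an assumption about how that might look: `reclaimable_bytes` and `REMOVABLE` are hypothetical names, and the directory list is the one given in the ticket description.

```python
from pathlib import Path

# Directory names from the ticket description; treating them as
# removable in every corpus is an assumption.
REMOVABLE = ("interp", "ptokenized", "tokenized", "treetagger", "wtc")


def reclaimable_bytes(corpus_dir, removable=REMOVABLE):
    """Return {directory name: size in bytes} for removable directories that exist.

    Nothing is deleted; this only measures what a reduction would free.
    """
    root = Path(corpus_dir)
    sizes = {}
    for name in removable:
        target = root / name
        if target.is_dir():
            # Sum the sizes of all regular files below the directory.
            sizes[name] = sum(
                f.stat().st_size for f in target.rglob("*") if f.is_file()
            )
    return sizes
```

The total, `sum(reclaimable_bytes(path).values())`, could then be displayed in the confirmation dialog.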

#2 Updated by Matthieu Decorde almost 6 years ago

  • Priority changed from Normal to High

#3 Updated by Serge Heiden almost 6 years ago

This ticket interferes with the need to keep (or not) intermediate data files during an import process, for import debugging or multiple-import operations. That should be a separate ticket related to Import.

Otherwise, for the current ticket, here is a preferred scenario: the option should not be designed to downsize an exported binary corpus, but instead to augment a standard binary corpus (called "light" in the description) with various kinds of secondary information.

#4 Updated by Matthieu Decorde almost 2 years ago

  • Target version changed from 5 to TXM 0.8.0a (split/restructuration)

#5 Updated by Sebastien Jacquot over 1 year ago

  • Target version changed from TXM 0.8.0a (split/restructuration) to TXM 0.8.0

#6 Updated by Matthieu Decorde 10 months ago

  • Target version changed from TXM 0.8.0 to TXM 0.8.1
