Feature #3353
Corpus, sample texts to n first words
| Status: | New | Start date: | 03/14/2023 |
|---|---|---|---|
| Priority: | Normal | Due date: | |
| Assignee: | - | % Done: | 0% |
| Category: | Corpus | Spent time: | - |
| Target version: | TXM 0.8.4 | | |
Description
Help to sample a corpus at:
a) import
- cut texts at the first n words after tokenization
- add 'Sampling' section in import parameters form
- add 'Sample texts to [ ] first words' parameter
- add 'Cut at sentence boundaries' option parameter
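A minimal sketch of the cut described in a), in Python, for illustration only. The function name, the set of sentence-final tokens, and the choice to back up to the last sentence end inside the first n tokens (rather than extend forward) are assumptions; the ticket does not specify them. Every token, punctuation included, is counted as a word here.

```python
from typing import List

# Assumed sentence-final tokens; a real tokenizer may mark sentences differently.
SENTENCE_END = {".", "!", "?"}


def sample_first_words(tokens: List[str], n: int,
                       cut_at_sentence: bool = False) -> List[str]:
    """Keep the first n tokens of a tokenized text.

    With cut_at_sentence=True, the cut is moved back to the last
    sentence-final token found within the first n tokens, if any.
    """
    if n <= 0:
        return []
    head = tokens[:n]
    # Nothing to adjust if no sentence option or the text is shorter than n.
    if not cut_at_sentence or len(head) == len(tokens):
        return head
    # Walk backwards to the last sentence boundary inside the sample.
    for i in range(len(head) - 1, -1, -1):
        if head[i] in SENTENCE_END:
            return head[: i + 1]
    # No boundary found: fall back to the hard cut at n tokens.
    return head
```

For example, with the tokens of "The cat sat . The dog ran away ." and n = 6, the hard cut keeps six tokens, while the sentence option keeps only the first sentence.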
or
b) update
- add new corpus command 'Sample texts at n first words' (on XML-TXM pivot)
- add 'Number of words' parameter
- add 'Cut at sentence boundaries' option parameter
- update corpus
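A rough sketch of what the command in b) might do on a pivot file, in Python with the standard library. It assumes words are encoded as <w> elements and uses a simplified layout without namespaces; real XML-TXM files are namespaced, so the tag name would need to be qualified. The function name truncate_words is hypothetical.

```python
import xml.etree.ElementTree as ET


def truncate_words(text_elem: ET.Element, n: int) -> ET.Element:
    """Remove every <w> element after the n-th one, in document order."""
    # Collect words in document order before modifying the tree.
    words = list(text_elem.iter("w"))
    # ElementTree has no parent pointers, so build a child -> parent map.
    parent_of = {child: parent
                 for parent in text_elem.iter()
                 for child in parent}
    for w in words[max(n, 0):]:
        # Removing the element also drops its tail text.
        parent_of[w].remove(w)
    return text_elem


# Usage on a simplified, un-namespaced document (an assumption).
doc = ET.fromstring(
    "<text><p><w>a</w> <w>b</w></p><p><w>c</w> <w>d</w></p></text>"
)
truncate_words(doc, 3)
kept = [w.text for w in doc.iter("w")]
```

Containers left empty by the cut (for instance a paragraph with no remaining words) are kept as they are in this sketch; whether the command should prune them is left open.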
or
c) update
- add new corpus command 'Sample texts from sub-corpus' (on XML-TXM pivot from sub-corpus matches)
- for example with a sub-corpus built with the query <text> []{1,10000} and MatchingStrategy set to 'longest'
- update corpus
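The intended result of the example in c) can be sketched as one span per text, starting at the text's first token and covering at most 10000 tokens. The Python below only illustrates that target. The function name, the inclusive (start, end) token positions, and the clipping at the end of each text are assumptions, not a description of how the query engine evaluates the query.

```python
from typing import List, Tuple

Span = Tuple[int, int]  # inclusive (start, end) token positions


def first_words_spans(text_bounds: List[Span],
                      max_len: int = 10000) -> List[Span]:
    """For each text, return the span of its first max_len tokens.

    Shorter texts are kept whole: the span is clipped at the text end.
    """
    return [(start, min(end, start + max_len - 1))
            for start, end in text_bounds]
```

A text occupying positions 5 to 30000 would yield the span (5, 10004), while a five-token text at positions 0 to 4 is kept whole.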
History
#1 Updated by Serge Heiden about 1 month ago
- Description updated (diff)
- Target version changed from TXM 0.8.4 to TXM 0.8.3
#2 Updated by Matthieu Decorde about 1 month ago
- Target version changed from TXM 0.8.3 to TXM 0.8.4