Bug #1505

Updated by Matthieu Decorde over 4 years ago

Currently the TXT+CSV import module (and the clipboard import) creates an "lb" structure for each line and a "p" structure each time 2 empty lines are found.

This is too specific and misleading.

h3. Solution 1

Don't create the "p" structures.

h3. Solution 2

Add an import option "document format": when it is set, don't create the "lb" milestones and create a "p" structure for each line read.
The "lbid" word property is then removed.
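
The difference between the two solutions can be sketched as follows. This is only an illustration, not the actual import module code: the function name @lines_to_xml@, the @document_format@ parameter, the @n@ attribute on "lb" and the choice to skip empty lines in "document format" mode are all assumptions made for the example.

```python
from xml.sax.saxutils import escape


def lines_to_xml(text, document_format=False):
    """Hypothetical sketch: wrap the lines of a plain-text source in XML.

    document_format=False: Solution 1, one <lb/> milestone per line, no <p>.
    document_format=True:  Solution 2, one <p> per line read, no <lb/>.
    """
    parts = ["<text>"]
    for number, line in enumerate(text.splitlines(), start=1):
        if document_format:
            # Assumption: empty lines do not produce an empty <p>.
            if line.strip():
                parts.append(f"<p>{escape(line)}</p>")
        else:
            # The milestone carries the physical line number, so words
            # on lines 1, 3 and 5 would get lbid values 1, 3 and 5.
            parts.append(f'<lb n="{number}"/>{escape(line)}')
    parts.append("</text>")
    return "\n".join(parts)
```

Called on the text of the validation test below, the default mode emits one "lb" per physical line (including the empty ones) and no "p", while @document_format=True@ emits one "p" per non-empty line and no "lb".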

See ticket #1585.

h3. Validation test

The clipboard import of <pre>this is a small test.

With some line breaks

sometimes</pre>

must give the following description:
<pre>
Description du corpus PRESSEPAPIER1

- pressepapier1
- mdecorde
- 2016-06-29
Statistiques Générales

Nombre de mots 11
Nombre de propriétés de mot 4
Nombre d'unités de structure 3
Propriétés des unités lexicales (max 20 valeurs)

frlemma : this, is, avoir, small, test, ., With, some, line, break, sometimes, ...
frpos : NOM, ADJ, VER:pres, SENT, NAM, ...
lbid : 1, 3, 5, ...
word : this, is, a, small, test, ., With, some, line, breaks, sometimes, ...
Propriétés des structures (max 20 valeurs)

s
n (2) = 1, 2.
text
id (1) = pressepapier1.
</pre>
