Bug #1577

Updated by Serge Heiden almost 10 years ago

A) Once a certain number of different XML elements is reached, some XML elements are no longer recognized as structures: they are treated as regular words and end up in the Lexicon.

B) Some import steps seem to run in threads, which should not be the case for this experimental and pedagogical import module (at least for the Compiler and Pager steps).

Here is the console log of the Compiler step of the XTZ import module:
<pre>
-- Running CWB-encodes...
Word properties: [id, enpos, enlemma, n, type]
Structures: [argument:0+n, back:0+n, bibl:0+n, body:1+n, byline:0+n, cit:0+n, closer:0+n, corr:0+sic+n, date:0+n, dateline:0+n, div:0+type+n, docauthor:0+n, docdate:0+n, docimprint:0+n, doctitle:0+n, emph:0+rend+n, expan:0+abbr+n, floatingtext:0+n, foreign:0+lang+n, front:0+n, head:0+n, hi:0+n, l:0+rend+n, name:0+n, note:0+id+anchored+place+n, opener:0+n, p:0+n, pb:0+id+n, publisher:0+n, pubplace:0+n, q:0+n, ref:0+target+n, signed:0+n, term:0+n, text:0+id+base+project+genre+author+title+pubdate+lang, title:0+n, titlepage:0+n, titlepart:0+type+n, trailer:0+n, txmcorpus:0+lang]
Encoding 5 files...
......
-- Running CWB-makeall...
....
</pre>

Note:
* recursive structures (e.g. div) are not recognized
* some structures (item, epigraph...) are not recognized at all, cf. screen capture below:

!{width: 100%}xtz-compiler-structures.png!
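
To help narrow down A), the "Structures:" line of the console log can be compared with the element names expected from the sources. The following is only a minimal sketch, not part of TXM: the function names @parse_structures@ and @missing_structures@ are hypothetical, the expected element list is typed by hand, and it assumes that structure names are lowercased in the log (as @docauthor@ and @titlepage@ suggest) and that the first field after the colon is a numeric value followed by attribute names.

```python
import re


def parse_structures(log_line):
    """Return {structure_name: [attribute names]} from a 'Structures: [...]' log line."""
    match = re.search(r"Structures:\s*\[(.*)\]", log_line)
    if match is None:
        return {}
    structures = {}
    for entry in match.group(1).split(","):
        entry = entry.strip()
        if not entry:
            continue
        name, _, spec = entry.partition(":")
        # spec looks like "0+type+n": the first field is a number
        # (assumption: a nesting depth), the rest are attribute names.
        fields = spec.split("+") if spec else []
        structures[name] = fields[1:]
    return structures


def missing_structures(log_line, expected_elements):
    """List the expected element names that are absent from the declared structures."""
    declared = parse_structures(log_line)
    # Assumption: names are lowercased before being declared as structures.
    return sorted(e.lower() for e in expected_elements if e.lower() not in declared)


# Shortened, hand-written log line in the same format as the console log above.
log_line = "Structures: [div:0+type+n, head:0+n, p:0+n, text:0+id+base]"
print(missing_structures(log_line, ["div", "item", "epigraph", "p"]))
```

Running the same check against the full log line and the complete list of element names used in the sample sources would show exactly which elements are dropped, and whether the cut-off depends on their number or on their names.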

h3. Source sample

A sample source directory to reproduce the bug: attachment:"nanovwwp-xtz.zip"

h3. Solution

A) find the bug

B) remove the threads

* 1) write a spec for threads in import modules
* 2) write the code
* 3) have the code debugged by someone else
