Bug #2373: RCP: 0.7.9, XTZ + CSV import module: error in page indexing if source file name contains the "_" character - Plateforme TXM - Forge du Centre Blaise Pascal

Added by Alexey Lavrentev more than 7 years ago. Updated almost 2 years ago.

Status: Closed
Start date: 30/04/2018
Priority: Normal
Due date:
Assignee:
% Done: 100%
Category: Import
Spent time:
Target version: TXM 0.8.2

Description

If the corpus source directory contains files whose names differ only by an underscore-separated suffix, the page index produced by the import contains duplicate entries.

Example

Source files:

mytext.xml
mytext_a.xml

import.xml in the binary corpus:

            <text name="mytext">
               <source file="/home/user/TXM/corpora/MYCORPUS/txm/MYCORPUS/mytext.xml" 
                       type=".xml"/>
               <editions>
                  <edition index="/home/user/TXM/corpora/MYCORPUS/HTML/MYCORPUS/default" 
                           mode="xsl" 
                           name="default" 
                           script="1-default-html.xsl" 
                           type="html">
                     <page id="1" wordid="w_0"/>
                     <page id="a_1" wordid="w_0"/>
                     <page id="a_2" wordid="mytext_a_1"/>
                     <page id="2" wordid="mytext_1"/>
                     <page id="3" wordid="mytext_137"/>
                     <page id="a_3" wordid="mytext_a_18"/>
                  </edition>
               </editions>
            </text>
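
The listing above shows pages whose ids start with "a_" filed under the text "mytext", which is what a prefix match on the text name would produce. The sketch below reproduces that pattern in Python. It is only an illustration of the ambiguity, not TXM's actual import code: the file naming scheme `<text>_<page>.html`, the `PAGE_FILES` list, and both function names are assumptions made for this example.

```python
import re

# Hypothetical edition page files. The "<text>_<page>.html" naming scheme
# is an assumption for this illustration, not taken from the TXM sources.
PAGE_FILES = [
    "mytext_1.html", "mytext_2.html", "mytext_3.html",
    "mytext_a_1.html", "mytext_a_2.html", "mytext_a_3.html",
]


def pages_by_prefix(text_name, files):
    """Ambiguous lookup: every file starting with '<text_name>_' counts as a page."""
    prefix = text_name + "_"
    return [f[len(prefix):-len(".html")] for f in files if f.startswith(prefix)]


def pages_by_numeric_suffix(text_name, files):
    """Stricter lookup: the part after '<text_name>_' must be a page number."""
    pattern = re.compile(re.escape(text_name) + r"_(\d+)\.html")
    pages = []
    for f in files:
        match = pattern.fullmatch(f)
        if match:
            pages.append(match.group(1))
    return pages


# The prefix lookup pulls the pages of "mytext_a" into "mytext".
print(pages_by_prefix("mytext", PAGE_FILES))
# The stricter lookup keeps the two texts apart.
print(pages_by_numeric_suffix("mytext", PAGE_FILES))
print(pages_by_numeric_suffix("mytext_a", PAGE_FILES))
```

Under these assumptions, the prefix lookup for "mytext" returns the ids 1, 2, 3 together with a_1, a_2, a_3, matching the mixed page ids in the import.xml excerpt, while the stricter lookup returns only the numeric pages of each text.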