Bug #2252

RCP: 0.7.8, XML/w import module, creating text identifiers fails when the root <TEI> element does not contain a <text> child

Added by Serge Heiden over 5 years ago. Updated almost 2 years ago.

Status: New
Start date: 09/27/2017
Priority: Normal
Due date:
Assignee: -
% Done: 0%
Category: Import
Spent time: -
Target version: TXM 0.8.4

Description

Diagnostic

If a source file has the following structure, with no <text> element under the root <TEI> element, the creation of text identifiers fails and some functions (such as back-to-text) are unavailable:

<TEI>
  <div> some text </div>
  <div> in divs </div>
</TEI>

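The import gives the following result, in which the generated <text> element sits inside the first <div> and wraps only that <div>'s words: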

<?xml version="1.0" encoding="UTF-8"?>
<TEI xmlns="http://www.tei-c.org/ns/1.0" xmlns:txm="http://textometrie.org/1.0">
<teiHeader>...</teiHeader><div>
<text id="tei"><w id="w_tei_1"><txm:form>some</txm:form><txm:ana resp="#txm" type="#enpos">DT</txm:ana><txm:ana resp="#txm" type="#enlemma">some</txm:ana><txm:ana resp="#src" type="#n">1</txm:ana></w>
<w id="w_tei_2"><txm:form>text</txm:form><txm:ana resp="#txm" type="#enpos">NN</txm:ana><txm:ana resp="#txm" type="#enlemma">text</txm:ana><txm:ana resp="#src" type="#n">2</txm:ana></w>
</text>
<div>
<w id="w_tei_3"><txm:form>in</txm:form><txm:ana resp="#txm" type="#enpos">IN</txm:ana><txm:ana resp="#txm" type="#enlemma">in</txm:ana><txm:ana resp="#src" type="#n">3</txm:ana></w>
<w id="w_tei_4"><txm:form>divs</txm:form><txm:ana resp="#txm" type="#enpos">NNS</txm:ana><txm:ana resp="#txm" type="#enlemma">div</txm:ana><txm:ana resp="#src" type="#n">4</txm:ana></w>
</div>
</div></TEI>
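
The symptom can be checked mechanically by listing the <w> elements that have no <text> ancestor. The following is a minimal sketch in Python, not part of TXM: the function name `words_outside_text` is hypothetical, and `SAMPLE` is an abridged copy of the result above (the elided <teiHeader> content and the <txm:ana> elements are left out).

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

def words_outside_text(xml_string):
    """Return the ids of <w> elements that have no <text> ancestor."""
    root = ET.fromstring(xml_string)
    w_tag = f"{{{TEI_NS}}}w"
    text_tag = f"{{{TEI_NS}}}text"
    # Ids of every word in the document, in document order.
    all_ids = [w.get("id") for w in root.iter(w_tag)]
    # Ids of the words found below any <text> element.
    covered = {w.get("id") for t in root.iter(text_tag) for w in t.iter(w_tag)}
    return [i for i in all_ids if i not in covered]

# Abridged copy of the import result shown above (assumption: the dropped
# <txm:ana> elements and header content do not affect the check).
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0" xmlns:txm="http://textometrie.org/1.0">
<teiHeader/><div>
<text id="tei"><w id="w_tei_1"><txm:form>some</txm:form></w>
<w id="w_tei_2"><txm:form>text</txm:form></w>
</text>
<div>
<w id="w_tei_3"><txm:form>in</txm:form></w>
<w id="w_tei_4"><txm:form>divs</txm:form></w>
</div>
</div></TEI>"""

print(words_outside_text(SAMPLE))  # ['w_tei_3', 'w_tei_4']
```

Words w_tei_3 and w_tei_4 belong to no text, which is consistent with functions such as back-to-text being unavailable for them.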

Solution

The XML/w import module must be XML-schema agnostic: it must not assume that any particular XML element is present in the source in order to work.
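
One possible way to meet this requirement is sketched below in Python. It is an illustration under stated assumptions, not the TXM implementation: the helper `ensure_text_element` is hypothetical, the text identifier is supplied by the caller, and the fallback chosen here (wrapping every child of the root except a header, if there is one) is only one of several reasonable policies.

```python
import xml.etree.ElementTree as ET

def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag."""
    return tag.rsplit("}", 1)[-1]

def ensure_text_element(root, text_id):
    """Guarantee a <text> element carrying text_id, whatever the schema.

    Hypothetical helper: an existing <text> element is reused; otherwise
    the children of the root are wrapped in a new one.
    """
    for elem in root.iter():
        if local_name(elem.tag) == "text":
            elem.set("id", text_id)
            return elem
    # No <text> anywhere: create one in the root's namespace, if it has one.
    ns = root.tag[: root.tag.index("}") + 1] if root.tag.startswith("{") else ""
    text = ET.Element(ns + "text", {"id": text_id})
    for child in list(root):
        # Assumption: a header, when present, stays outside the text.
        if local_name(child.tag) != "teiHeader":
            root.remove(child)
            text.append(child)
    root.append(text)
    return text

# The source structure from the diagnostic above.
root = ET.fromstring("<TEI><div> some text </div><div> in divs </div></TEI>")
text = ensure_text_element(root, "tei")
print(ET.tostring(root, encoding="unicode"))
```

With this fallback both <div> elements end up inside a single <text id="tei">, so every word would receive a text identifier without the source having to declare a <text> element.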

History

#1 Updated by Sebastien Jacquot almost 5 years ago

  • Target version changed from TXM 0.8.0a (split/restructuration) to TXM 0.8.0

#2 Updated by Matthieu Decorde about 4 years ago

  • Target version changed from TXM 0.8.0 to TXM 0.8.2

#3 Updated by Matthieu Decorde almost 3 years ago

  • Category set to Import

#4 Updated by Matthieu Decorde almost 2 years ago

  • Description updated

#5 Updated by Matthieu Decorde almost 2 years ago

  • Description updated
  • % Done changed from 0 to 80

#6 Updated by Matthieu Decorde almost 2 years ago

  • Target version changed from TXM 0.8.2 to TXM 0.8.4
  • % Done changed from 80 to 0
