Bug #2252
RCP: 0.7.8, XML/w import module, creating text identifiers fails when the root <TEI> element does not contain a <text> child
| Status: | New | Start date: | 09/27/2017 |
|---|---|---|---|
| Priority: | Normal | Due date: | |
| Assignee: | - | % Done: | 0% |
| Category: | Import | Spent time: | - |
| Target version: | TXM 0.8.4 | | |
Description
Diagnostic
If source files have the following structure, the creation of text identifiers fails and some functions (such as back-to-text) are unavailable:
<TEI>
  <div> some text </div>
  <div> in divs </div>
</TEI>
gives
<?xml version="1.0" encoding="UTF-8"?>
<TEI xmlns="http://www.tei-c.org/ns/1.0" xmlns:txm="http://textometrie.org/1.0">
  <teiHeader>...</teiHeader>
  <div>
    <text id="tei">
      <w id="w_tei_1"><txm:form>some</txm:form><txm:ana resp="#txm" type="#enpos">DT</txm:ana><txm:ana resp="#txm" type="#enlemma">some</txm:ana><txm:ana resp="#src" type="#n">1</txm:ana></w>
      <w id="w_tei_2"><txm:form>text</txm:form><txm:ana resp="#txm" type="#enpos">NN</txm:ana><txm:ana resp="#txm" type="#enlemma">text</txm:ana><txm:ana resp="#src" type="#n">2</txm:ana></w>
    </text>
    <div>
      <w id="w_tei_3"><txm:form>in</txm:form><txm:ana resp="#txm" type="#enpos">IN</txm:ana><txm:ana resp="#txm" type="#enlemma">in</txm:ana><txm:ana resp="#src" type="#n">3</txm:ana></w>
      <w id="w_tei_4"><txm:form>divs</txm:form><txm:ana resp="#txm" type="#enpos">NNS</txm:ana><txm:ana resp="#txm" type="#enlemma">div</txm:ana><txm:ana resp="#src" type="#n">4</txm:ana></w>
    </div>
  </div>
</TEI>
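To find out which source files are likely to trigger this problem, one possible check is whether the root element has a direct `<text>` child. The sketch below is only an illustration written with Python's standard library; it is not part of TXM, and the function names `local_name` and `root_has_text_child` are hypothetical.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag name."""
    return tag.rsplit("}", 1)[-1]


def root_has_text_child(xml_source):
    """Return True if the root element has a direct child named 'text'.

    The check ignores namespaces, so it works for both plain and
    TEI-namespaced sources.
    """
    root = ET.fromstring(xml_source)
    return any(local_name(child.tag) == "text" for child in root)


if __name__ == "__main__":
    sample = "<TEI> <div> some text </div> <div> in divs </div> </TEI>"
    print(root_has_text_child(sample))
```

Files for which the check returns `False` have the structure described in this ticket and may be affected.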
Solution
The XML/w import module must be XML schema agnostic: it should not assume that any particular XML element is present in the source in order to work.
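One way to picture a schema-agnostic approach is to take the text identifier from something outside the XML content and to number words in document order wherever they occur. The sketch below is a hypothetical illustration in Python, not the TXM implementation: it assumes the text identifier can be derived from the source file name (the `tei` in the identifiers above suggests this, but that is an assumption), and the helpers `text_id_from_path` and `assign_word_ids` are invented for the example. Only the `w_<text id>_<n>` identifier pattern is taken from the output shown in this ticket.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def text_id_from_path(path):
    """Assumption: the text identifier is the file name without extension."""
    return Path(path).stem


def assign_word_ids(root, text_id):
    """Give every <w> element an id of the form w_<text_id>_<n>.

    Words are numbered in document order, whatever their ancestors are,
    so no <text> element (or any other element) has to be present.
    Returns the number of words found.
    """
    count = 0
    for element in root.iter():
        # Compare local names only, so namespaced sources work too.
        if isinstance(element.tag, str) and element.tag.rsplit("}", 1)[-1] == "w":
            count += 1
            element.set("id", f"w_{text_id}_{count}")
    return count


if __name__ == "__main__":
    source = "<TEI><div><w>some</w> <w>text</w></div><div><w>in</w> <w>divs</w></div></TEI>"
    root = ET.fromstring(source)
    assign_word_ids(root, text_id_from_path("corpus/tei.xml"))
    print(ET.tostring(root, encoding="unicode"))
```

Because the identifier does not come from the document structure, the same code handles sources with or without a `<text>` element.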
History
#1 Updated by Sebastien Jacquot almost 5 years ago
- Target version changed from TXM 0.8.0a (split/restructuration) to TXM 0.8.0
#2 Updated by Matthieu Decorde about 4 years ago
- Target version changed from TXM 0.8.0 to TXM 0.8.2
#3 Updated by Matthieu Decorde almost 3 years ago
- Category set to Import
#4 Updated by Matthieu Decorde almost 2 years ago
- Description updated (diff)
#5 Updated by Matthieu Decorde almost 2 years ago
- Description updated (diff)
- % Done changed from 0 to 80
#6 Updated by Matthieu Decorde almost 2 years ago
- Target version changed from TXM 0.8.2 to TXM 0.8.4
- % Done changed from 80 to 0