Task #2916

Double space in CQP word and structural units properties

Added by Matthieu Decorde over 1 year ago. Updated 12 months ago.

Status: New
Start date: 10/12/2020
Priority: High
Due date:
Assignee: -
% Done: 0%
Category: Corpus
Spent time: -
Target version: TXM 0.8.3

Description

(old: some non-visible characters are transferred from the table cells into the CQP structural unit property values)

When a CQP property value contains a double space ("  "), CQP fails in some cases, e.g. for the word property value t="A  B":

test.xml:

<TEI>
<teiHeader></teiHeader>
<text>
<w>A</w>
<w>A  B</w>
<w>C</w>
</text>
</TEI>

* [t="A  B"] FAIL
* [t="A  .+"] SUCCESS
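A minimal sketch for locating affected values before they reach CQP, using the test.xml sample above. The function name `find_multi_space_words` is hypothetical, and the sketch assumes un-namespaced `<w>` elements as in the sample; real TEI or XML-TXM files would likely need namespace-aware lookups.

```python
import re
import xml.etree.ElementTree as ET

# Copy of the test.xml sample from this ticket (note the two spaces in "A  B").
SAMPLE = """<TEI>
<teiHeader></teiHeader>
<text>
<w>A</w>
<w>A  B</w>
<w>C</w>
</text>
</TEI>"""


def find_multi_space_words(xml_text):
    """Return the text of every <w> element containing two or more consecutive spaces."""
    root = ET.fromstring(xml_text)
    return [
        w.text
        for w in root.iter("w")  # assumption: no XML namespace on <w>
        if w.text and re.search(r" {2,}", w.text)
    ]


if __name__ == "__main__":
    print(find_multi_space_words(SAMPLE))
```

Running it on the sample should report only the second word, which is the one the failing query targets.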

Solution A

Normalize all CQP property values by collapsing each run of spaces into a single space:
  • " +" -> " "
The normalization should be done in:
  • XML-TXM files ?
  • CQP files ?
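The " +" -> " " rule of Solution A can be sketched as a single regular-expression substitution. This is only an illustration in Python, not TXM code: `normalize_spaces` is a hypothetical helper, and where it would be called (XML-TXM files or CQP files) is the question left open above. It handles only the plain space character (U+0020); tabs or other non-visible characters would need a broader pattern.

```python
import re


def normalize_spaces(value):
    """Collapse each run of one or more spaces into a single space (" +" -> " ")."""
    return re.sub(r" +", " ", value)


if __name__ == "__main__":
    # The failing value from this ticket: two spaces between A and B.
    print(repr(normalize_spaces("A  B")))
```

After normalization the value from the test case becomes "A B", so a query such as [t="A B"] would target a single-space value.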

History

#1 Updated by Matthieu Decorde over 1 year ago

  • Subject changed from AFVOIXOFF, fix non-visible characters to AFVOIXOFF, fix non-visible characters in table data
  • Description updated (diff)

#2 Updated by Matthieu Decorde over 1 year ago

  • Subject changed from AFVOIXOFF, fix non-visible characters in table data to double space in CQP word and structural units properties
  • Description updated (diff)
  • Priority changed from Normal to High

#3 Updated by Matthieu Decorde over 1 year ago

  • Tracker changed from Task to Bug

#4 Updated by Matthieu Decorde over 1 year ago

  • Description updated (diff)

#5 Updated by Matthieu Decorde over 1 year ago

  • Subject changed from double space in CQP word and structural units properties to Double space in CQP word and structural units properties

#6 Updated by Matthieu Decorde over 1 year ago

  • Description updated (diff)

#7 Updated by Matthieu Decorde over 1 year ago

  • Description updated (diff)

#8 Updated by Matthieu Decorde over 1 year ago

  • Tracker changed from Bug to Task
  • Description updated (diff)

#9 Updated by Matthieu Decorde over 1 year ago

  • Target version changed from TXM 0.8.2 - Antract to TXM 0.8.2

#10 Updated by Matthieu Decorde 12 months ago

  • Description updated (diff)

#11 Updated by Matthieu Decorde 12 months ago

  • Description updated (diff)
  • Target version changed from TXM 0.8.2 to TXM 0.8.3
