Feature #1640

TBX: X.X, empty property value code

Added by Serge Heiden over 7 years ago. Updated about 7 years ago.

Status: New
Start date: 01/21/2016
Priority: Normal
Due date:
Assignee: Benedicte Pincemin
% Done: 80%
Category: Preferences
Spent time: -
Target version: TXM 0.7.8

Description

Currently, empty or missing word properties are encoded as '__UNDEF__' by default in cwb-encode.

The user should be able to choose the code.

Solution

Create a new global preference (in the Import section) named 'Empty property value code' (internal name: emptyPropertyValueCode).

Validation test

  • import a short corpus with the XML/w or XTZ import module:
    <test>
    <w t="1">very</w>
    <w t="">small</w>
    <w t="2"></w>
    <w t="3">small</w>
    </test>
    
  • set the empty property value code to "EMPTY"
  • the "word" lexicon must show:
        small    2
        EMPTY    1
        very    1
    
  • the "t" lexicon must show:
        1    1
        2    1
        3    1
        EMPTY    1
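
The expected lexicons above can be checked by hand with a small sketch. This is not TXM or cwb-encode code: the function name `lexicons` and its `empty_code` parameter are hypothetical, and the sketch assumes that an empty word form and an empty or missing `t` attribute are both replaced by the chosen code before counting.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# The sample corpus from the validation test.
SAMPLE = """<test>
<w t="1">very</w>
<w t="">small</w>
<w t="2"></w>
<w t="3">small</w>
</test>"""


def lexicons(xml_text, empty_code="__UNDEF__"):
    """Count word forms and 't' values, substituting empty_code
    for any empty or missing value (illustrative only)."""
    words, t_values = Counter(), Counter()
    for w in ET.fromstring(xml_text).iter("w"):
        form = (w.text or "").strip()    # element text is None for <w ...></w>
        t = (w.get("t") or "").strip()   # attribute is None when missing
        words[form or empty_code] += 1
        t_values[t or empty_code] += 1
    return words, t_values


words, t_values = lexicons(SAMPLE, empty_code="EMPTY")
for name, lexicon in (("word", words), ("t", t_values)):
    print(name)
    for value, freq in sorted(lexicon.items()):
        print(f"    {value}\t{freq}")
```

Calling `lexicons(SAMPLE)` without the second argument falls back to '__UNDEF__', which mirrors the current default described in this ticket.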
    

History

#1 Updated by Matthieu Decorde over 7 years ago

  • % Done changed from 0 to 80

#2 Updated by Matthieu Decorde about 7 years ago

  • Description updated
