Feature #1640

Updated by Matthieu Decorde over 3 years ago

Currently, empty or missing word properties are encoded as '__UNDEF__' by default in cwb-encode.

The user should be able to choose this code.

h4. Solution

Create a new global preference, in the Import section, named 'Empty property value code' (internal name: emptyPropertyValueCode).

h4. Validation test

* set the 'Empty property value code' preference to "EMPTY"
* import the following short corpus with the XML/w or XTZ import module:

<pre>
<test>
<w t="1">very</w>
<w t="">small</w>
<w t="2"></w>
<w t="3">small</w>
</test>
</pre>
* the "word" lexicon must show:
<pre>
small 2
EMPTY 1
very 1
</pre>
* the "t" lexicon must show:
<pre>
1 1
2 1
3 1
EMPTY 1
</pre>
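
The expected lexicons above can be cross-checked with a small standalone script. This is only a sketch of the intended behaviour, not TXM or cwb-encode code: the function name @lexicons@ and its @empty_code@ parameter are hypothetical, and it assumes that an empty word form and an empty or missing "t" attribute are both replaced by the configured code.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Test corpus copied from the validation test above.
CORPUS = """<test>
<w t="1">very</w>
<w t="">small</w>
<w t="2"></w>
<w t="3">small</w>
</test>"""


def lexicons(xml_text, empty_code="__UNDEF__"):
    """Count word forms and 't' values, replacing empty or missing ones.

    Hypothetical helper: it mimics the expected result of the import,
    it does not call TXM or cwb-encode.
    """
    word_lex, t_lex = Counter(), Counter()
    for w in ET.fromstring(xml_text).iter("w"):
        form = (w.text or "").strip()      # <w ...></w> gives an empty form
        prop = (w.get("t") or "").strip()  # t="" or no t attribute at all
        word_lex[form or empty_code] += 1
        t_lex[prop or empty_code] += 1
    return word_lex, t_lex


word_lex, t_lex = lexicons(CORPUS, empty_code="EMPTY")
for name, lex in (("word", word_lex), ("t", t_lex)):
    print(name)
    for value, freq in sorted(lex.items()):
        print(" ", value, freq)
```

Calling @lexicons(CORPUS)@ without the second argument falls back to '__UNDEF__', which corresponds to the current default described at the top of this ticket.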
