Feature #2027
RCP: X.X, Concordance view corpus parameters
Status: | New | Start date: | 02/09/2017 |
---|---|---|---|
Priority: | Normal | Due date: | |
Assignee: | - | % Done: | 0% |
Category: | Commands | Spent time: | - |
Target version: | TXM X.X | | |
Description
Add corpus parameters to control how concordance lines display CQP tokens and structural units
Solution 1: tokens and structures
- use the corpus-language parameter/property to apply the right txt-renderer of that language when building the content of the contexts of the lines
- a txt-renderer is a component producing raw text (txt) by using language specific typographic rules to display the graphical forms of tokens
- the Pager already uses a txt-renderer component to build edition pages content
- add a new concordance-display-opening-structural-unit-separator corpus parameter
  - possible values: '<', '(', '|'
- add a new concordance-display-opening-structural-units corpus parameter
  - possible values: div,sp,p
- add a new concordance-display-closing-structural-unit-separator corpus parameter
  - possible values: '>', ')', '|'
- add a new concordance-display-closing-structural-units corpus parameter
  - possible values: div,sp,p
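As an illustration only, the four proposed parameters could drive the rendering of a context roughly as sketched below. This is not TXM code: the `ConcordanceDisplayParams` class, the `render_context` function and the `(kind, value)` item encoding are hypothetical, and the sketch assumes that a separator is shown wherever a listed structural unit opens or closes. Tokens are joined with plain spaces as a stand-in for a language-specific txt-renderer.

```python
from dataclasses import dataclass


@dataclass
class ConcordanceDisplayParams:
    # Hypothetical container; fields mirror the proposed corpus parameters.
    opening_separator: str = "<"
    opening_units: tuple = ("div", "sp", "p")
    closing_separator: str = ">"
    closing_units: tuple = ("div", "sp", "p")


def render_context(items, params):
    """Render a context from ("tok", form), ("open", unit), ("close", unit) items.

    Assumption: a separator is displayed only for the structural units
    listed in the corresponding parameter; other units are ignored.
    """
    parts = []
    for kind, value in items:
        if kind == "tok":
            parts.append(value)
        elif kind == "open" and value in params.opening_units:
            parts.append(params.opening_separator)
        elif kind == "close" and value in params.closing_units:
            parts.append(params.closing_separator)
    # Plain space joining stands in for the language-specific txt-renderer.
    return " ".join(parts)


params = ConcordanceDisplayParams("(", ("p",), ")", ("p",))
items = [
    ("open", "p"), ("tok", "Hello"), ("tok", "world"), ("close", "p"),
    ("open", "s"), ("tok", "Bye"), ("close", "s"),
]
print(render_context(items, params))  # prints "( Hello world ) Bye"
```

In this sketch the `s` unit produces no separator because it is not listed in either structural-units parameter.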
Solution 2: tokens versus structures
For some studies on infra-word units (such as syllables), a double-corpus strategy is often used:
- corpus A builds CQP tokens as text words, and the infra-word units are encoded through concatenation inside the word form (e.g. 'gi-ra-fe')
- corpus B builds CQP tokens as infra-word units, and text words are encoded as <word> structures (e.g. <word form="girafe" seg-form="gi-ra-fe">gi ra fe</word>)
- add a new word-as-structure corpus parameter
  - possible values: 'word' or <empty>
  - if the parameter value is <empty>, use corpus tokens to render text words
  - if the parameter value is the name of a structure, use that structure to render text words
When building text editions and concordance line contexts, use that parameter value to render word forms correctly and make them easier to read.
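The following minimal sketch shows one way the word-as-structure parameter could select between the two strategies. It is not TXM code: the `render_words` function and the `(kind, value)` item encoding are hypothetical, and the sketch assumes that a text word is rebuilt by concatenating the tokens found inside the named structure (an implementation could just as well read an attribute of the structure, such as `form`).

```python
def render_words(items, word_as_structure=""):
    """Return the display words of a ("tok"/"open"/"close", value) stream.

    Assumption: with an empty parameter each token is a word (corpus A);
    otherwise tokens inside the named structure are concatenated into
    one word (corpus B), and tokens outside it are kept as they are.
    """
    if not word_as_structure:
        return [value for kind, value in items if kind == "tok"]

    words = []
    current = None  # tokens collected inside the current word structure
    for kind, value in items:
        if kind == "open" and value == word_as_structure:
            current = []
        elif kind == "close" and value == word_as_structure:
            if current is not None:
                words.append("".join(current))
            current = None
        elif kind == "tok":
            if current is None:
                words.append(value)
            else:
                current.append(value)
    return words


corpus_b = [
    ("open", "word"), ("tok", "gi"), ("tok", "ra"), ("tok", "fe"),
    ("close", "word"), ("tok", "!"),
]
print(render_words(corpus_b, "word"))  # prints ['girafe', '!']
print(render_words(corpus_b))          # prints ['gi', 'ra', 'fe', '!']
```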
Remark: some XSLT stylesheets already exist that build the correct text edition for these two cases.
History
#1 Updated by Serge Heiden over 6 years ago
- Description updated (diff)
- Category changed from Administration to Commands
#2 Updated by Serge Heiden over 6 years ago
- Description updated (diff)
#3 Updated by Matthieu Decorde over 5 years ago
- Target version changed from 5 to TXM 0.8.0a (split/restructuration)
#4 Updated by Sebastien Jacquot almost 5 years ago
- Target version changed from TXM 0.8.0a (split/restructuration) to TXM 0.8.0
#5 Updated by Sebastien Jacquot over 4 years ago
- Target version changed from TXM 0.8.0 to TXM X.X