Feature #2027

RCP: X.X, Concordance view corpus parameters

Added by Matthieu Decorde over 2 years ago. Updated 8 months ago.

Status: New
Start date: 02/09/2017
Priority: Normal
Due date:
Assignee: -
% Done: 0%
Category: Commands
Spent time: -
Target version: TXM X.X

Description

Add corpus parameters to control how concordance lines display CQP tokens and structural units.

Solution 1: tokens and structures

  • use the corpus-language parameter/property to apply the right txt-renderer of that language when building the content of the contexts of the lines
    • a txt-renderer is a component producing raw text (txt) by using language specific typographic rules to display the graphical forms of tokens
    • the Pager already uses a txt-renderer component to build edition pages content
  • add a new concordance-display-opening-structural-unit-separator corpus parameter
    • possible values: '<', '(', '|'
  • add a new concordance-display-opening-structural-units corpus parameter
    • possible values: div,sp,p
  • add a new concordance-display-closing-structural-unit-separator corpus parameter
    • possible values: '>', ')', '|'
  • add a new concordance-display-closing-structural-units corpus parameter
    • possible values: div,sp,p
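
A minimal sketch of how the four parameters above could drive the rendering of a concordance context. This is only one possible reading of the proposal: the Token and Boundary classes, the render_context function and the PARAMS dictionary are hypothetical names, not TXM code, and a plain space join stands in for the language-specific txt-renderer.

```python
from dataclasses import dataclass

# Hypothetical corpus parameters, named after the proposal above.
PARAMS = {
    "concordance-display-opening-structural-unit-separator": "<",
    "concordance-display-opening-structural-units": ["div", "sp", "p"],
    "concordance-display-closing-structural-unit-separator": ">",
    "concordance-display-closing-structural-units": ["p"],
}


@dataclass
class Token:
    form: str  # graphical form of a CQP token


@dataclass
class Boundary:
    name: str      # structure name, e.g. "p"
    opening: bool  # True for a start boundary, False for an end boundary


def render_context(items, params):
    """Join token forms, adding a separator at each boundary selected for display."""
    out = []
    for item in items:
        if isinstance(item, Token):
            out.append(item.form)
        elif item.opening:
            if item.name in params["concordance-display-opening-structural-units"]:
                out.append(params["concordance-display-opening-structural-unit-separator"])
        elif item.name in params["concordance-display-closing-structural-units"]:
            out.append(params["concordance-display-closing-structural-unit-separator"])
    # Assumption: a real txt-renderer would apply language-specific spacing here.
    return " ".join(out)


context = [
    Boundary("p", True), Token("Hello"), Token("world"), Boundary("p", False),
    Boundary("s", True), Token("Bye"), Boundary("s", False),
]
print(render_context(context, PARAMS))  # prints "< Hello world > Bye"
```

Boundaries of structures that are not listed (here "s") are silently skipped, so the lists of structural units act as display filters.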

Solution 2: tokens versus structures

For some studies on infra-word units (like syllables), a double corpus strategy is often used:
  • corpus A builds CQP tokens as text words, and the infra-word units are encoded through concatenation inside the word form (e.g. 'gi-ra-fe')
  • corpus B builds CQP tokens as infra-word units, and text words are encoded as <word> structures (e.g. <word form="girafe" seg-form="gi-ra-fe">gi ra fe</word>)
To display word forms correctly in text editions and concordance lines:
  • add a new word-as-structure corpus parameter
    • possible values: 'word' or <empty>
    • if the parameter value is <empty>, use corpus tokens to render text words
    • if the parameter value is the name of a structure, use that structure to render text words

When building text edition pages and concordance line contexts, use that parameter value to render word forms correctly and ease reading.
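
A minimal sketch of the word-as-structure behaviour, assuming structures are available as (start, end, attributes) spans over token positions with an inclusive end. The render_words function and the data layout are hypothetical, and taking the displayed word from the form attribute is an assumption based on the <word> example above.

```python
def render_words(tokens, structures, word_as_structure=""):
    """Render text words from tokens, or from the structure named by the parameter."""
    if not word_as_structure:
        # <empty> value: corpus tokens are the text words (corpus A case).
        return " ".join(tokens)
    out = []
    pos = 0
    spans = sorted(structures[word_as_structure], key=lambda span: span[0])
    for start, end, attrs in spans:
        out.extend(tokens[pos:start])  # tokens outside any word structure
        out.append(attrs["form"])      # assumption: 'form' holds the word to display
        pos = end + 1
    out.extend(tokens[pos:])
    return " ".join(out)


# Corpus B case: tokens are syllables, words are <word> structures.
tokens = ["la", "gi", "ra", "fe", "dort"]
structures = {
    "word": [
        (0, 0, {"form": "la"}),
        (1, 3, {"form": "girafe", "seg-form": "gi-ra-fe"}),
        (4, 4, {"form": "dort"}),
    ]
}
print(render_words(tokens, structures))          # prints "la gi ra fe dort"
print(render_words(tokens, structures, "word"))  # prints "la girafe dort"
```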

Remark: some XSLT stylesheets already exist to build correct text editions for these two cases.

History

#1 Updated by Serge Heiden over 2 years ago

  • Description updated (diff)
  • Category changed from Administration to Commands

#2 Updated by Serge Heiden over 2 years ago

  • Description updated (diff)

#3 Updated by Matthieu Decorde over 1 year ago

  • Target version changed from 5 to TXM 0.8.0a (split/restructuration)

#4 Updated by Sebastien Jacquot about 1 year ago

  • Target version changed from TXM 0.8.0a (split/restructuration) to TXM 0.8.0

#5 Updated by Sebastien Jacquot 8 months ago

  • Target version changed from TXM 0.8.0 to TXM X.X
