Feature #2027

RCP: X.X, Concordance view corpus parameters

Added by Matthieu Decorde over a year ago. Updated 18 days ago.

Status: New    Start date: 09/02/2017
Priority: Normal    Due date:
Assignee: -    % Done: 0%
Category: Commands    Spent time: -
Target version: TXM 0.8.0

Description

Add corpus parameters to control how concordance lines display CQP tokens and structural units.

Solution 1: tokens and structures

  • use the corpus-language parameter/property to apply that language's txt-renderer when building the contexts of the concordance lines
    • a txt-renderer is a component producing raw text (txt) by using language specific typographic rules to display the graphical forms of tokens
    • the Pager already uses a txt-renderer component to build edition pages content
  • add a new concordance-display-opening-structural-unit-separator corpus parameter
    • possible values: '<', '(', '|'
  • add a new concordance-display-opening-structural-units corpus parameter
    • possible values: div,sp,p
  • add a new concordance-display-closing-structural-unit-separator corpus parameter
    • possible values: '>', ')', '|'
  • add a new concordance-display-closing-structural-units corpus parameter
    • possible values: div,sp,p
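
The parameters above could combine along the following lines. This is only a sketch under stated assumptions, not TXM code: `DisplayParams`, `render_context` and the `NO_SPACE_BEFORE` table are hypothetical names, the punctuation rules are a deliberately tiny stand-in for a real txt-renderer, and the ticket does not say exactly what is printed at a structure boundary (here the separator character alone is emitted).

```python
from dataclasses import dataclass

# Hypothetical, minimal typographic rules: punctuation tokens that attach to
# the previous token without a space. French keeps a space before ; : ! ?
NO_SPACE_BEFORE = {
    "en": {",", ".", ";", ":", "!", "?", ")"},
    "fr": {",", ".", ")"},
}


@dataclass
class DisplayParams:
    """Hypothetical holder for the corpus parameters proposed in the ticket."""
    language: str = "en"                      # corpus-language
    opening_separator: str = "<"              # ...-opening-structural-unit-separator
    opening_units: tuple = ("div", "sp", "p")  # ...-opening-structural-units
    closing_separator: str = ">"              # ...-closing-structural-unit-separator
    closing_units: tuple = ("div", "sp", "p")  # ...-closing-structural-units


def render_context(items, params):
    """Render one concordance context as raw text.

    items is a list of (kind, value) pairs where kind is "token"
    (value is the graphical form) or "open"/"close" (value is the
    structural unit name).
    """
    no_space = NO_SPACE_BEFORE.get(params.language, set())
    out = ""
    for kind, value in items:
        if kind == "token":
            piece = value
        elif kind == "open" and value in params.opening_units:
            piece = params.opening_separator
        elif kind == "close" and value in params.closing_units:
            piece = params.closing_separator
        else:
            continue  # structural unit not selected for display
        # No leading space at the start, nor before attached punctuation.
        attach = kind == "token" and piece in no_space
        out += ("" if not out or attach else " ") + piece
    return out


context = [("open", "p"), ("token", "Hello"), ("token", "!"),
           ("close", "p"), ("open", "s"), ("token", "Bye")]
print(render_context(context, DisplayParams(language="en")))
print(render_context(context, DisplayParams(language="fr")))
```

Units that are not listed in the opening or closing parameter (the `s` structure in the sample) are simply skipped, which matches the idea of selecting which structural units show up in the lines.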

Solution 2: tokens versus structures

For some studies on infra-word units (like syllables), a double corpus strategy is often used:
  • corpus A builds CQP tokens as text words, and the infra-word units are encoded through concatenation inside the word form (e.g. 'gi-ra-fe')
  • corpus B builds CQP tokens as infra-word units, and text words are encoded as <word> structures (e.g. <word form="girafe" seg-form="gi-ra-fe">gi ra fe</word>)
To display word forms correctly in text editions and concordance lines:
  • add a new word-as-structure corpus parameter
    • possible values: 'word' or <empty>
    • if the parameter value is <empty>, use corpus tokens to render text words
    • if the parameter value is the name of a structure, use that structure to render text words

When building text editions and concordance line contexts, use that parameter value to render word forms correctly and ease reading.
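
As an illustration of the word-as-structure switch, the sketch below renders the same corpus B data either token by token or word by word. It is an assumption-laden example, not TXM code: `render_words` and the dictionary layout of the structures are hypothetical, and taking the displayed form from the `form` attribute (falling back to concatenating the tokens) is one possible reading of the example in the ticket.

```python
def render_words(tokens, structures, word_as_structure=""):
    """Render text words from CQP tokens.

    tokens: list of token forms (infra-word units in corpus B).
    structures: list of dicts with keys "name", "start", "end" (inclusive
    token positions) and "attrs" (structure attributes).
    word_as_structure: value of the proposed corpus parameter; an empty
    string means "render the tokens themselves".
    """
    if not word_as_structure:
        return " ".join(tokens)

    # Index the selected structures by the position of their first token.
    spans = {s["start"]: s for s in structures
             if s["name"] == word_as_structure}

    words = []
    i = 0
    while i < len(tokens):
        span = spans.get(i)
        if span is None:
            words.append(tokens[i])  # token outside any word structure
            i += 1
            continue
        covered = tokens[i:span["end"] + 1]
        # Assumption: prefer the "form" attribute, else glue the units.
        words.append(span["attrs"].get("form", "".join(covered)))
        i = span["end"] + 1
    return " ".join(words)


tokens = ["la", "gi", "ra", "fe"]
structures = [
    {"name": "word", "start": 0, "end": 0, "attrs": {"form": "la"}},
    {"name": "word", "start": 1, "end": 3,
     "attrs": {"form": "girafe", "seg-form": "gi-ra-fe"}},
]
print(render_words(tokens, structures, ""))      # tokens as they are
print(render_words(tokens, structures, "word"))  # one form per <word>
```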

Remark: some XSLT stylesheets already exist to build correct text editions for these two cases.

History

#1 Updated by Serge Heiden over a year ago

  • Description updated (diff)
  • Category changed from Administration to Commands

#2 Updated by Serge Heiden over a year ago

  • Description updated (diff)

#3 Updated by Matthieu Decorde 7 months ago

  • Target version changed from 5 to TXM 0.8.0a (split/restructuration)

#4 Updated by Sebastien Jacquot 18 days ago

  • Target version changed from TXM 0.8.0a (split/restructuration) to TXM 0.8.0
