Feature #969

TXM: X.X, (GWT + RCP) textual planes

Added by Serge Heiden about 11 years ago. Updated about 11 years ago.

Status: New
Start date: 07/08/2014
Priority: Normal
Due date:
Assignee: -
% Done: 0%
Category: Toolbox
Spent time: -
Target version: TXM 0.X.X

Description

A text may be composed of several textual planes:
  • the basic text, also called the primary text or main body of text
  • section titles (one may choose to exclude them from the basic text content, for example when a title was added by the scientific editor)
  • citations, possibly in different languages (one may choose to differentiate them from the basic text, for example to build the lexicon of the basic text alone)
  • speakers' discourse (one may choose to include or ignore the discourse of specific speakers)
  • footnote text, possibly in different languages and possibly embedded (hierarchy of critical apparatus)

We must help the user choose which textual planes are taken into account when searching or counting.
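
As a rough illustration of plane-aware counting, the sketch below tags each token with the textual plane it belongs to and builds a lexicon from a chosen set of planes. The `Token` type, the plane names and `lexicon()` are hypothetical and are not part of TXM.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical plane names derived from the list above; not TXM identifiers.
BODY, TITLE, CITATION, SPEECH, FOOTNOTE = (
    "body", "title", "citation", "speech", "footnote"
)


@dataclass(frozen=True)
class Token:
    word: str
    plane: str  # textual plane the token belongs to


def lexicon(tokens, planes):
    """Count word forms, keeping only tokens whose plane is selected."""
    selected = set(planes)
    return Counter(t.word for t in tokens if t.plane in selected)


tokens = [
    Token("Chapter", TITLE), Token("one", TITLE),
    Token("the", BODY), Token("king", BODY), Token("said", BODY),
    Token("veni", CITATION), Token("vidi", CITATION),
    Token("the", FOOTNOTE), Token("source", FOOTNOTE),
]

# Lexicon of the basic text alone, then of the basic text plus footnotes.
body_only = lexicon(tokens, [BODY])
body_and_notes = lexicon(tokens, [BODY, FOOTNOTE])
```

With this kind of model, excluding editor-added titles or foreign-language citations from a count amounts to leaving their plane out of the selection.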

Solution:
  • make import modules aware of textual planes and design corresponding import parameters
    • drive NLP tools according to the languages found in each textual plane
  • design UI to select which textual planes are used
  • relate textual planes to sub-corpus management (for example through the 'sequence of textual planes' CQP encoding strategy)
  • relate textual planes to 'Text edition with source text not in corpus' management, see #968
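
One possible shape for such import parameters is sketched below: a table mapping source elements to a textual plane, plus an optional default language per plane that decides which NLP model receives each segment. The TEI-like element names (`p`, `head`, `quote`, `sp`, `note`), the `PlaneRule` type and `group_by_language()` are assumptions made for illustration and do not describe an existing TXM import module.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PlaneRule:
    plane: str                          # name of the textual plane
    default_lang: Optional[str] = None  # None: inherit the main text language


# Hypothetical import parameters: source element name -> plane rule.
IMPORT_PLANES = {
    "p": PlaneRule("body"),
    "head": PlaneRule("title"),
    "quote": PlaneRule("citation"),
    "sp": PlaneRule("speech"),
    "note": PlaneRule("footnote"),
}


def group_by_language(segments, main_lang):
    """Group (element, text, lang) segments by the language used to annotate them.

    An explicit language on the segment wins, then the plane default,
    then the main language of the text.
    """
    groups = defaultdict(list)
    for element, text, lang in segments:
        # Unknown elements fall back to the basic text plane.
        rule = IMPORT_PLANES.get(element, PlaneRule("body"))
        effective = lang or rule.default_lang or main_lang
        groups[effective].append((rule.plane, text))
    return dict(groups)


segments = [
    ("head", "Livre premier", None),
    ("p", "Le roi dit alors", None),
    ("quote", "veni vidi vici", "la"),
    ("note", "See the Latin source", "en"),
]
batches = group_by_language(segments, main_lang="fr")
```

Each batch could then be passed to the tagger or lemmatizer configured for its language, while the plane label kept with every segment remains available to the UI for plane selection.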

History

#1 Updated by Serge Heiden about 11 years ago

  • Category set to Toolbox
  • Target version set to TXM 0.X.X
