Feature #969

TXM: X.X, (GWT + RCP) textual planes

Added by Serge Heiden almost 5 years ago. Updated almost 5 years ago.

Status: New
Start date: 08/07/2014
Priority: Normal
Due date:
Assignee: -
% Done: 0%
Category: Toolbox
Spent time: -
Target version: TXM X.X

Description

A text may be composed of several textual planes:
  • the basic text (also called the primary text or main body of text)
  • section titles (one may choose to treat these as not part of the basic text content, for example when the title was added by the scientific editor)
  • citations, possibly in different languages (one may choose to differentiate these from the basic text, for example to build the lexicon of the basic text only)
  • speaker discourses (one may choose to include or to ignore the discourse of specific speakers)
  • footnote text, possibly in different languages and possibly embedded (hierarchy of critical apparatus)

TXM should let the user choose which textual planes are taken into account when searching or counting.
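As a rough illustration of what plane-aware counting could mean, here is a minimal Python sketch. It is not TXM code: the `Token` class, the plane labels (`body`, `title`, `citation`, `note`) and the `lexicon` function are all hypothetical names chosen for this example, and it assumes each token already carries the label of the plane it belongs to.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    form: str          # word form as it appears in the text
    plane: str         # hypothetical plane label: "body", "title", "citation", "note"
    lang: str = "fr"   # language of the token (assumed default for this example)


def lexicon(tokens, planes):
    """Count word forms, keeping only tokens whose plane is selected."""
    selected = set(planes)
    return Counter(t.form for t in tokens if t.plane in selected)


# Small made-up sample mixing several planes and two languages.
TOKENS = [
    Token("Chapitre", "title"),
    Token("un", "title"),
    Token("le", "body"),
    Token("roi", "body"),
    Token("dit", "body"),
    Token("veni", "citation", "la"),
    Token("vidi", "citation", "la"),
    Token("le", "note"),
]

# Lexicon of the basic text only, then of the basic text plus footnotes.
body_only = lexicon(TOKENS, ["body"])
body_and_notes = lexicon(TOKENS, ["body", "note"])
```

With this sample, the form "le" is counted once in `body_only` and twice in `body_and_notes`, and the Latin citation never enters either lexicon.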

Solution:
  • make import modules aware of textual planes and design corresponding import parameters
    • drive NLP tools with respect to languages found in textual planes
  • design a UI to select which textual planes are used
  • relate textual planes to sub-corpus management (for example through the 'sequence of textual planes' CQP encoding strategy)
  • relate textual planes to 'Text edition with source text not in corpus' management, see #968
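One possible reading of the 'sequence of textual planes' strategy is sketched below in Python. This is an interpretation, not the encoding TXM actually uses: it assumes the import module emits one plane label per token, collapses consecutive identical labels into regions with inclusive start and end positions (similar in spirit to the regions of a CQP structural attribute), and then derives the token positions of a sub-corpus from the selected planes. The function names `plane_regions` and `select_positions` are hypothetical.

```python
from itertools import groupby


def plane_regions(planes):
    """Collapse a per-token list of plane labels into (plane, start, end) regions.

    Positions are 0-based and inclusive at both ends.
    """
    regions = []
    pos = 0
    for plane, run in groupby(planes):
        length = sum(1 for _ in run)
        regions.append((plane, pos, pos + length - 1))
        pos += length
    return regions


def select_positions(regions, selected):
    """Return the token positions covered by regions of the selected planes."""
    keep = set(selected)
    return [
        p
        for plane, start, end in regions
        if plane in keep
        for p in range(start, end + 1)
    ]


# Made-up sequence: a two-token title, body text, one footnote token, body again.
PLANES = ["title", "title", "body", "body", "body", "note", "body"]
REGIONS = plane_regions(PLANES)
BODY_POSITIONS = select_positions(REGIONS, ["body"])
```

In this sketch a UI selection of planes would simply become the `selected` argument, and the resulting positions would define the sub-corpus on which searches and counts run.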

History

#1 Updated by Serge Heiden almost 5 years ago

  • Category set to Toolbox
  • Target version set to TXM X.X
