Feature #1296

RCP: 0.7.7, Manage corpus with holes

Added by Sebastien Jacquot about 4 years ago. Updated about 4 years ago.

Status: New
Start date: 04/02/2015
Priority: Normal
Due date:
Assignee: -
% Done: 0%
Category: Corpus
Spent time: -
Target version: TXM X.X

Description

Should TXM manage corpora with holes?

Currently, a sub-corpus in CQP is an abstract result based on the root corpus (a named query result).
CQP doesn't seem to offer a way to create a sub-corpus with holes from the root corpus. For example, creating a sub-corpus from which some tags have been removed leads to some strange behaviors in TXM.
An Index/Lexicon does not return the tokens from the removed tags, as expected.
But the global context remains the root corpus, leading to:
  • "wrong" contexts in Concordance (the tokens contained in the removed tags still appear in the contexts)
  • "wrong" edition (the tokens contained in the removed tags still appear in the edition; is this a problem or not? At the least, a "diff" could highlight the root corpus vs. the sub-corpus)
  • the structural property information in the Internal view becomes wrong, because the positions do not take into account the holes in the sub-corpus
  • TODO: identify other potential problems in TXM commands
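
The behavior described above can be illustrated with a toy model. This is only a sketch, not CQP or TXM code: it assumes a sub-corpus is stored as a list of inclusive (start, end) position ranges into the root corpus, and all names (ROOT, SUB, naive_context, hole_aware_context) are hypothetical.

```python
from collections import Counter

# Toy root corpus: one token per position (illustrative data only).
ROOT = ["the", "cat", "sat", "on", "the", "mat", "and", "slept"]

# Hypothetical sub-corpus with a hole: positions 3..5 ("on the mat")
# stand for the content of a removed tag.
SUB = [(0, 2), (6, 7)]


def sub_positions(ranges):
    """Flatten inclusive (start, end) ranges into root positions."""
    return [p for start, end in ranges for p in range(start, end + 1)]


def lexicon(root, ranges):
    """Count only the tokens that belong to the sub-corpus."""
    return Counter(root[p] for p in sub_positions(ranges))


def naive_context(root, pos, width):
    """Context window taken on the root corpus: it ignores the holes."""
    lo = max(0, pos - width)
    hi = min(len(root), pos + width + 1)
    return root[lo:hi]


def hole_aware_context(root, ranges, pos, width):
    """Context window taken on sub-corpus positions only."""
    positions = sub_positions(ranges)
    i = positions.index(pos)  # offset of pos inside the sub-corpus
    lo = max(0, i - width)
    return [root[p] for p in positions[lo:i + width + 1]]


if __name__ == "__main__":
    print(lexicon(ROOT, SUB))                    # "mat" is absent, as expected
    print(naive_context(ROOT, 6, 2))             # leaks tokens from the hole
    print(hole_aware_context(ROOT, SUB, 6, 2))   # stays inside the sub-corpus
    print(sub_positions(SUB).index(6))           # sub-corpus offset, not 6
```

In this model the lexicon is correct, but a context window computed on root positions around "and" (root position 6) pulls in "the" and "mat" from the hole, and the root position 6 differs from the token's offset within the sub-corpus. Any command that displays contexts or positions would need a mapping of the second kind to be hole-aware.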

History

#1 Updated by Sebastien Jacquot about 4 years ago

  • Description updated (diff)
