Task #2779

Structured Subcorpus query optimisation

Added by Matthieu Decorde 5 months ago. Updated 5 months ago.

Status: New
Start date: 02/26/2020
Priority: Normal
Due date:
Assignee: -
% Done: 80%
Category: Corpus
Spent time: -
Target version: TXM 0.8.1

Description

In TXM 0.8.0, structured subcorpus creation was based on "[] expand to structure" queries, which are quite slow compared to "<structure>[] expand to structure" queries.

Depending on the corpus matches, the two queries do not select the same matches. (There is no difference when they are computed on a MainCorpus, because it has only one match, from 0 to END.)

Solution

  • Replace the "[] expand to structure" query with the "<structure>[] expand to structure" query
  • Add a preference to switch back to the old query

Documentation

Add the new preference page and the new option.

The "<structure>[] expand to structure" query is faster, but it will not work if the first position of the structure is not included in the corpus matches.

Both queries can break the corpus limits, but the "[] expand to structure" query does so more often, since it needs only one available position in the structure to work.
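
The difference between the two queries can be illustrated with a small Python model. This is only a sketch under simplifying assumptions, not TXM or CQP code: structures and corpus matches are modelled as inclusive (start, end) position ranges, and the function names `expand_any_token`, `expand_start_token` and `positions` are hypothetical.

```python
from typing import List, Set, Tuple

Region = Tuple[int, int]  # inclusive (start, end) token positions


def positions(matches: List[Region]) -> Set[int]:
    """Return every token position covered by the corpus matches."""
    return {p for start, end in matches for p in range(start, end + 1)}


def expand_any_token(structures: List[Region], available: Set[int]) -> List[Region]:
    """Simplified model of `[] expand to structure`.

    A structure is selected as soon as ONE of its positions is available,
    so every position of every structure may have to be tested.
    """
    return [
        (start, end)
        for start, end in structures
        if any(p in available for p in range(start, end + 1))
    ]


def expand_start_token(structures: List[Region], available: Set[int]) -> List[Region]:
    """Simplified model of `<structure>[] expand to structure`.

    Only the FIRST position of each structure is tested, which is cheaper
    but misses structures whose first position is not available.
    """
    return [(start, end) for start, end in structures if start in available]


# Hypothetical data: three structures and a subcorpus with two matches.
structures = [(0, 4), (5, 9), (10, 14)]
subcorpus = positions([(3, 6), (10, 12)])
main_corpus = positions([(0, 14)])  # a single match from 0 to END

print(expand_any_token(structures, subcorpus))    # [(0, 4), (5, 9), (10, 14)]
print(expand_start_token(structures, subcorpus))  # [(5, 9), (10, 14)]
print(expand_any_token(structures, main_corpus) ==
      expand_start_token(structures, main_corpus))  # True
```

In this model, the structure (0, 4) is selected only by the old query, because position 0 is not part of the subcorpus. Both results contain positions outside the subcorpus matches (for example 7 to 9), which corresponds to breaking the corpus limits.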

History

#1 Updated by Matthieu Decorde 5 months ago

  • % Done changed from 0 to 80

#2 Updated by Matthieu Decorde 5 months ago

  • Description updated (diff)
