Feature #3105

Import, manage foreign word ids V1

Added by Matthieu Decorde about 1 year ago. Updated 7 months ago.

Status: New
Start date: 06/23/2021
Priority: High
Due date:
Assignee: -
% Done: 50%
Category: Import
Spent time: -
Target version: TXM 0.8.2

Description

Manage sources that have no word ids, or avoid propagating erroneous source ids into TXM words.

Solution 1

Implement parts of the strategy described here: https://groupes.renater.fr/wiki/txm-info/public/composant_de_tokenisation#strategie
(this strategy implies that a tokenizer pass is always run.)

The solution creates a new import parameter: doBuildWordIDs.

In this solution, the tokenizer is either activated or not (in the Tokenizer import section):
  • if the tokenizer is not activated: the user must provide the word elements and set their xml:id correctly. The import behaves as if doBuildWordIDs were not set.
  • if the tokenizer is activated: it can use the xml:id of pre-encoded words when the doBuildWordIDs import parameter is not set. Otherwise, the pre-encoded xml:id values are moved to the "foreign-id" property.
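
The two cases above can be sketched as follows. This is only an illustration in Python, not the TXM implementation: the function name assign_word_ids, its arguments, the "w_" id prefix and the counter-based numbering are all assumptions made for the example. It also assumes that a word without a source xml:id receives a generated id when the tokenizer is activated, which the description does not state, and it does not check for collisions between generated and pre-encoded ids.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:id attribute under the XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def assign_word_ids(root, tokenizer_enabled, do_build_word_ids, prefix="w_"):
    """Hypothetical sketch of the id policy described in this ticket.

    tokenizer_enabled and do_build_word_ids stand in for the Tokenizer
    import section setting and the doBuildWordIDs import parameter.
    """
    # Collect <w> elements, with or without a namespace (e.g. TEI).
    words = [
        el for el in root.iter()
        if isinstance(el.tag, str) and (el.tag == "w" or el.tag.endswith("}w"))
    ]

    if not tokenizer_enabled:
        # Case 1: the source must already provide every word id.
        missing = [w for w in words if XML_ID not in w.attrib]
        if missing:
            raise ValueError(f"{len(missing)} word element(s) lack xml:id")
        return root

    # Case 2: the tokenizer is activated.
    for position, w in enumerate(words, start=1):
        source_id = w.attrib.get(XML_ID)
        if source_id is not None and not do_build_word_ids:
            # Parameter not set: reuse the pre-encoded id as-is.
            continue
        if source_id is not None:
            # Parameter set: keep the source id in "foreign-id".
            w.set("foreign-id", source_id)
        # Assumed id format; the real TXM format may differ.
        w.set(XML_ID, f"{prefix}{position}")
    return root


if __name__ == "__main__":
    doc = ET.fromstring('<text><w xml:id="src1">a</w><w>b</w></text>')
    assign_word_ids(doc, tokenizer_enabled=True, do_build_word_ids=True)
    print(ET.tostring(doc, encoding="unicode"))
```

Keeping the source id in a separate "foreign-id" property means the original identifiers remain available after import even when TXM builds its own word ids.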

History

#1 Updated by Serge Heiden about 1 year ago

  • Description updated (diff)

#2 Updated by Serge Heiden 11 months ago

  • Description updated (diff)

#3 Updated by Matthieu Decorde 11 months ago

  • Target version changed from TXM 0.8.2 to TXM 0.8.3

#4 Updated by Matthieu Decorde 11 months ago

  • Target version changed from TXM 0.8.3 to TXM 0.8.2

#5 Updated by Matthieu Decorde 11 months ago

  • Description updated (diff)

#6 Updated by Matthieu Decorde 10 months ago

  • Priority changed from Normal to High

#7 Updated by Matthieu Decorde 10 months ago

  • Tracker changed from Task to Feature

#8 Updated by Matthieu Decorde 7 months ago

  • Subject changed from Import, manage foreign word ids to Import, manage foreign word ids V1
  • Description updated (diff)
  • % Done changed from 0 to 50
