Feature #3105

Import, manage foreign word ids V1

Added by Matthieu Decorde over 2 years ago. Updated almost 2 years ago.

Status: New
Start date: 06/23/2021
Priority: High
Due date:
Assignee: -
% Done: 50%
Category: Import
Spent time: -
Target version: TXM 0.8.2

Description

Manage sources that have no word ids, and avoid errors caused by propagating source ids into TXM words.

Solution 1

Implement parts of the strategy described here: https://groupes.renater.fr/wiki/txm-info/public/composant_de_tokenisation#strategie
(this strategy implies that a tokenizer process is always run.)

The solution adds a new import parameter: doBuildWordIDs.

In this solution, the tokenizer is either activated or not (in the Tokenizer import section):
  • if the tokenizer is not activated: the user must provide the word elements and set their xml:id correctly; the import behaves as if doBuildWordIDs were not set
  • if the tokenizer is activated: it can use the xml:id of pre-encoded words when the doBuildWordIDs import parameter is not set. Otherwise, the pre-encoded xml:id values are moved to the "foreign-id" property
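
The two cases above can be sketched as follows. This is only an illustration in Python, not TXM code: the function name assign_word_ids, the un-namespaced <w> element, the generated "w_N" id format, and storing "foreign-id" as an attribute of the word element are all assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Expanded name that ElementTree uses for the xml:id attribute.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"


def assign_word_ids(root, tokenizer_active, do_build_word_ids):
    """Hypothetical sketch of the word id policy described in this ticket."""
    words = list(root.iter("w"))  # assumption: words are plain <w> elements

    if not tokenizer_active:
        # Tokenizer off: the source must already carry correct word ids.
        missing = [w for w in words if XML_ID not in w.attrib]
        if missing:
            raise ValueError(f"{len(missing)} word(s) without xml:id")
        return root

    used = set()
    if do_build_word_ids:
        # Parameter set: move every source id to the "foreign-id" property.
        for w in words:
            if XML_ID in w.attrib:
                w.set("foreign-id", w.attrib.pop(XML_ID))
    else:
        # Parameter not set: keep the source ids and remember them
        # so that generated ids do not collide with them.
        used = {w.get(XML_ID) for w in words if XML_ID in w.attrib}

    counter = 0
    for w in words:
        if XML_ID in w.attrib:
            continue  # kept source id
        counter += 1
        while f"w_{counter}" in used:
            counter += 1
        w.set(XML_ID, f"w_{counter}")  # assumption: "w_N" id format
    return root


if __name__ == "__main__":
    src = '<text><w xml:id="a1">Hello</w><w>world</w></text>'
    out = assign_word_ids(ET.fromstring(src), True, True)
    print(ET.tostring(out, encoding="unicode"))
```

With the tokenizer activated and the parameter set, the first word's source id "a1" ends up in "foreign-id" and both words receive generated ids. With the parameter not set, "a1" is kept and only the second word receives a generated id.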

History

#1 Updated by Serge Heiden over 2 years ago

  • Description updated (diff)

#2 Updated by Serge Heiden about 2 years ago

  • Description updated (diff)

#3 Updated by Matthieu Decorde about 2 years ago

  • Target version changed from TXM 0.8.2 to TXM 0.8.4

#4 Updated by Matthieu Decorde about 2 years ago

  • Target version changed from TXM 0.8.4 to TXM 0.8.2

#5 Updated by Matthieu Decorde about 2 years ago

  • Description updated (diff)

#6 Updated by Matthieu Decorde about 2 years ago

  • Priority changed from Normal to High

#7 Updated by Matthieu Decorde about 2 years ago

  • Tracker changed from Task to Feature

#8 Updated by Matthieu Decorde almost 2 years ago

  • Subject changed from Import, manage foreign word ids to Import, manage foreign word ids V1
  • Description updated (diff)
  • % Done changed from 0 to 50
