Feature #1597

TBX: X.X, XML-XTZ import, encode milestones in word properties

Added by Matthieu Decorde about 4 years ago. Updated over 2 years ago.

Status: In Progress
Start date: 11/27/2015
Priority: Normal
Due date:
Assignee: -
% Done: 80%
Category: Import
Spent time: -
Target version: TXM - Oriflamms 1.0

Description

XML milestones are not implemented in the structural attributes of the CQP corpus data model.

Some corpora need to be queried directly through milestone-based encodings, because converting that information into the usual structural attributes is not appropriate.

Solution

A first step, which simply makes the position of milestones relative to words available, is to encode that position in word properties: for example, the distance from a word to the preceding and to the following milestone of a given type.

Four word properties are created for each milestone specified in the import form:

  • <milestone>start : distance in tokens to the nearest preceding milestone (implemented)
  • <milestone>end : distance in tokens to the nearest following milestone (implemented)
  • <milestone>id : identifier (@*:id) of the nearest preceding milestone (implemented)
  • <milestone>n : number (@n) of the nearest preceding milestone (not implemented)
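
The three implemented properties can be illustrated with a minimal Python sketch. This is not the TXM import code: the Word and Milestone classes, the encode_milestone function, and the "lb" milestone name are hypothetical. Two conventions are assumptions the description does not settle: a word directly adjacent to a milestone gets distance 0, and a word with no preceding or following milestone gets None.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Union


@dataclass
class Milestone:
    name: str                  # element name, e.g. "lb" (hypothetical example)
    id: Optional[str] = None   # value of the milestone's id attribute, if any


@dataclass
class Word:
    form: str


def encode_milestone(items: List[Union[Word, Milestone]], name: str) -> List[Dict]:
    """Return one property dict per word for the milestone called `name`."""
    words: List[Dict] = []

    # Forward pass: distance to, and id of, the nearest preceding milestone.
    since = None      # words seen since the last milestone (None: none seen yet)
    last_id = None
    for item in items:
        if isinstance(item, Milestone):
            if item.name == name:
                since, last_id = 0, item.id
        else:
            words.append({"form": item.form,
                          name + "start": since,
                          name + "end": None,
                          name + "id": last_id})
            if since is not None:
                since += 1

    # Backward pass: distance to the nearest following milestone.
    until = None      # words seen since the next milestone, walking backwards
    i = len(words) - 1
    for item in reversed(items):
        if isinstance(item, Milestone):
            if item.name == name:
                until = 0
        else:
            words[i][name + "end"] = until
            if until is not None:
                until += 1
            i -= 1
    return words


items = [Word("a"), Milestone("lb", "l1"), Word("b"), Word("c"),
         Milestone("lb", "l2"), Word("d")]
props = encode_milestone(items, "lb")
for p in props:
    print(p)
```

Under these assumptions, the word "c" gets lbstart = 1, lbend = 0 and lbid = "l1". A <milestone>n property could be filled in the forward pass in the same way as the id.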

History

#1 Updated by Serge Heiden about 4 years ago

  • Description updated (diff)

#2 Updated by Serge Heiden about 4 years ago

  • Description updated (diff)

#3 Updated by Matthieu Decorde almost 4 years ago

  • Category set to Import

#4 Updated by Matthieu Decorde almost 4 years ago

  • % Done changed from 0 to 80

#5 Updated by Matthieu Decorde almost 4 years ago

  • Subject changed from TBX: X.X, XML-XTZ import and milestones to TBX: X.X, XML-XTZ import, encode milestones in word properties

#6 Updated by Alexey Lavrentev over 2 years ago

  • Description updated (diff)
  • Status changed from New to In Progress

#7 Updated by Alexey Lavrentev over 2 years ago

  • Description updated (diff)
