Bug #833

RCP: 0.7.5, Import, tokenizer parameters

Added by Alexey Lavrentev almost 5 years ago. Updated about 1 month ago.

Status: New
Start date: 05/23/2014
Priority: Normal
Due date:
Assignee: -
% Done: 0%
Category: Import
Spent time: -
Target version: TXM X.X

Description

1. The vertical bar should not be used inside brackets: within a character class it is a literal character, not an alternation operator.
Replace:

  • [\p{Z}|\p{C}]+ with [\p{Z}\p{C}]+
  • [\p{Ps}|\p{Pe}|\p{Pi}|\p{Pf}|\p{Po}|\p{S}] with [\p{Ps}\p{Pe}\p{Pi}\p{Pf}\p{Po}\p{S}]
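The difference between the two forms of the first pattern can be checked with a small sketch. It assumes the tokenizer parameters are compiled with java.util.regex; the class and method names are hypothetical and do not come from the TXM code.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of TXM.
class CharClassPipeDemo {
    // First pattern from the issue, in its current and its proposed form.
    static final Pattern WITH_BAR = Pattern.compile("[\\p{Z}|\\p{C}]+");
    static final Pattern WITHOUT_BAR = Pattern.compile("[\\p{Z}\\p{C}]+");

    // True if the pattern accepts a lone vertical bar as a match.
    static boolean matchesBar(Pattern p) {
        return p.matcher("|").matches();
    }

    public static void main(String[] args) {
        // Inside brackets the bar is a literal member of the class.
        System.out.println(matchesBar(WITH_BAR));    // true
        // Without the bar, only separators and control characters match.
        System.out.println(matchesBar(WITHOUT_BAR)); // false
    }
}
```

With the current form, a literal "|" in the input is treated like whitespace. In the second pattern the bars make no difference to matching, since \p{S} already covers "|", so removing them there only makes the pattern easier to read.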

2. Grouping with parentheses causes an endless loop in the tokenizer.

Either the tokenizer code must be changed, or unescaped parentheses must not be allowed in the tokenizer parameters.
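For the second option, a minimal sketch of a parameter check is given here. The class and method names are hypothetical, not taken from the TXM code. The check is deliberately simple: it treats any parenthesis not preceded by a backslash as unescaped, and it does not handle parentheses inside brackets or inside \Q...\E quoting.

```java
// Hypothetical helper, not part of TXM.
class TokenizerParamCheck {
    // Returns true if the regex fragment contains '(' or ')' that is not
    // escaped with a backslash.
    static boolean hasUnescapedParenthesis(String regex) {
        for (int i = 0; i < regex.length(); i++) {
            char c = regex.charAt(i);
            if (c == '\\') {
                i++; // skip the character following the backslash
            } else if (c == '(' || c == ')') {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(hasUnescapedParenthesis("[\\p{Z}\\p{C}]+"));  // false
        System.out.println(hasUnescapedParenthesis("(\\p{Z}|\\p{C})+")); // true
        System.out.println(hasUnescapedParenthesis("\\(\\p{Z}\\)"));     // false
    }
}
```

Such a check could reject a parameter before it reaches the tokenizer. It does not explain or fix the loop itself.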

History

#1 Updated by Matthieu Decorde over 4 years ago

  • Category set to Import

#2 Updated by Sebastien Jacquot over 4 years ago

  • Target version changed from TXM 0.7.7 to TXM 0.7.8

#3 Updated by Matthieu Decorde over 3 years ago

  • Target version changed from TXM 0.7.8 to TXM 0.8.0a (split/restructuration)

#4 Updated by Sebastien Jacquot 10 months ago

  • Target version changed from TXM 0.8.0a (split/restructuration) to TXM 0.8.0

#5 Updated by Matthieu Decorde about 1 month ago

  • Target version changed from TXM 0.8.0 to TXM X.X
