Bug #833
RCP: 0.7.5, Import, tokenizer parameters
| Status: | New | Start date: | 05/23/2014 |
|---|---|---|---|
| Priority: | Normal | Due date: | |
| Assignee: | - | % Done: | 0% |
| Category: | Import | Spent time: | - |
| Target version: | TXM X.X | | |
Description
1. The vertical bar should not be used inside brackets: within a character class it is a literal character, not an alternation operator.
Replace:
[\p{Z}|\p{C}]+
with
[\p{Z}\p{C}]+
and replace:
[\p{Ps}|\p{Pe}|\p{Pi}|\p{Pf}|\p{Po}|\p{S}]
with
[\p{Ps}\p{Pe}\p{Pi}\p{Pf}\p{Po}\p{S}]
2. Grouping with parentheses causes an endless loop in the tokenizer.
Either the tokenizer code must be changed, or unescaped parentheses must be disallowed in the tokenizer parameters.
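Both points can be checked with a small standalone sketch, assuming the tokenizer parameters are compiled with java.util.regex. The class and method names below are hypothetical and are not part of the TXM code base. The first method shows that the bar inside brackets is matched literally; the second is one possible way to reject parameters containing unescaped parentheses, and it deliberately ignores \Q...\E quoting and nested character classes.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of TXM: illustrates the two points of this ticket.
class TokenizerRegexCheck {

    // True if the whole input matches the given regex.
    static boolean matchesFully(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    // True if the regex contains a parenthesis that is neither
    // backslash-escaped nor inside a character class.
    // Simplification: \Q...\E quoting and nested classes are not handled.
    static boolean hasUnescapedParenthesis(String regex) {
        boolean inClass = false;
        for (int i = 0; i < regex.length(); i++) {
            char c = regex.charAt(i);
            if (c == '\\') {
                i++; // skip the escaped character
            } else if (c == '[') {
                inClass = true;
            } else if (c == ']') {
                inClass = false;
            } else if (!inClass && (c == '(' || c == ')')) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Inside brackets the bar is a literal, so the original class also accepts "|".
        System.out.println(matchesFully("[\\p{Z}|\\p{C}]+", "|"));
        // The corrected class no longer accepts "|" but still accepts separators and controls.
        System.out.println(matchesFully("[\\p{Z}\\p{C}]+", "|"));
        System.out.println(matchesFully("[\\p{Z}\\p{C}]+", " \t"));
        // A grouped parameter would be rejected; escaped or bracketed parentheses pass.
        System.out.println(hasUnescapedParenthesis("(\\p{Z}|\\p{C})+"));
        System.out.println(hasUnescapedParenthesis("\\(|[()]"));
    }
}
```

In the second expression of point 1 the bar already belongs to \p{S}, so removing it there does not change what the class matches; the change only matters for the first expression.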
History
#1 Updated by Matthieu Decorde over 8 years ago
- Category set to Import
#2 Updated by Sebastien Jacquot over 8 years ago
- Target version changed from TXM 0.7.7 to TXM 0.7.8
#3 Updated by Matthieu Decorde over 7 years ago
- Target version changed from TXM 0.7.8 to TXM 0.8.0a (split/restructuration)
#4 Updated by Sebastien Jacquot over 4 years ago
- Target version changed from TXM 0.8.0a (split/restructuration) to TXM 0.8.0
#5 Updated by Matthieu Decorde almost 4 years ago
- Target version changed from TXM 0.8.0 to TXM X.X