Task #1666
Task #1630: TBX: improve performances of import process
TBX: improve performances of tokenizing process
| Status: | New | Start date: | 02/10/2016 |
|---|---|---|---|
| Priority: | Normal | Due date: | |
| Assignee: | - | % Done: | 80% |
| Category: | Import | Spent time: | - |
| Target version: | TXM 0.7.8 | | |
Description
Several improvements can be made to the Groovy code of the tokenizing sections.
Streams
Wrap FileOutputStreams with BufferedOutputStreams.
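A minimal sketch of the wrapping described above, written in Java since the Groovy scripts use the same JDK stream classes. The class name BufferedWriteSketch and the method writeTokens are hypothetical and do not come from the TXM sources; the point is only that many small writes go through the buffer instead of hitting the file one call at a time.

```java
import java.io.BufferedOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;

// Hypothetical helper, not part of TXM: illustrates the wrapping only.
class BufferedWriteSketch {

    // Writes each token followed by a newline. The BufferedOutputStream
    // collects the small writes in memory and flushes them in larger chunks.
    static void writeTokens(Path out, String[] tokens) throws IOException {
        try (OutputStream os =
                 new BufferedOutputStream(new FileOutputStream(out.toFile()))) {
            for (String token : tokens) {
                os.write(token.getBytes(StandardCharsets.UTF_8));
                os.write('\n');
            }
        } // try-with-resources flushes the buffer and closes the file
    }
}
```

Closing (or flushing) the outer stream matters: data still sitting in the buffer is lost if only the inner FileOutputStream is closed.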
Compiling REGEX patterns
- compile all REGEX patterns that are used in Groovy scripts (use "= ~", i.e. an assignment followed by the ~ pattern operator, not the "=~" find operator)
- e.g. replace from: reg3pts = /\A(.*)(\.\.\.)(.*)\Z/ to: reg3pts = ~/\A(.*)(\.\.\.)(.*)\Z/
- also compile the patterns used in replaceAll(), split(), etc., then call these methods on the compiled Pattern (split()) or on its Matcher (replaceAll(), matches()) rather than on the String class
- see Pattern.compile() for Java code sections
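The points above can be sketched in Java as follows. The class and method names are hypothetical, and the "\\s+" whitespace pattern is only a stand-in for whatever the tokenizer actually uses; the ellipsis pattern is the reg3pts expression quoted above.

```java
import java.util.regex.Pattern;

// Hypothetical illustration, not TXM code: each pattern is compiled once
// and reused, instead of being recompiled by String.split()/replaceAll().
class CompiledPatternSketch {

    // Same expression as reg3pts in the ticket.
    private static final Pattern REG_3PTS =
        Pattern.compile("\\A(.*)(\\.\\.\\.)(.*)\\Z");

    // Assumption: a plain whitespace pattern standing in for the real one.
    private static final Pattern REG_WHITESPACES = Pattern.compile("\\s+");

    // Unicode "Other" category (control, format, unassigned, ...).
    private static final Pattern REG_CTRL = Pattern.compile("\\p{C}");

    // Replaces s.split("\\s+"): split() is called on the Pattern.
    static String[] splitOnWhitespace(String s) {
        return REG_WHITESPACES.split(s);
    }

    // Replaces s.replaceAll("\\p{C}", ""): replaceAll() is called on the Matcher.
    static String stripControlChars(String s) {
        return REG_CTRL.matcher(s).replaceAll("");
    }

    // Replaces s.matches(...): matches() is called on the Matcher.
    static boolean containsEllipsis(String s) {
        return REG_3PTS.matcher(s).matches();
    }
}
```

String.split(), String.replaceAll() and String.matches() compile their regex argument on each call (split() has a fast path only for some single-character separators), which is the cost this change avoids inside a tokenizing loop.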
Use static Groovy compilation to avoid the runtime reflection
see:
- import groovy.transform.CompileStatic
- import static groovy.transform.TypeCheckingMode.SKIP
- @CompileStatic
- @CompileStatic(SKIP) => may be used when a script cannot be statically compiled because it relies on dynamic Groovy syntax. A better solution is to remove that syntax so the code can be statically compiled
- add @CompileStatic before "public class SimpleTokenizerXml" in SimpleTokenizer.groovy
- edit protected String standardChecks(String s) so it can be statically compiled
- compile all reg3pts, regPunct, etc. member patterns using "= ~" instead of "="
- extract the patterns passed to the split(), replaceAll() and matches() methods and store them as compiled members, e.g.:
regSplitWhiteSpaces = Pattern.compile(TokenizerClasses.whitespaces); regLN = Pattern.compile("\n"); regCTRL = Pattern.compile("\\p{C}");
History
#1 Updated by Sebastien Jacquot over 7 years ago
- Parent task set to #1630
#2 Updated by Matthieu Decorde over 7 years ago
- Description updated (diff)
#3 Updated by Matthieu Decorde over 7 years ago
- Description updated (diff)
#4 Updated by Matthieu Decorde about 7 years ago
- % Done changed from 0 to 50
#5 Updated by Matthieu Decorde about 7 years ago
- % Done changed from 50 to 60
#6 Updated by Matthieu Decorde about 7 years ago
- % Done changed from 60 to 80
Checked that the import modules of the Import menu run correctly.