Bug #2793
XTZ, XML/w and Transcriber Import, XLSX and ODS metadata files not supported when XSL directory is present
| Status: | New | Start date: | 04/08/2020 |
|---|---|---|---|
| Priority: | Urgent | Due date: | |
| Assignee: | - | % Done: | 80% |
| Category: | Import | Spent time: | - |
| Target version: | TXM 0.8.2 | | |
Description
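This happens because TXM tries to run the XSLT transformations on the XLSX and ODS metadata files as well as on the XML sources. Both formats are ZIP archives rather than XML, so the XML parser fails at line 1, column 1 with "Contenu non autorisé dans le prologue" ("Content is not allowed in prolog"). The module works properly with metadata.csv.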
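To illustrate why the parser stops at the very first byte, here is a minimal sketch that checks whether a file starts with the ZIP signature (`PK\x03\x04`), which is how XLSX and ODS files begin. The class name `ZipSniffer` and its methods are hypothetical and are not part of TXM; this is only a way to see that such files cannot be valid XML input.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not TXM code: detects ZIP-based files such as .xlsx and .ods.
public class ZipSniffer {

    // ZIP local file header signature: 'P', 'K', 0x03, 0x04.
    private static final byte[] ZIP_MAGIC = {0x50, 0x4B, 0x03, 0x04};

    /** Returns true if the given bytes start with the ZIP signature. */
    public static boolean looksLikeZip(byte[] head) {
        if (head.length < ZIP_MAGIC.length) {
            return false;
        }
        for (int i = 0; i < ZIP_MAGIC.length; i++) {
            if (head[i] != ZIP_MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    /** Reads the first four bytes of a file and checks them (requires Java 11+). */
    public static boolean looksLikeZip(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            return looksLikeZip(in.readNBytes(ZIP_MAGIC.length));
        }
    }

    public static void main(String[] args) throws IOException {
        // Usage: java ZipSniffer metadata.xlsx metadata.ods t1.xml
        for (String arg : args) {
            Path file = Paths.get(arg);
            System.out.println(file + " looks like a ZIP archive: " + looksLikeZip(file));
        }
    }
}
```

An XML parser expects `<` (possibly after a byte order mark or whitespace) at the start of the document, so a file beginning with `PK` is rejected before any content is read.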
XLSX:
Démarrage de TXM 0.8.0.2221 (2019-08-30 14h42)…
TXM est prêt.
The JOUBERTXTZOG corpus will be created from the /home/alavrent/Bureau/Ex5/joubert-xtz-og directory.
The 'annotate' import parameter has been activated since TreeTagger is installed.
Sauvegarde des paramètres d'importation…
Démarrage du script d'import Groovy xtzLoader.groovy.
[[id, auteur, titre, extrait, date, ville], [joubert1579_1-02, Joubert, Laurent, Erreurs populaires, Livre 1, ch. 2, 1579, Bordeaux], [joubert1587_1-02, Joubert, Laurent, Erreurs populaires, Livre 1, ch. 2, 1587, Paris]]
-- Split-Merge XSL Step with /home/alavrent/Bureau/Ex5/joubert-xtz-og/xsl/1-split-merge
-- Front XSL Step with the /home/alavrent/Bureau/Ex5/joubert-xtz-og/xsl/2-front directory.
ApplyXsl2 with the /home/alavrent/Bureau/Ex5/joubert-xtz-og/xsl/2-front/01-txm-front-teip5-og-xtz-joubert-removeAncor.xsl stylesheet.
-- Applying /home/alavrent/Bureau/Ex5/joubert-xtz-og/xsl/2-front/01-txm-front-teip5-og-xtz-joubert-removeAncor.xsl XSL to 3 (from /home/alavrent/TXM-0.8.0/corpora/JOUBERTXTZOG/src) files with parameters: {output-directory=file:/home/alavrent/TXM-0.8.0/corpora/JOUBERTXTZOG/txm/JOUBERTXTZOG/} on directory /home/alavrent/TXM-0.8.0/corpora/JOUBERTXTZOG/src result written in /home/alavrent/TXM-0.8.0/corpora/JOUBERTXTZOG/txm/JOUBERTXTZOG
003 .
Error on line 1 column 1 of metadata.xlsx: SXXP0003: Error reported by XML parser: Contenu non autorisé dans le prologue.
net.sf.saxon.trans.XPathException: org.xml.sax.SAXParseException; systemId: file:/home/alavrent/TXM-0.8.0/corpora/JOUBERTXTZOG/src/metadata.xlsx; lineNumber: 1; columnNumber: 1; Contenu non autorisé dans le prologue.
    at net.sf.saxon.event.Sender.sendSAXSource(Sender.java:425)
    at net.sf.saxon.event.Sender.send(Sender.java:178)
    at net.sf.saxon.Controller.transform(Controller.java:1790)
    at org.txm.importer.ApplyXsl2.process(ApplyXsl2.java:304)
    at org.txm.importer.ApplyXsl2.processImportSources(ApplyXsl2.java:437)
    at org.txm.importer.ApplyXsl2.processImportSources(ApplyXsl2.java:363)
    at org.txm.importer.ApplyXsl2$processImportSources.call(Unknown Source)
    ...
    at org.txm.core.engines.ScriptedImportEngine.build(ScriptedImportEngine.java:56)
    at org.txm.objects.Project._compute(Project.java:320)
    at org.txm.core.results.TXMResult.compute(TXMResult.java:2224)
    at org.txm.core.results.TXMResult.compute(TXMResult.java:2143)
    at org.txm.rcp.handlers.scripts.ExecuteImportScript$2.run(ExecuteImportScript.java:146)
    at org.eclipse.core.internal.jobs.Worker.run(Worker.java:56)
Caused by: org.xml.sax.SAXParseException; systemId: file:/home/alavrent/TXM-0.8.0/corpora/JOUBERTXTZOG/src/metadata.xlsx; lineNumber: 1; columnNumber: 1; Contenu non autorisé dans le prologue.
    at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:203)
    at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(ErrorHandlerWrapper.java:177)
    ...
    at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1213)
    at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:643)
    at net.sf.saxon.event.Sender.sendSAXSource(Sender.java:405)
    ... 37 more
---------
org.xml.sax.SAXParseException; systemId: file:/home/alavrent/TXM-0.8.0/corpora/JOUBERTXTZOG/src/metadata.xlsx; lineNumber: 1; columnNumber: 1; Contenu non autorisé dans le prologue.
    at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:203)
    at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(ErrorHandlerWrapper.java:177)
    at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:400)
    ...
    at org.txm.scripts.importer.xtz.xtzLoader.run(xtzLoader.groovy:58)
    at groovy.util.GroovyScriptEngine.run(GroovyScriptEngine.java:599)
    at org.txm.groovy.core.GroovyScriptedImportEngine._build(GroovyScriptedImportEngine.java:123)
    at org.txm.core.engines.ScriptedImportEngine.build(ScriptedImportEngine.java:56)
    at org.txm.objects.Project._compute(Project.java:320)
    at org.txm.core.results.TXMResult.compute(TXMResult.java:2224)
    at org.txm.core.results.TXMResult.compute(TXMResult.java:2143)
    at org.txm.rcp.handlers.scripts.ExecuteImportScript$2.run(ExecuteImportScript.java:146)
    at org.eclipse.core.internal.jobs.Worker.run(Worker.java:56)
ODS:
Sauvegarde des paramètres d'importation…
Démarrage du script d'import Groovy xtzLoader.groovy.
Warning: the 7the column name is empty
-- Split-Merge XSL Step with /home/alavrent/Bureau/Ex5/joubert-xtz-og/xsl/1-split-merge
-- Front XSL Step with the /home/alavrent/Bureau/Ex5/joubert-xtz-og/xsl/2-front directory.
ApplyXsl2 with the /home/alavrent/Bureau/Ex5/joubert-xtz-og/xsl/2-front/01-txm-front-teip5-og-xtz-joubert-removeAncor.xsl stylesheet.
-- Applying /home/alavrent/Bureau/Ex5/joubert-xtz-og/xsl/2-front/01-txm-front-teip5-og-xtz-joubert-removeAncor.xsl XSL to 3 (from /home/alavrent/TXM-0.8.0/corpora/JOUBERTXTZOG/src) files with parameters: {output-directory=file:/home/alavrent/TXM-0.8.0/corpora/JOUBERTXTZOG/txm/JOUBERTXTZOG/} on directory /home/alavrent/TXM-0.8.0/corpora/JOUBERTXTZOG/src result written in /home/alavrent/TXM-0.8.0/corpora/JOUBERTXTZOG/txm/JOUBERTXTZOG
003 ...
Error on line 1 column 1 of metadata.ods: SXXP0003: Error reported by XML parser: Contenu non autorisé dans le prologue.
net.sf.saxon.trans.XPathException: org.xml.sax.SAXParseException; systemId: file:/home/alavrent/TXM-0.8.0/corpora/JOUBERTXTZOG/src/metadata.ods; lineNumber: 1; columnNumber: 1; Contenu non autorisé dans le prologue.
    at net.sf.saxon.event.Sender.sendSAXSource(Sender.java:425)
    at net.sf.saxon.event.Sender.send(Sender.java:178)
    at net.sf.saxon.Controller.transform(Controller.java:1790)
    at org.txm.importer.ApplyXsl2.process(ApplyXsl2.java:304)
    at org.txm.importer.ApplyXsl2.processImportSources(ApplyXsl2.java:437)
    at org.txm.importer.ApplyXsl2.processImportSources(ApplyXsl2.java:363)
    ...
    at org.txm.scripts.importer.xtz.XTZImport.start(XTZImport.groovy:86)
    at org.txm.importer.xtz.ImportModule.process(ImportModule.java:242)
    at org.txm.importer.xtz.ImportModule$process$2.call(Unknown Source)
    at org.txm.scripts.importer.xtz.xtzLoader.run(xtzLoader.groovy:58)
    at groovy.util.GroovyScriptEngine.run(GroovyScriptEngine.java:599)
    at org.txm.groovy.core.GroovyScriptedImportEngine._build(GroovyScriptedImportEngine.java:123)
    at org.txm.core.engines.ScriptedImportEngine.build(ScriptedImportEngine.java:56)
    at org.txm.objects.Project._compute(Project.java:320)
    at org.txm.core.results.TXMResult.compute(TXMResult.java:2224)
    at org.txm.core.results.TXMResult.compute(TXMResult.java:2143)
    at org.txm.rcp.handlers.scripts.ExecuteImportScript$2.run(ExecuteImportScript.java:146)
    at org.eclipse.core.internal.jobs.Worker.run(Worker.java:56)
Caused by: org.xml.sax.SAXParseException; systemId: file:/home/alavrent/TXM-0.8.0/corpora/JOUBERTXTZOG/src/metadata.ods; lineNumber: 1; columnNumber: 1; Contenu non autorisé dans le prologue.
    at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:203)
    at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(ErrorHandlerWrapper.java:177)
    at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:400)
    ...
    at com.sun.org.apache.xerces.internal.parsers.XMLParser.parse(XMLParser.java:141)
    at com.sun.org.apache.xerces.internal.parsers.AbstractSAXParser.parse(AbstractSAXParser.java:1213)
    at com.sun.org.apache.xerces.internal.jaxp.SAXParserImpl$JAXPSAXParser.parse(SAXParserImpl.java:643)
    at net.sf.saxon.event.Sender.sendSAXSource(Sender.java:405)
    ... 31 more
---------
org.xml.sax.SAXParseException; systemId: file:/home/alavrent/TXM-0.8.0/corpora/JOUBERTXTZOG/src/metadata.ods; lineNumber: 1; columnNumber: 1; Contenu non autorisé dans le prologue.
    at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:203)
    at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(ErrorHandlerWrapper.java:177)
    a...
    at org.txm.scripts.importer.xtz.XTZImport.start(XTZImport.groovy:86)
    at org.txm.importer.xtz.ImportModule.process(ImportModule.java:242)
    at org.txm.importer.xtz.ImportModule$process$2.call(Unknown Source)
    at org.txm.scripts.importer.xtz.xtzLoader.run(xtzLoader.groovy:58)
    at groovy.util.GroovyScriptEngine.run(GroovyScriptEngine.java:599)
    at org.txm.groovy.core.GroovyScriptedImportEngine._build(GroovyScriptedImportEngine.java:123)
    at org.txm.core.engines.ScriptedImportEngine.build(ScriptedImportEngine.java:56)
    at org.txm.objects.Project._compute(Project.java:320)
    at org.txm.core.results.TXMResult.compute(TXMResult.java:2224)
    at org.txm.core.results.TXMResult.compute(TXMResult.java:2143)
    at org.txm.rcp.handlers.scripts.ExecuteImportScript$2.run(ExecuteImportScript.java:146)
    at org.eclipse.core.internal.jobs.Worker.run(Worker.java:56)
Solution
Update the ApplyXSL file filters and fix the data source selection in the XML/w and XTZ import modules.
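The kind of file filter mentioned above could look roughly like the following sketch, which keeps only `.xml` files as XSLT inputs and leaves metadata files alone. This is an assumption about the general approach, not the actual TXM fix: the class `XslSourceFilter`, its methods, and the rule "only `.xml` files are transformed" are hypothetical, and the file names in `main` are illustrative.

```java
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

// Hypothetical sketch, not TXM code: selects which source files are passed to XSLT.
public class XslSourceFilter {

    /** Assumed rule: only files with an .xml extension (case-insensitive) are XSLT inputs. */
    public static boolean isXslInput(String fileName) {
        return fileName.toLowerCase(Locale.ROOT).endsWith(".xml");
    }

    /** Keeps the XSLT inputs and drops everything else (metadata.xlsx, metadata.ods, ...). */
    public static List<String> selectXslInputs(List<String> fileNames) {
        return fileNames.stream()
                .filter(XslSourceFilter::isXslInput)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Illustrative source directory listing (requires Java 9+ for List.of).
        List<String> src = List.of("t1.xml", "metadata.xlsx", "metadata.ods", "metadata.csv");
        System.out.println(selectXslInputs(src)); // prints [t1.xml]
    }
}
```

Filtering on an allow-list of extensions, rather than excluding known metadata names, also protects against other non-XML files that may sit in the source directory.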
Validation test
- download attached archive: xslodsmetadata.zip
- import with XML/w and XTZ
- the import does not fail and only one text, "t1", is present
Associated revisions
fix xlsx extension filtering refs #2793
History
#1 Updated by Matthieu Decorde over 3 years ago
- Category set to Import
- Target version changed from TXM 0.8.2 to TXM 0.8.1
#2 Updated by Matthieu Decorde over 3 years ago
- Subject changed from RCP: 0.8.0, XTZ Import, XLSX and ODS metadata files not supported when XSL directory present to 0.8.0, XTZ Import, XLSX and ODS metadata files not supported when XSL directory is present
#3 Updated by Matthieu Decorde over 3 years ago
- % Done changed from 0 to 80
- File xslodsmetadata.zip added
- Description updated (diff)
#4 Updated by Serge Heiden over 2 years ago
- Subject changed from 0.8.0, XTZ Import, XLSX and ODS metadata files not supported when XSL directory is present to XTZ, XML/w and Transcriber Import, XLSX and ODS metadata files not supported when XSL directory is present
- Priority changed from Normal to Urgent
- Target version changed from TXM 0.8.1 to TXM 0.8.2
- % Done changed from 80 to 0
Bug still present in Transcriber import module.
#5 Updated by Matthieu Decorde over 2 years ago
- File xslodsmetadata.zip added
- % Done changed from 0 to 80
update attached test directory