Feature #1508
Updated by Matthieu Decorde about 10 years ago
Currently, the pager of the XML/w+CSV import module does not transfer any structures to the HTML pages, although they are present in the WTC source (that is, those structures are available for CQL queries).
It may be useful to make some HTML source elements available in corpus editions, for example web links (<a> elements), which the current edition HTML renderer could interpret. Other elements could be considered: <p>, <i>... (most HTML elements).
h3. Discussion
The default XML/w+CSV import module should not interpret the semantics of any XML element, but it already interprets TEI element semantics for various purposes. Just as we have to develop a TEI-semantics-aware XTZ importer, we could define an (X)HTML importer to implement HTML-specific semantics.
h3. Solution
Produce an '<a href="url" target="output">url</a>' element in the HTML pages when an <a> element is present in the source. By default, the 'output' frame could receive every link that is followed.
MD: OK, but the content of the <a> elements is tokenized.
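The proposed solution, together with MD's remark about tokenization, could be sketched as follows. This is only an illustration in Python, not the actual pager code: the function name `render_link`, the `<w>` token element with an `id` attribute, and the `class="w"` span are all assumptions made for the example. It shows one possible way to emit the link with `target="output"` while keeping the tokenized words inside it.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def render_link(a_elem, target="output"):
    """Render a source <a> element as an HTML link (hypothetical helper).

    The tokenized content is kept: each <w> child becomes a <span>,
    so the words stay individually addressable inside the link.
    """
    href = a_elem.get("href", "")
    # Assumption: tokens are <w> elements carrying an "id" attribute.
    words = [
        '<span class="w" id={}>{}</span>'.format(
            quoteattr(w.get("id", "")), escape(w.text or "")
        )
        for w in a_elem.iter("w")
    ]
    return "<a href={} target={}>{}</a>".format(
        quoteattr(href), quoteattr(target), " ".join(words)
    )


# Hypothetical tokenized source fragment.
source = '<a href="http://example.org"><w id="w_1">example</w><w id="w_2">site</w></a>'
html = render_link(ET.fromstring(source))
print(html)
```

With this approach the link text is rebuilt from the tokens rather than copied verbatim, which is one way to reconcile the 'url' link text of the proposal with a tokenized source.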