Feature #1641

RCP: X.X, Synoptic import module

Added by Serge Heiden over 3 years ago. Updated 5 months ago.

Status: New
Start date: 01/21/2016
Priority: Normal
Due date:
Assignee: -
% Done: 0%
Category: Import
Spent time: -
Target version: TXM X.X

Description

A lot of projects need to compare raw OCR results or transcriptions with original page images.

The XTZ import module makes it much easier to build synoptic editions, but the many features it provides also make it rather complex.

The idea is to help people build and use synoptic editions in TXM, to let them use further tools such as concordances to explore OCR errors efficiently, and possibly to lead them to discover more TXM tools in the end.

Solution

Build a simplified, reduced import module UI (based on the XTZ import module) that takes only a source directory as input and builds a corpus with synoptic editions.

The required source directory structure, with XML-TEI files and image files or URLs, is documented extensively so that the sources can be prepared correctly before calling the import module.

The import module (XTZ) should provide progressive, systematic, extensive and comprehensive diagnostic messages while importing the corpus to help debug the corpus sources.
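To illustrate the kind of diagnostic messages meant here, the following is a minimal sketch of a pre-import check, not the XTZ implementation. It assumes a hypothetical layout (XML files at the top level of the source directory, image files anywhere beneath it); the function name `diagnose_sources`, the message wording and the list of image extensions are all invented for illustration.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

# Assumption: image extensions accepted for page images (hypothetical list).
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".tif", ".tiff"}


def diagnose_sources(source_dir):
    """Return a list of (level, message) diagnostics for a source directory.

    Hypothetical layout: XML files at the top level of source_dir and image
    files anywhere beneath it. This is not the documented XTZ layout.
    """
    root = Path(source_dir)
    if not root.is_dir():
        return [("ERROR", f"{root}: not a directory")]

    diagnostics = []
    xml_files = sorted(root.glob("*.xml"))
    if not xml_files:
        diagnostics.append(("ERROR", f"{root}: no XML files found"))

    # Check each XML file for well-formedness only (no TEI schema validation).
    for xml_file in xml_files:
        try:
            ET.parse(str(xml_file))
        except ET.ParseError as err:
            diagnostics.append(
                ("ERROR", f"{xml_file.name}: not well-formed XML ({err})")
            )
        else:
            diagnostics.append(("INFO", f"{xml_file.name}: well-formed"))

    # Count local image files; images given as URLs are not checked here.
    images = [
        p for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    ]
    if images:
        diagnostics.append(("INFO", f"{len(images)} image file(s) found"))
    else:
        diagnostics.append(
            ("WARNING", "no local image files found; images may be referenced by URL")
        )
    return diagnostics
```

Calling `diagnose_sources("path/to/sources")` and printing each `(level, message)` pair in order would give a user one line per source file, so that a malformed file can be located before the corpus build starts.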

Such a corpus could be opened in a new simplified, reduced Edition Perspective.

History

#1 Updated by Sebastien Jacquot about 1 year ago

  • Target version changed from TXM 0.8.0a (split/restructuration) to TXM 0.8.0

#2 Updated by Matthieu Decorde 5 months ago

  • Target version changed from TXM 0.8.0 to TXM X.X
