Feature #1983

Updated by Serge Heiden about 3 years ago

TXM currently does not provide an import module for the PDF format, for the reasons explained "here":https://groupes.renater.fr/wiki/txm-users/public/faq#peut_on_importer_des_documents_au_format_pdf_dans_txm.

We must help users import sources in PDF format because:
* the format is very widely used
* even though the PDF format is not easy to process, some PDF representations can be handled well
* some libraries do a decent job with some PDF representations

h3. Solution

* use the PDF Java library used by "GROBID":https://github.com/kermitt2/grobid (a tool used by many document management platforms)
* document the fact that PDF import will not be perfect