Feature #1983
Updated by Serge Heiden more than 8 years ago
TXM currently provides no import module for the PDF format, for the reasons explained "here":https://groupes.renater.fr/wiki/txm-users/public/faq#peut_on_importer_des_documents_au_format_pdf_dans_txm.
We must help users import sources in PDF format because:
* the format is very widely used
* even though the PDF format is not easy to process, some PDF representations work well
* some libraries do a decent job with some PDF representations
h3. Solution
* use the PDF Java library used by "GROBID":https://github.com/kermitt2/grobid (a tool used by many document management platforms)
* document the fact that PDF import will not be perfect