
https://github.com/zevio/pcu_pdf


# pcu_pdf (Apache Tika parser for PCU project)

PDF parser component (Apache Tika) for the PCU project.
Given the path to a PDF file, it returns the file's textual content.

Based on [Apache Tika][tika].

![pdf](https://framapic.org/3KUuLTR6t4ot/ZK3b8GArxwxC.png)

----

[Check out the PCU project][pcu].

[tika]: https://tika.apache.org
[pcu]: https://github.com/zevio/pcu_core

## Usage in another project

To import this module in another Python project, install it first:

`pip install pcu-pdf`

Then add this import line at the top of your Python file:

`from pcu_pdf import pcu_pdf`

You can now use pcu_pdf's functions, for example:

`pcu_pdf.PDFParser("path/to/pdf/file")`
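A slightly fuller sketch of how the call might be wrapped in your own code. It assumes that `pcu_pdf.PDFParser` returns the extracted text when given a path, which is what the description above suggests but is not confirmed here. The helper name `extract_text` and its `parser` parameter are hypothetical and not part of pcu_pdf; the parameter only exists so the helper can be exercised without a Tika installation.

```python
from pathlib import Path


def extract_text(pdf_path, parser=None):
    """Check the path, then hand it to a PDF parser and return the result.

    `parser` is a hypothetical hook: any callable that takes a path string.
    When omitted, pcu_pdf.PDFParser is used (assumed to return the text).
    """
    path = Path(pdf_path)
    if path.suffix.lower() != ".pdf":
        raise ValueError(f"not a PDF file: {path}")
    if not path.is_file():
        raise FileNotFoundError(path)

    if parser is None:
        # Imported lazily so the check above works even without pcu-pdf installed.
        from pcu_pdf import pcu_pdf
        parser = pcu_pdf.PDFParser

    return parser(str(path))
```

With pcu-pdf installed, `extract_text("path/to/pdf/file")` would then validate the path before delegating to `pcu_pdf.PDFParser`.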

## Test

To test your installation, go to the `pcu_pdf/` directory and run the Makefile's test target:

`make test`