Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/WZBSocialScienceCenter/pdftabextract
A set of tools for extracting tables from PDF files helping to do data mining on (OCR-processed) scanned documents.
data-mining image-processing ocr pdf python tables
- Host: GitHub
- URL: https://github.com/WZBSocialScienceCenter/pdftabextract
- Owner: WZBSocialScienceCenter
- License: apache-2.0
- Created: 2016-07-08T11:44:46.000Z (over 8 years ago)
- Default Branch: master
- Last Pushed: 2022-06-24T09:51:22.000Z (over 2 years ago)
- Last Synced: 2024-10-29T15:34:20.945Z (5 days ago)
- Topics: data-mining, image-processing, ocr, pdf, python, tables
- Language: Python
- Homepage: https://datascience.blog.wzb.eu/2017/02/16/data-mining-ocr-pdfs-using-pdftabextract-to-liberate-tabular-data-from-scanned-documents/
- Size: 138 MB
- Stars: 2,216
- Watchers: 84
- Forks: 371
- Open Issues: 5
Metadata Files:
- Readme: README.md
- Changelog: CHANGES.md
- License: LICENSE
Awesome Lists containing this project
- awesome-pdf - pdftabextract
- awesome-list - pdftabextract - A set of tools for extracting tables from PDF files helping to do data mining on (OCR-processed) scanned documents. (Computer Vision / OCR)
- starred-awesome - pdftabextract - A set of tools for extracting tables from PDF files helping to do data mining on (OCR-processed) scanned documents. (Python)
- awesome-python-machine-learning-resources - GitHub - 14% open · ⏱️ 24.06.2022 (Optical Character Recognition, OCR)