https://github.com/ahlusar1989/pdf-table-extractor_saran_version1
- Host: GitHub
- URL: https://github.com/ahlusar1989/pdf-table-extractor_saran_version1
- Owner: ahlusar1989
- License: mit
- Created: 2015-05-26T15:56:51.000Z (over 9 years ago)
- Default Branch: master
- Last Pushed: 2015-05-26T18:21:46.000Z (over 9 years ago)
- Last Synced: 2024-10-05T23:41:19.366Z (about 1 month ago)
- Language: HTML
- Size: 4.38 MB
- Stars: 0
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
*PDF Table Extraction Utility.* Analyses a page in a PDF looking
for well-delineated table cells, and extracts the text in each cell.
Outputs include JSON, XML, and CSV lists of cell locations, shapes,
and contents, and CSV and HTML versions of the tables. This utility
is intended to be the first step in automatically processing data
in tables from a PDF file, and was originally designed to read the
tables in ST Micro’s datasheets. The script requires numpy and poppler
(pdftoppm and pdftotext).

### License
[MIT Expat](http://ashimagroup.net/os/license/mit-expat)

### Tags
[Utilities](http://ashimagroup.net/os/tag/utilities)
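
### Example: locating ruled cells (illustrative sketch)

The README describes the utility as looking for well-delineated table cells on a page. The sketch below shows one simple way such a search could work on a rendered page bitmap. It is not taken from this repository's script: the function names (`find_rulings`, `find_cells`, `make_grid`), the 0.9 darkness threshold, and the use of plain Python lists in place of numpy arrays are all assumptions made for illustration. It also assumes a fully ruled grid whose lines span the whole bitmap, so merged cells and partial rulings are not handled.

```python
from typing import List, Tuple

Bitmap = List[List[int]]  # 1 = dark pixel, 0 = white pixel


def find_rulings(darkness: List[float], threshold: float = 0.9) -> List[int]:
    """Return indices whose dark-pixel fraction meets the threshold.

    Only the first index of each consecutive run is kept, so a thick
    line is reported once. The threshold is an assumed value.
    """
    rulings = []
    previous_was_line = False
    for index, fraction in enumerate(darkness):
        is_line = fraction >= threshold
        if is_line and not previous_was_line:
            rulings.append(index)
        previous_was_line = is_line
    return rulings


def find_cells(bitmap: Bitmap) -> List[Tuple[int, int, int, int]]:
    """Return (top, left, bottom, right) for each cell between rulings."""
    height, width = len(bitmap), len(bitmap[0])
    # Fraction of dark pixels in every row and every column.
    row_darkness = [sum(row) / width for row in bitmap]
    col_darkness = [sum(row[x] for row in bitmap) / height for x in range(width)]
    h_lines = find_rulings(row_darkness)
    v_lines = find_rulings(col_darkness)
    cells = []
    # Each pair of adjacent horizontal and vertical rulings bounds one cell.
    for top, bottom in zip(h_lines, h_lines[1:]):
        for left, right in zip(v_lines, v_lines[1:]):
            cells.append((top, left, bottom, right))
    return cells


def make_grid(rows: int, cols: int, cell_size: int) -> Bitmap:
    """Build a synthetic bitmap of a fully ruled rows x cols table."""
    height = rows * cell_size + 1
    width = cols * cell_size + 1
    return [
        [1 if (y % cell_size == 0 or x % cell_size == 0) else 0 for x in range(width)]
        for y in range(height)
    ]


# Synthetic stand-in for a page rendered to a bitmap.
page = make_grid(rows=2, cols=3, cell_size=4)
cells = find_cells(page)
print(len(cells))  # 6
print(cells[0])    # (0, 0, 4, 4)
```

In a real pipeline the bitmap would come from rendering the PDF page (the README names `pdftoppm` for this), and each cell rectangle would then be mapped back to page coordinates so its text can be pulled out, for example with `pdftotext`.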