https://github.com/statcan/slicemypdf
This project uses the SLICE algorithm to extract information from a text-based PDF page containing financial statements (tabular data). It can also be used to extract regular tables, but the output will then contain all of the text on the page, not only the table.
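To give a feel for what extracting tabular data from a text-based page involves, here is a minimal sketch. It is not the SLICE algorithm and not this project's API: it is a simplified, gap-based stand-in, and every name in it (`Word`, `group_rows`, `split_cells`, `page_to_table`, the `y_tol` and `gap` thresholds) is hypothetical. It assumes the words on a page are already available with their coordinates.

```python
from typing import List, NamedTuple


class Word(NamedTuple):
    """A word with its position on the page (hypothetical input format)."""
    text: str
    x0: float  # left edge
    x1: float  # right edge
    y: float   # vertical position of the text line


def group_rows(words: List[Word], y_tol: float = 2.0) -> List[List[Word]]:
    """Group words whose vertical positions are within y_tol into rows."""
    rows: List[List[Word]] = []
    for w in sorted(words, key=lambda w: (w.y, w.x0)):
        if rows and abs(rows[-1][0].y - w.y) <= y_tol:
            rows[-1].append(w)
        else:
            rows.append([w])
    # Order each row left to right.
    return [sorted(r, key=lambda w: w.x0) for r in rows]


def split_cells(row: List[Word], gap: float = 10.0) -> List[str]:
    """Start a new cell wherever the horizontal gap between words exceeds gap."""
    cells = [[row[0]]]
    for prev, cur in zip(row, row[1:]):
        if cur.x0 - prev.x1 > gap:
            cells.append([cur])
        else:
            cells[-1].append(cur)
    return [" ".join(w.text for w in c) for c in cells]


def page_to_table(words: List[Word], y_tol: float = 2.0, gap: float = 10.0) -> List[List[str]]:
    """Turn positioned words into a list of rows, each a list of cell strings."""
    return [split_cells(r, gap) for r in group_rows(words, y_tol)]


# Made-up sample data for illustration only.
words = [
    Word("Total", 10, 40, 100), Word("revenue", 44, 90, 100), Word("1,200", 200, 230, 100),
    Word("Net", 10, 30, 115), Word("income", 34, 70, 115.5), Word("300", 210, 230, 115),
]
table = page_to_table(words)
print(table)
```

Because this sketch assigns every word on the page to some row and cell, it also illustrates why output for a regular table would include all of the page's text, not only the table itself.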
- Host: GitHub
- URL: https://github.com/statcan/slicemypdf
- Owner: StatCan
- License: other
- Created: 2021-08-11T13:52:06.000Z (over 4 years ago)
- Default Branch: main
- Last Pushed: 2021-08-11T14:11:42.000Z (over 4 years ago)
- Last Synced: 2023-03-02T22:23:16.370Z (about 3 years ago)
- Language: Jupyter Notebook
- Homepage:
- Size: 714 KB
- Stars: 21
- Watchers: 4
- Forks: 7
- Open Issues: 3