https://github.com/parthapray/pdf_text_extraction_json_section_subsection
This repo contains code for extracting PDF text to JSON, capturing each section's number, title, and body content, along with footnotes.
- Host: GitHub
- URL: https://github.com/parthapray/pdf_text_extraction_json_section_subsection
- Owner: ParthaPRay
- License: other
- Created: 2024-12-23T12:25:48.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-12-23T12:46:48.000Z (over 1 year ago)
- Last Synced: 2025-01-09T23:54:47.739Z (over 1 year ago)
- Topics: article-extractor, document, extraction, json, pdf, pymupdf-fitz, regex, text
- Language: Python
- Homepage:
- Size: 2.01 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
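
The description and topics (regex, json) suggest a pipeline that splits extracted text into numbered sections and serializes them. The following is a minimal sketch of that idea, not the repository's actual code: the heading and footnote patterns, the `split_sections` helper, and the output field names are all assumptions made for illustration. It works on a plain string so it runs without any PDF library; in practice the text would probably come from PyMuPDF, for example by joining `page.get_text()` over the pages of an opened document.

```python
import json
import re

# Assumed heading shape: a dotted section number, then a capitalized title,
# e.g. "2.1 Data Collection". Real documents may need a different pattern.
HEADING_RE = re.compile(r"^(\d+(?:\.\d+)*)\.?\s+([A-Z].*)$")
# Assumed footnote shape: a line starting with a bracketed number, e.g. "[1] ...".
FOOTNOTE_RE = re.compile(r"^\[(\d+)\]\s+(.*)$")


def split_sections(text):
    """Group lines of extracted text into sections keyed by heading (hypothetical helper)."""
    sections = []
    current = None
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        heading = HEADING_RE.match(line)
        footnote = FOOTNOTE_RE.match(line)
        if heading:
            # A new heading starts a new section record.
            current = {
                "section_number": heading.group(1),
                "section_title": heading.group(2),
                "body": "",
                "footnotes": [],
            }
            sections.append(current)
        elif current is None:
            # Text before the first heading is ignored in this sketch.
            continue
        elif footnote:
            current["footnotes"].append(footnote.group(2))
        else:
            # Any other line is appended to the current section's body.
            current["body"] = (current["body"] + " " + line).strip()
    return sections


SAMPLE = """1 Introduction
PDF text often arrives as flat lines.
[1] A sample footnote.
1.1 Background
Headings carry dotted numbers.
"""

sections = split_sections(SAMPLE)
print(json.dumps(sections, indent=2))
```

A line-by-line regex pass like this is simple but fragile: a body line that happens to start with a number and a capital letter would be mistaken for a heading, so real PDFs usually need extra cues such as font size or position.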