https://github.com/taxihabbel/parsemypdf
A collection of PDF parsing libraries, including AI-based docling, claude, openai, llama-vision, and unstructured-io, as well as pdfminer, pymupdf, pdfplumber, and others, for efficient snapshot, text, table, and metadata extraction.
- Host: GitHub
- URL: https://github.com/taxihabbel/parsemypdf
- Owner: taxihabbel
- License: MIT
- Fork: true (genieincodebottle/parsemypdf)
- Created: 2025-02-18T21:30:16.000Z (4 months ago)
- Default Branch: main
- Last Pushed: 2024-12-18T05:14:54.000Z (6 months ago)
- Last Synced: 2025-04-30T03:24:13.810Z (about 2 months ago)
- Homepage: https://www.instagram.com/genieincodebottle/
- Size: 1.94 MB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Awesome Lists containing this project
- mcp-index - Complex PDF Parsing Toolkit - Extract and analyze complex PDF documents using various tools to maintain document structure and efficiently extract tables, images, and mixed content. Specialized processors are available tailored to the complexity and content type of the PDFs. (File Management)