Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Projects in Awesome Lists tagged with pdf-document-processor
A curated list of projects in awesome lists tagged with pdf-document-processor.
https://github.com/wmjordan/PDFPatcher
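PDFPatcher (PDF补丁丁): a PDF toolbox that can edit bookmarks, crop and rotate pages, remove restrictions, extract or merge documents, inspect document structure, extract images, convert pages to images, and more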
pdf pdf-converter pdf-document-processor pdf-generation
Last synced: 31 Jul 2024
https://github.com/pdf2htmlEX/pdf2htmlEX
Convert PDF to HTML without losing text or format.
html pdf pdf-document-processor pdf-viewer
Last synced: 31 Jul 2024
https://github.com/qpdf/qpdf
QPDF: A content-preserving PDF document transformer
Last synced: 30 Sep 2024
https://github.com/run-llama/llama_parse
Parse files for optimal RAG
document parsing pdf pdf-document-processor ppt pptx structured-data
Last synced: 01 Oct 2024
https://github.com/unidoc/unipdf
Golang PDF library for creating and processing PDF files (pure go)
golang pdf pdf-compression pdf-document-processor pdf-generation pdf-generator pdf-library pdf-manipulation pdf-reader pdf-reports pdf-sign signing text-extraction
Last synced: 30 Sep 2024
https://github.com/uglytoad/pdfpig
Read and extract text and other content from PDFs in C# (port of PDFBox)
alto-xml csharp document-analysis hocr layout-analysis netstandard page-xml pdf pdf-document pdf-document-processor pdf-extractor pdf-files pdf-generation pdfbox
Last synced: 01 Oct 2024
https://github.com/UglyToad/PdfPig
Read and extract text and other content from PDFs in C# (port of PDFBox)
alto-xml csharp document-analysis hocr layout-analysis netstandard page-xml pdf pdf-document pdf-document-processor pdf-extractor pdf-files pdf-generation pdfbox
Last synced: 31 Jul 2024
https://github.com/sailist/chatgpt-enhancement-extension
An all-in-one plugin to improve your ChatGPT experience!
chatgpt chatgpt-chrome-extension markdown pdf-document-processor
Last synced: 31 Jul 2024
https://github.com/abarker/pdfCropMargins
pdfCropMargins -- a program to crop the margins of PDF files
crop cropper pdf pdf-converter pdf-document-processor python
Last synced: 31 Jul 2024
https://github.com/hellerbarde/stapler
A small utility making use of the pypdf library to provide a (somewhat) lighter alternative to pdftk
pdf pdf-converter pdf-document-processor python
Last synced: 15 Aug 2024
https://github.com/GURPREETKAURJETHRA/Multi-PDFs_ChatApp_AI-Agent
Meet MultiPDF 📚 Chat AI App! 🚀 Chat seamlessly with Multiple PDFs using Langchain, Google Gemini Pro & FAISS Vector DB with Seamless Streamlit Deployment. Get instant, accurate responses from Awesome Google Gemini OpenSource language Model. 📚💬 Transform your PDF experience now! 🔥✨
chat-application chatbot-application chatgpt gemini gemini-api gemini-pro generative-ai google instructor-embeddings langchain langchain-python large-language-models llm open-source openai pdf-document-processor python-3 streamlit-application
Last synced: 01 Aug 2024
https://github.com/ksharindam/pdfcook
Prepress preparing tool and PDF editor
pdf pdf-document-processor pdf-editor prepress
Last synced: 01 Aug 2024
https://github.com/JustinTheWhale/PDF-Dark-Mode
Converts PDFs to have a grey background to be easier on the eyes
converter linux macos numba pdf pdf-converter pdf-document pdf-document-processor pillow poppler python python3 python310 python36 python37 python38 python39 windows
Last synced: 04 Aug 2024
https://github.com/rvbcldud/focus-study
A collection of FOCUS Bible studies in booklet format.
bible-study catholic pdf-document-processor
Last synced: 02 Aug 2024
https://github.com/trogon/otus-pdf
Object oriented PDF document generation library (for PHP).
composer-package library pdf-document pdf-document-processor pdf-generation php php54 php55 php56 php7 php71 php72 php73
Last synced: 28 Sep 2024
https://github.com/ctadeodev/spark-word-counter
A Dockerized PySpark application for counting word frequencies in an input PDF document
docker pdf-document-processor pyspark python spark
Last synced: 28 Sep 2024
https://github.com/vin0x/pdf-to-vehicle-data-etl
This project extracts data from a website (.pdf file) containing car data, manipulates the data, stores it in AWS RDS, creates a pipeline with Apache Airflow to refresh it automatically, and creates a Power BI dashboard.
database-schema etl jupyter manipulate-data pdf-document-processor
Last synced: 27 Sep 2024
https://github.com/akari2600/pdf_analyzer_poc
PDF layout analysis with OpenCV
kivy opencv pdf-document-processor
Last synced: 25 Sep 2024