Projects in Awesome Lists tagged with pdf-ocr
A curated list of projects in awesome lists tagged with pdf-ocr.
https://github.com/stirling-tools/stirling-pdf
#1 Locally hosted web application that allows you to perform various operations on PDF files
docker java pdf pdf-converter pdf-editor pdf-manipulation pdf-merger pdf-ocr pdf-tools pdf-web-apps pdfmerger
Last synced: 03 Apr 2026
https://github.com/Stirling-Tools/Stirling-PDF
Locally hosted web application that allows you to perform various operations on PDF files
docker java pdf pdf-converter pdf-manipulation pdf-merger pdf-ocr pdf-tools pdf-web-apps pdfmerger
Last synced: 24 Mar 2025
https://github.com/alam00000/bentopdf
The Privacy First PDF Toolkit
adobe-acrobat docker hacktoberfest javascript jpgtopdf pdf pdf-converter pdf-editor pdf-generation pdf-ocr pdf-tools pdf-viewer pdf-viewer-component pdffiller pdfjs privacy self-hosted self-hosting toolkit typescript
Last synced: 23 May 2026
https://github.com/Haste171/langchain-chatbot
AI chatbot for analyzing and extracting information from data in a conversational format.
ai artificial-intelligence bot chromadb discord discord-bot embeddings extractive-question-answering gpt-3 gpt-4 langchain ocr openai openai-api openai-api-chatbot pdf pdf-chat-bot pdf-ocr pinecone vector-database
Last synced: 24 Mar 2025
https://github.com/jiangnanboy/JiaJiaOCR
Building on the existing general text recognition capabilities, new features such as handwritten OCR, layout detection, and table detection and recognition have been added, covering printed text, handwritten text, and document structure analysis.
handwriting-recognition java-ocr layout ocr pdf-ocr table-ocr
Last synced: 17 Jan 2026
https://github.com/ahnafnafee/local-llm-pdf-ocr
Convert scanned PDFs into searchable text locally using Vision LLMs (olmOCR). 100% private, offline, and free. Features a modern Web UI & CLI.
document-processing fastapi local-llm no-api-key ocr offline-ai olmocr pdf-ocr privacy-focused python searchable-pdf surya-ocr vision-llm web-ui
Last synced: 24 Apr 2026
https://github.com/codelined-ag/extracto
Your private document brain. PDFs in, RAG out. Self-hosted. Plug everywhere.
agents bun claude docker document-processing mcp mcp-server mistral nextjs ocr ollama openrouter pdf-ocr rag self-hosted vector-database vision-models
Last synced: 10 May 2026
https://github.com/azozzalfiras/pdf-ocr
A simple, free tool for extracting text from scanned PDFs and images using OCR, and converting images to PDFs. It processes files locally in the browser, ensuring privacy and security while enabling users to effortlessly convert documents and images into editable text or PDF format.
azozzalfiras image-ocr ocr pdf-ocr pdf2text pdf2txt
Last synced: 09 Apr 2025
https://github.com/bbc-esq/fast-pyocr
Simple and reliable script to conduct high-quality fast OCR on a PDF
ocr ocr-python pdf pdf-ocr pdf-ocr-extraction tesseract-ocr tesseract-ocr-engine windows-ocr
Last synced: 20 Jan 2026
https://github.com/shadwoods2942/pdf-merger
A Python utility for merging multiple PDFs and images into a single PDF file. This tool maintains aspect ratios, centers content on custom-sized pages (default A4), and supports recursive directory processing. Perfect for organizing documents and creating cohesive PDF compilations.
javascript merger pdf pdf-editor pdf-generation pdf-manipulation pdf-ocr pdf-tools pdf-web-apps pdfmerger ppt-merger ppts ppts-to-pdf-merger pypdf
Last synced: 24 Jul 2025
https://github.com/am009/llm-online-tool
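LLM PDF OCR tool and Markdown/LaTeX article translation tool. Supports paragraph-by-paragraph translation and direct proofreading. Supports mathematical formulas. Based on large language model (LLM) APIs.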
Last synced: 26 Sep 2025