An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with ocr-python

A curated list of projects in awesome lists tagged with ocr-python.

https://github.com/hiroi-sora/umi-ocr

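OCR software, free and offline. Open-source, free offline OCR software. Supports screenshot capture and batch image import, PDF document recognition, exclusion of watermarks and headers/footers, and scanning/generating QR codes. Includes built-in libraries for multiple languages.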

ocr ocr-python paddleocr

Last synced: 05 Apr 2025

https://github.com/hiroi-sora/Umi-OCR

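OCR software, free and offline. Open-source, free offline OCR software. Supports screenshot capture and batch image import, PDF document recognition, exclusion of watermarks and headers/footers, and scanning/generating QR codes. Includes built-in libraries for multiple languages.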

ocr ocr-python paddleocr

Last synced: 24 Mar 2025

https://github.com/breezedeus/CnOCR

CnOCR: Awesome Chinese/English OCR Python toolkits based on PyTorch. It comes with 20+ well-trained models for different application scenarios and can be used directly after installation. [A Chinese/English OCR Python package based on PyTorch/MXNet.]

chinese-character-recognition english-character-recognition ocr ocr-python pytorch

Last synced: 04 Apr 2025

https://github.com/breezedeus/cnocr

CnOCR: Awesome Chinese/English OCR Python toolkits based on PyTorch. It comes with 20+ well-trained models for different application scenarios and can be used directly after installation. [A Chinese/English OCR Python package based on PyTorch/MXNet.]

chinese-character-recognition english-character-recognition ocr ocr-python pytorch

Last synced: 06 Oct 2025

https://github.com/catchthetornado/text-extract-api

Document (PDF, Word, PPTX ...) extraction and parsing API using state-of-the-art OCRs and Ollama-supported models. Anonymize documents. Remove PII. Convert any document or picture to structured JSON or Markdown.

anonymization api extract json llm ocr ocr-python pdf pii

Last synced: 14 May 2025

https://github.com/hiroi-sora/umi-ocr_v2

An end and a new beginning

ocr ocr-python paddleocr qml qt

Last synced: 12 Apr 2025

https://github.com/hiroi-sora/Umi-OCR_v2

An end and a new beginning

ocr ocr-python paddleocr qml qt

Last synced: 24 Mar 2025

https://github.com/blueaxis/Cloe

Manga OCR snipping application for desktop

manga-ocr ocr ocr-python pyqt5 snipping-tool

Last synced: 14 Mar 2025

https://github.com/shibing624/imgocr

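Python3 package for Chinese/English OCR, with paddleocr-v4 onnx model (~14MB). Inference is based on the ppocr-v4-onnx model, enabling accurate millisecond-level OCR prediction on CPU; Chinese/English OCR in general scenarios reaches open-source SOTA.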

chinese-ocr ocr ocr-python

Last synced: 22 Apr 2025

https://github.com/pspdfkit/nutrient-dws-client-python

Official Python client library for Nutrient Document Web Services API - PDF processing, OCR, watermarking, and document manipulation with automatic Office format conversion

ocr-python pdf-converter pdf-document-processor pdf-generation pdf-processing python

Last synced: 05 Sep 2025

https://github.com/bentoml/BentoOCR

Turn any OCR models into online inference API endpoint 🚀 🌖

ai-applications model-deployment model-serving ocr ocr-python

Last synced: 04 May 2025

https://github.com/fardinhash/easyocr-based-automatic-bangla-license-plate-recognition

EasyOCR is an optical character recognition package built on PyTorch. It makes it easy to extract text from images and to scan documents. Because it extracts text, it is readily applicable to license plate number recognition. License plates around the world use characters from many languages, and Bangla is one of them. Here EasyOCR is applied to Bangla character-based license plate recognition.

bangla-character bangla-license-plates bangla-number bangla-ocr character-recognition license-checking license-number license-plate-detection license-plate-recognition machine-learning number-plate-detection number-plate-recognition ocr ocr-bangla ocr-python ocr-recognition python

Last synced: 11 Apr 2025

https://github.com/hermann-web/python-ocr

Converts an invoice PDF to an image and the image to text, then extracts invoice information such as the invoice number or vendor name from the text.

image-to-text invoice-number invoice-pdf ocr ocr-python ocr-recognition ocr-text-reader pdf pdf-to-image python tesseract

Last synced: 21 Jun 2025

https://github.com/pk5ls20/easypaddleocr

A simple package for PaddleOCR on CPU and GPU using PyTorch

ocr ocr-python paddleocr paddlepaddle pytorch

Last synced: 19 Apr 2025

https://github.com/moheladwy/ocr4linux

OCR script tool for extracting text from screenshots (images) using only Bash and Python scripts

bash bash-script ocr ocr-python ocr-text-reader python tesseract tesseract-engine tesseract-ocr tesseract-python

Last synced: 27 Oct 2025

https://github.com/rubenszimbres/repo-2020

Machine Learning, Google Cloud and Quantitative Algorithms for Stocks Trading

google googlecloudplatform ocr-python pascal

Last synced: 10 Apr 2025

https://github.com/jdhao/anti-ocr

A tool to generate text images that are hard for OCR engines to recognize and understand.

anti-ocr ocr-python ocr-recognition

Last synced: 29 Oct 2025

https://github.com/HamidRezaAttar/Persian-OCR-Streamlit

Persian OCR allows users to scan documents and extract text from scanned images.

ocr ocr-python ocr-recognition persian persian-ocr

Last synced: 09 Jul 2025

https://github.com/minnukota381/flask-ocr-app

A web application that allows users to upload an image and convert it to text using Optical Character Recognition (OCR) technology. This application supports user authentication and provides a user-friendly interface for image uploads and text extraction.

css3 easyocr flask gunicorn html5 javascript jinja2 ninja numpy ocr-python pillow-library scipy sqlite3 torch torchvision werkzeug

Last synced: 04 Aug 2025

https://github.com/junalmeida/homeassistant-addons

Home Assistant add-ons, home to the Utility Meter Parser MQTT add-on.

camera home-assistant meters ocr-python opencv-python

Last synced: 12 Oct 2025

https://github.com/kartikmehta8/basic-surya-ocr

Basic Implementation of Surya OCR [EN]

ocr-python python surya

Last synced: 13 Apr 2025

https://github.com/stabrise/scaledp-tutorials

Tutorials for ScaleDP library. ScaleDP is an Open-Source Library for Processing Documents in Apache Spark.

ner nlp ocr ocr-python pdf spark

Last synced: 12 Oct 2025

https://github.com/rover0811/drugdiary

A medication management calendar app powered by AI

expo ocr-python react-native

Last synced: 06 Jul 2025

https://github.com/zenoverflow/eyeofbabel

Privacy-first self-hosted translation app with basic OCR capabilities.

desktop-app ocr ocr-python privacy python python3 python311 self-hosted server-app translator webui

Last synced: 22 Apr 2025

https://github.com/syaagalib/project_ocr

This code performs OCR.

ocr-python python streamlit ui

Last synced: 25 Jun 2025

https://github.com/techycsr/advaitelegrambot

Telegram Advanced AI ChatBot: GPT-4o and GPT-4o-mini, DALL-E model, OCR and Google Voice2Text.

gpt-4o gpt-4o-mini machine-learning ocr-python openai pyrogram python3 telegram-bot tgbot

Last synced: 12 May 2025

https://github.com/arbazkhan4712/image-to-text

Fetches the text from an image and saves it to a text file.

ocr ocr-python ocr-recognition ocr-text-reader python tesseract

Last synced: 10 Apr 2025

https://github.com/annndruha/ocr-munji

Text detection of printed text in Munji language.

lingustics ocr ocr-python ocr-recognition

Last synced: 21 Jul 2025

https://github.com/khaouitiabdelhakim/arabicocr-python-tutorial

This project uses the ArabicOcr package to convert Arabic text in images to editable text using OCR techniques.

ocr ocr-python ocr-recognition ocr-text-reader pip python script

Last synced: 25 Apr 2025

https://github.com/udaylunawat/music-playlist-ocr-python

Automating PNG Screenshot parsing to get Artist and Music Title.

image-processing ocr ocr-python pytesseract python tesseract-ocr

Last synced: 11 Oct 2025

https://github.com/kimaruthagna/digital-vision

Computer Vision Basics to advanced. My journey through this subfield of AI

computer-vision imageai model-weights object-detection ocr-python opencv-python optical-flow

Last synced: 18 Mar 2025

https://github.com/dashroshan/data-extractor

Extract and download key-value pairs, tables, and paragraphs from your scanned pdf, jpg, and png documents as CSV files.

document-extraction form-analysis key-value-pairs ocr-python table-extraction

Last synced: 06 Apr 2025

https://github.com/zreechxnn/servo-controller-ocr

Servo Controller-OCR integrates computer vision, OCR, and Arduino to control a servo motor based on text detection from a webcam. It uses Python for real-time image processing and Tesseract OCR for text recognition, combined with Arduino to handle servo motor operations. Ideal for automation projects requiring text-based triggers.

arduino-uno iot-device ocr-python ocr-text-reader opencv opencv-python servo-controller

Last synced: 14 Oct 2025

https://github.com/macktireh/easycardbackend

API for managing credit cards and extracting credit card numbers from an image of a credit card using artificial intelligence.

ai api clean-code easycard flask flask-restx injection-dependency ocr-python python

Last synced: 15 Apr 2025

https://github.com/altayyuzeir/pdf2docx-paddleocr-ui

📚 Pdf to Docx Converter with PaddleX & PaddleOCR UI

ocr ocr-python ocr-text-reader paddleocr paddlepaddle paddlex paddlex-gui streamlit

Last synced: 21 Mar 2025

https://github.com/bbc-esq/fast-pyocr

Simple and reliable script to conduct high-quality fast OCR on a PDF

ocr ocr-python pdf pdf-ocr pdf-ocr-extraction tesseract-ocr tesseract-ocr-engine windows-ocr

Last synced: 01 May 2025

https://github.com/sevagh/scriptorium

OCR reading assistant with opencv, Tesseract, kraken, DAWGs and a splay tree

dafsa dawg dictionary ocr ocr-python python-multiprocessing splay-tree tesseract

Last synced: 09 Apr 2025

https://github.com/meph1sto666/spark

Arknights OCR tool to automatically create a detailed list of your operators.

arknights arknights-helper ocr ocr-python ocr-recognition skeleton sveltekit

Last synced: 12 Oct 2025

https://github.com/mixpeek/top-ocr-libraries

Most popular open source OCR libraries listed by accuracy and speed

ocr-python opencv tesseract-ocr tika

Last synced: 27 Jun 2025

https://github.com/kfur/fineocr

Free OCR that uses the FineReader (previously FineScanner) Mobile API, due to hardened password

abbyy asyncio ocr ocr-pdf ocr-python python3

Last synced: 22 Mar 2025

https://github.com/boshyxd/resumeocr

Python tool that converts multiple resume images to searchable text files using OCR technology

conversion ocr ocr-python ocr-recognition python resume scraper

Last synced: 22 Feb 2025

https://github.com/hermann-web/pix2tex

This Python script employs Streamlit and LatexOCR to extract LaTeX formulas from uploaded images (JPEG, JPG, PNG). It displays the uploaded image and the extracted LaTeX formula, demonstrating a basic example of LaTeX to Markdown conversion.

image-processing latex latex-ocr ocr ocr-python pix2text python streamlit

Last synced: 23 Feb 2025

https://github.com/Jayakrishnan-mk/User-Doc-Management-NEST-JS

User-Document-Management System - Modular NestJS Backend. This is a production-ready NestJS backend application for user and document management.

axios bcryptjs jest jwt multer nestjs nodejs ocr-python passport-jwt pdf-parse postgresql typeorm typescript virustotal

Last synced: 30 Jun 2025

https://github.com/lucs1590/study_ocr

This is a repository to test with tesseract OCR and to improve other technical images.

ocr ocr-python ocr-recognition python3 tesseract-ocr

Last synced: 18 Aug 2025

https://github.com/jwinman91/ai-ocr

An AI-powered, but model-agnostic (Optical-Character-Recognition) OCR tool

genai image-to-plot-generation image-to-text-generation llama-cpp ocr-python ocr-recognition python3

Last synced: 13 Mar 2025

https://github.com/anant2003jain/textextractify

TextExtractify is an AI-powered tool that extracts text from images and PDFs using both Azure OCR and EasyOCR. It offers features like multi-image upload, text entity extraction, and .docx export for premium users. Designed to streamline document processing with fast, accurate text extraction.

azure login-system ocr ocr-python pillow python3 streamlit text-extraction

Last synced: 21 Aug 2025

https://github.com/xenoswarlocks/image_text_extractor

A Python-based tool for batch processing and extracting text from images using OCR (Tesseract). The extracted text is cleaned by removing unwanted terms, and potential names are identified and formatted. Results are saved in a structured text file for easy reference. Ideal for automating data extraction and preprocessing tasks.

ocr-python ocr-recognition pytesseract-ocr python unittest

Last synced: 30 Nov 2025

https://github.com/furkankhann/screenai

An AI-powered clipboard tool that fetches responses from models like Gemini instantly when text is copied. Plans include integrating OCR for handling non-copiable text and enhancing the UI for a seamless experience.

artificial-intelligence chatgpt gemini large-language-models natural-language-understanding ocr-python ocr-recognition open-source python

Last synced: 10 Sep 2025

https://github.com/imprvhub/ocrsense-python

An optical character recognition Python web app.

flask ocr ocr-python ocr-recognition python

Last synced: 11 Jun 2025

https://github.com/intummadee/visionid-check

Applies computer vision to check students in for exams by scanning their cards instead of collecting signatures, and displays the students' names and exam entry status. 💳

bootstrap computer-vision django-admin django-framework django-project image-processing jquery mongodb mongodb-atlas ocr ocr-python opencv pandas pillow python tesseract tesseract-ocr

Last synced: 03 Nov 2025

https://github.com/y1d1r/pyfacture

PyFacture is a Python project designed to automate expense management from receipts. The application utilizes image processing techniques and Optical Character Recognition (OCR) using Tesseract and Llama3.2-vision to extract relevant information from a photo of a receipt.

image-procesing lamma ocr-python opencv python tesseract

Last synced: 17 Mar 2025

https://github.com/blackmonk13/wordament_solver

A simple tool to help you find words in Wordament puzzles using OpenCV.

image-processing ocr ocr-python ocr-recognition opencv tesseract-ocr trie-data-structure wordament wordament-solver

Last synced: 17 Jun 2025

https://github.com/bnvulpe/code-extractor

Transforming images into code at a click. Upload a photo or screenshot and copy the code to your script in seconds!

code-extraction cv2 image-processing-python ocr ocr-python open-source optical-character-recognition python streamlit-application tesseract-ocr text-retrieval

Last synced: 12 Mar 2025

https://github.com/takk8is/thearchivistlens

This system implements the complete Reinert Method of Descending Hierarchical Classification (DHC), offering all IRaMuTeQ functionalities

analysis davidccavalcante iramuteq llm machine-learning ocr ocr-python ocr-recognition ocr-text-reader python reinert takk-ag takk-design takk8is

Last synced: 10 Oct 2025

https://github.com/victorharbo/ocr_workshop

This repository is the material for the OCR workshop at AU Library, Nobelparken.

ocr-python ocr-recognition tesseract

Last synced: 10 Oct 2025

https://github.com/snakers4/silero-ocr

Nice, clean and minimalistic OCR pipeline for Russian and English.

english english-ocr ocr ocr-python ocr-recognition onnx onnxruntime optical-character-recognition russian russian-language

Last synced: 30 Mar 2025

https://github.com/abstra-app/template-quote-proposal

Quote Proposal Workflow with AI automatic quotations based on a proposal + PDF Generation with quotation + Email notification to the customer.

ocr ocr-python ocr-recognition pdf-generator python-workflows quote-generator quote-proposal sales sales-automation

Last synced: 06 Apr 2025

https://github.com/zephyrusblaze/studybuddy-ai

StudyBuddy is an AI-powered web app that helps students summarize notes, generate practice questions, and get answers to specific study material queries. It supports PDFs, images, and text files, making learning more efficient and interactive.

flask gemini-pro html-css-javascript ocr-python python study-project

Last synced: 24 Jul 2025

https://github.com/kshrugalj/lex-med

This is for my LexMed project that I had done.

mupdf ocr-python python tesseract-ocr

Last synced: 16 Aug 2025

https://github.com/riccardogiorato/together-ai-vision-examples

Together AI SDK Vision and OCR examples in TypeScript and Python

ocr ocr-python together-ai togetherai vision-api

Last synced: 29 Jun 2025

https://github.com/sdam-au/ocr

Experiments with OCR using Python.

ocr ocr-python pymupdf pytesseract tesseract tool

Last synced: 15 May 2025

https://github.com/harika534/fas

A tool that detects and corrects faulty uploaded images using OpenCV and OCR technologies. Useful when uploading data to identity verification portals.

html-css-javascript ocr-python opencv-python pyhton

Last synced: 01 Nov 2025

https://github.com/tjkessler/tesseract-positional

Tool to save positional OCR data to a text file

ocr-python ocr-recognition ocr-text-reader tesseract tesseract-ocr

Last synced: 15 Mar 2025

https://github.com/jwinman91/ai-ocr-frontend

An AI-powered, but model-agnostic (Optical-Character-Recognition) OCR tool (frontend)

genai ocr-python ocr-recognition python3 streamlit

Last synced: 15 Mar 2025

https://github.com/imprvhub/ocr-sense

An optical character recognition Python web app.

flask ocr-python ocr-recognition optical-character-recognition python vercel-deployment

Last synced: 04 Mar 2025

https://github.com/pchaparro/ocr-practices

A collection of practices about OCR to transform scanned and handwritten documents to digital formats (xlsx, csv, txt, etc.)

ocr ocr-python python

Last synced: 24 Mar 2025

https://github.com/ajxv/pyocr-flask

PDF OCR text extraction using Python

ocr-python pdf-text pytesseract python

Last synced: 24 Dec 2025

https://github.com/jonanv/ocr-python

Optical Character Recognition Python

ocr ocr-python ocr-recognition python python3

Last synced: 10 Jun 2025

https://github.com/mathstava/jk_tech-user-doc-management

User and Document Management is a robust NestJS backend solution designed for efficient user and document handling. This repository includes features like JWT authentication, PostgreSQL integration, and comprehensive testing to ensure reliability. 🐙📄

axios bcryptjs jest jwt multer nestjs nodejs ocr-python passport-jwt pdf-parse postgresql typeorm typescript virustotal

Last synced: 02 Jul 2025

https://github.com/pranav-0309/ocr_model_dc

OCR model to extract a primary and a secondary ID, for each image-insurance type pair.

jupyter-notebook ocr ocr-python ocr-recognition python3 pytorch

Last synced: 29 Mar 2025

https://github.com/tsvetang2/advanced-local-ocr

Advanced local OCR is a project inspired by the text extraction that some AI services perform. Instead of leaving people to pay for such services, it offers an open-source version that preserves each user's privacy. The app allows integration with LLMs via APIs.

desktop-app easyocr gui-application image-processing llms local-first ocr ocr-engine ocr-python ocr-text-reader privacy pyqt5 pyqt5-desktop-application tesseract text-cleaning text-recognition-from-image

Last synced: 03 Jul 2025

https://github.com/karanvishwakarma-1807/mnist-cnn-digit-recognition

Convolutional Neural Network (CNN) for handwritten digit recognition using the MNIST dataset with TensorFlow/Keras — a simple Optical Character Recognition (OCR) demo.

cnn-keras machine-learning mnist-handwriting-recognition ocr-python python tens

Last synced: 14 Jun 2025

https://github.com/pilarcode/receipt-ocr

Named entity recognition (NER). Extraction of features from images of receipts with different formats. #NER #OCR 🛒🏷️

flask-api ocr-python

Last synced: 28 Feb 2025

https://github.com/carlosacchi/captiocr

CaptiOCR - A real-time screen text extraction tool using Tesseract OCR. Capture, recognize, and log on-screen text dynamically. Future updates will include on-demand language installation, resizable selection areas, and live text overlays.

captions live logging ocr ocr-python ocr-recognition

Last synced: 28 Feb 2025

https://github.com/samuela31/sanskrit-manuscripts-revival-using-deep-learning-techniques

Restoring destroyed text in ancient Sanskrit manuscripts by predicting missing text using deep learning techniques. Mini project done in 3rd year of college using RoBERTa LLM, Tesseract OCR, and OpenCV.

hugging-face huggingface huggingface-datasets huggingface-transformers jupyter-notebook llm manuscripts-restoration mlm ocr ocr-python opencv restoration roberta roberta-fine-tuning roberta-model sanskrit sanskrit-manuscripts tesseract tesseract-ocr

Last synced: 18 Jul 2025

https://github.com/prathamesh-patil-5090/image_recognition

An image recognition project that leverages deep learning techniques to classify and analyze images. The model is built using Python and TensorFlow/Keras, with a focus on recognizing and categorizing objects from various image datasets.

django ocr ocr-python python

Last synced: 19 Oct 2025

https://github.com/moha-cm/bizcardx

BizCardX: Extracting Business Card Data with OCR

data-extraction mariadb-database ocr ocr-python sql sqlalchemy streamlit-dashboard

Last synced: 27 Feb 2025