Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

https://github.com/pymupdf/PyMuPDF

PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.

data-science epub extract-data font mupdf ocr pdf pdf-documents pymupdf python table-extraction tesseract text-processing text-shaping xps

Last synced: 03 Jul 2024

https://github.com/lvgithub/blog

Day-to-day collection of technical notes (submissions welcome)

c chrome-extension http linux nodejs ocr python3 tools

Last synced: 02 Jul 2024

https://github.com/wzx54321/LockDemo

Fingerprint recognition, pattern recognition, aliOCR recognition

demo fingerprint fingerprint-lock gesture-lock gesturelock ocr pattern pattern-lock

Last synced: 02 Jul 2024

https://github.com/RD17/ambar

:mag: Ambar: Document Search Engine

ambar ambar-search ocr pdf search search-engine search-in-text self-hosted

Last synced: 02 Jul 2024

https://github.com/ZGGSONG/STranslate

A ready-to-use, ready-to-go translation and OCR tool developed with WPF

baidu-api bing deepl gemini mvvm ocr openai paddleocr tts wpf

Last synced: 02 Jul 2024

https://github.com/datasciencecampus/readpyne

Toolkit for extracting relevant lines from receipts or similar image data.

dsc-projects extraction ocr receipts research

Last synced: 01 Jul 2024

https://github.com/TalkUHulk/ai.deploy.box

A toolbox for deep learning model deployment using C++ YoloX | YoloV7 | YoloV8 | Gan | OCR | MobileVit | Scrfd | MobileSAM | StableDiffusion

controlnet cpp face gan lora mnn mobilesam ncnn ocr onnx paddlelite scrfd stablediffusion tnn webassembly yolov7 yolov8 yolox

Last synced: 01 Jul 2024

https://github.com/aim-uofa/AdelaiDet

AdelaiDet is an open source toolbox for multiple instance-level detection and recognition tasks.

abcnet adelaidet blendmask boxinst condinst densecl fcos instance-segmentation meinst object-detection ocr solo solov2 text-detection text-recognition

Last synced: 30 Jun 2024

https://github.com/Masao-Taketani/FOTS_OCR

TensorFlow Implementation of FOTS, Fast Oriented Text Spotting with a Unified Network.

computer-vision deep-learning image-recognition ocr scene-text-recognition tensorflow

Last synced: 30 Jun 2024

https://github.com/fcakyon/craft-text-detector

Packaged, Pytorch-based, easy to use, cross-platform version of the CRAFT text detector

actions anaconda computer-vision craft deep-learning document hacktoberfest linux macos neural-network ocr pypi python pytorch text text-detection vision windows workflow

Last synced: 30 Jun 2024

https://github.com/liuheng92/tensorflow_PSENet

This is a TensorFlow re-implementation of PSENet: Shape Robust Text Detection with Progressive Scale Expansion Network. My blog:

cpp ocr psenet python tensorflow text-detection

Last synced: 30 Jun 2024

https://github.com/yizt/keras-ctpn

Keras re-implementation of the scene text detection network CTPN: "Detecting Text in Natural Image with Connectionist Text Proposal Network"; you are welcome to try it, follow it, and report issues...

ctpn deep-learning keras ocr text-detection

Last synced: 30 Jun 2024

https://github.com/shinjayne/textboxes

Textboxes implementation with Tensorflow (python)

ocr

Last synced: 30 Jun 2024

https://github.com/YongWookHa/craft-text-detector

CRAFT text detector for high resolution image

craft high-resolution ocr pytorch pytorch-lightning text-detector

Last synced: 30 Jun 2024

https://github.com/songdejia/EAST

This is a pytorch re-implementation of EAST: An Efficient and Accurate Scene Text Detector.

deeplearning east icdar ocr pytorch textdetection

Last synced: 30 Jun 2024

https://github.com/HusseinYoussef/Arabic-OCR

OCR system for Arabic language that converts images of typed text to machine-encoded text.

arabic character-segmentation computer-vision dataset image-processing machine-learning neural-network ocr opencv-python scikit-learn segmentation

Last synced: 30 Jun 2024

https://github.com/clovaai/TedEval

TedEval: A Fair Evaluation Metric for Scene Text Detectors

evaluation icdar ocr ocr-detection ocr-evaluation scene-text-detectors tedeval text-detection text-detectors

Last synced: 30 Jun 2024

https://github.com/qurator-spk/sbb_binarization

Document Image Binarization

binarization ocr qurator

Last synced: 30 Jun 2024

https://github.com/deepinsight/insightocr

MXNet OCR implementation. Including text recognition and detection.

crnn mxnet ocr text-recognition

Last synced: 30 Jun 2024

https://github.com/yinchangchang/ocr_densenet

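First place in the first Xi'an Jiaotong University Artificial Intelligence Practice Competition (2018 AI Practice Competition: image text recognition); uses only DenseNet to recognize the text in images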

densenet ocr ocr-recognition python pytorch

Last synced: 30 Jun 2024

https://github.com/watsonyanghx/CNN_LSTM_CTC_Tensorflow

CNN+LSTM+CTC based OCR implemented using tensorflow.

cnn ctc lstm ocr tensorflow

Last synced: 30 Jun 2024

https://github.com/maxim2266/go-ocr

A tool for extracting text from scanned documents (via OCR), with user-defined post-processing.

extract-images go ocr scanned-documents

Last synced: 30 Jun 2024

https://github.com/courao/ocr.pytorch

An OCR project implemented purely in PyTorch, including text detection and recognition

crnn ctpn ocr ocr-pytorch text-detection text-recognition

Last synced: 30 Jun 2024

https://github.com/BowieHsu/tensorflow_ocr

OCR detection implemented with TensorFlow v1.4

ocr tensorflow

Last synced: 30 Jun 2024

https://github.com/LanguageMachines/PICCL

A set of workflows for corpus building through OCR, post-correction and normalisation

computational-linguistics corpus-linguistics corpus-tools folia nlp ocr workflow

Last synced: 30 Jun 2024

https://github.com/FLming/CRNN.tf2

Convolutional Recurrent Neural Network(CRNN) for End-to-End Text Recognition - TensorFlow 2

crnn ctc keras ocr scene-text-recognition tensorflow-lite tensorflow2 tf2

Last synced: 30 Jun 2024

https://github.com/weinman/cnn_lstm_ctc_ocr

Tensorflow-based CNN+LSTM trained with CTC-loss for OCR

convolutional-neural-networks ctc lstm ocr tensorflow text-recognition

Last synced: 30 Jun 2024

https://github.com/cseas/ocr-table

Extract tables from scanned image PDFs using Optical Character Recognition.

extract-tables ocr ocr-table optical-character-recognition pdfminer python scanned-image-pdfs shell tesseract

Last synced: 30 Jun 2024

https://github.com/Breta01/handwriting-ocr

OCR software for recognition of handwritten text

handwriting-ocr machine-learning ocr opencv python recognition tensorflow

Last synced: 30 Jun 2024

https://github.com/ExtractTable/ExtractTable-py

Python library to extract tabular data from images and scanned PDFs

extracttable image-table-recognition ocr pdf-table-extract table-extraction tabular-data

Last synced: 30 Jun 2024

https://github.com/githubharald/SimpleHTR

Handwritten Text Recognition (HTR) system implemented with TensorFlow.

deep-learning handwritten-text-recognition machine-learning ocr recurrent-neural-networks tensorflow

Last synced: 30 Jun 2024

https://github.com/githubharald/WordDetector

Detect handwritten words (classic image processing based method).

detector handwriting-recognition ocr segmentation text-detection

Last synced: 30 Jun 2024

https://github.com/qurator-spk/sbb_textline_detection

Detect textlines in document images

ocr qurator textline-segmentation

Last synced: 30 Jun 2024

https://github.com/bgshih/aster

Recognizing cropped text in natural images.

computer-vision ocr recognition scene-text

Last synced: 30 Jun 2024

https://github.com/DetectionTeamUCAS/R2CNN_Faster-RCNN_Tensorflow

Rotational region detection based on Faster-RCNN.

dota face faster-rcnn ocr r2cnn remote-sensing tensorflow

Last synced: 30 Jun 2024

https://github.com/beacandler/R2CNN

caffe re-implementation of R2CNN: Rotational Region CNN for Orientation Robust Scene Text Detection

caffe deep-learning ocr scene-text-detection

Last synced: 30 Jun 2024

https://github.com/Media-Smart/vedastr

A scene text recognition toolbox based on PyTorch

ocr ocr-recognition pytorch scene-text-recognition text-recognition transformer

Last synced: 30 Jun 2024

https://github.com/thomasjhuang/deep-learning-for-document-dewarping

An application of high resolution GANs to dewarp images of perturbed documents

document gan ocr ocr-recognition pix2pix

Last synced: 30 Jun 2024

https://github.com/gpadmaku1/card-reader

Card Reader Android Application. Scan business cards and directly save the details to your Google Contacts.

android-application business-card google-contacts google-libphonenumber google-vision-api ocr textrecognition

Last synced: 30 Jun 2024

https://github.com/kdzwinel/JS-OCR-demo

JavaScript optical character recognition demo

demo javascript ocr

Last synced: 30 Jun 2024

https://github.com/Azornes/ocrTranslator

Convert captured images to text using BaiduOCR, GoogleOCR, WindowsOCR, tesseractOCR, RapidOCR or Capture2Text, and translate the resulting text using Google, Chatgpt, Edgegpt, DeepL or many more. Desktop application with a nice GUI provided by customtkinter.

baidu-ocr bing capture2text chatgpt deepl edgegpt google-ocr google-translate ocr python rapidocr tesseract-ocr translator windows-ocr

Last synced: 29 Jun 2024

https://github.com/kha-white/manga-ocr

Optical character recognition for Japanese text, with the main focus being Japanese manga

comics computer-vision deep-learning japanese manga ocr transformers

Last synced: 29 Jun 2024

https://github.com/prabhakar267/image2text

:clipboard: Python wrapper to grab text from images and save as text files using Tesseract Engine

image2text ocr optical-character-recognition python-wrapper tesseract tesseract-engine tesseract-installation tesseract-ocr

Last synced: 27 Jun 2024

https://github.com/stafazzoli/FarsiOCR

An OCR application for Farsi/ Persian documents.

farsi ocr python tesseract tesseract-ocr

Last synced: 27 Jun 2024

https://github.com/krviolent/subtitles_extract

Tool for extracting hard-coded (hardsub) Chinese subtitles from video files with 720p resolution

chinese chinese-translation easyocr machine-learning ocr python srt-subtitles subtitles video

Last synced: 27 Jun 2024

https://github.com/SWHL/RapidVideOCR

Extract video hard subtitles and automatically generate corresponding srt files.

ocr subtitle video videosubfinder

Last synced: 27 Jun 2024

https://github.com/elahe-dastan/first-grade

OCR for Iranian national ID card, etc.

cnn deep-learning machine-learning ocr pytorch

Last synced: 27 Jun 2024

https://github.com/Belval/TextRecognitionDataGenerator

A synthetic data generator for text recognition

data dataset fake ocr synthetic text text-recognition training-set-generator

Last synced: 27 Jun 2024

https://github.com/AgentMaker/AgentOCR

An easy-to-use OCR project with multilingual support.

easy-deploy multilingual ocr onnx

Last synced: 27 Jun 2024

https://github.com/Kocarus/Manga-Translator-TesseractOCR

Automatically translates manga pages with Tesseract-OCR and Google Translate API for Python

google-translate-api manga ocr opencv-python pytesseract python27 tesseract-ocr translator

Last synced: 27 Jun 2024

https://github.com/hrishikeshrt/google_drive_ocr

Perform OCR using Google's Drive API v3

command-line-tool drive-api google-ocr multiprocessing ocr python

Last synced: 27 Jun 2024

https://github.com/Coolshanlan/HighlightTranslator

Highlight Translator helps you translate words quickly and accurately. Just highlight, copy, or screenshot the content you want to translate anywhere on your computer (e.g. PDF, PPT, Word), and the translated results are displayed automatically.

cambridge-dictionary cambridge-translate-api capture copy google-api google-translate google-translate-api highlight ocr python screenshot tesseract-ocr translate translator

Last synced: 27 Jun 2024

https://github.com/HamidRezaAttar/Persian-OCR-Streamlit

Persian OCR allows users to scan documents and extract text from scanned images.

ocr ocr-python ocr-recognition persian persian-ocr

Last synced: 27 Jun 2024

https://github.com/bgshih/crnn

Convolutional Recurrent Neural Network (CRNN) for image-based sequence recognition.

computer-vision machine-learning ocr sequence-recognition torch7

Last synced: 27 Jun 2024

https://github.com/argman/EAST

A tensorflow implementation of EAST text detector

deep-learning ocr tensorflow text-detection

Last synced: 27 Jun 2024

https://github.com/dengdan/seglink

An implementation of the SegLink algorithm from the paper "Detecting Oriented Text in Natural Images by Linking Segments"

ocr robust-reading text-detection

Last synced: 27 Jun 2024

https://github.com/MhLiao/TextBoxes_plusplus

TextBoxes++: A Single-Shot Oriented Scene Text Detector

ocr scene-text scene-text-detection scene-text-recognition

Last synced: 27 Jun 2024

https://github.com/mindee/react-mindee-js

Front-End Computer Vision SDK for React

computer-vision javascript ocr react reactjs sdk

Last synced: 27 Jun 2024

https://github.com/theos-ai/easy-paddle-ocr

This is a clean and easy-to-use implementation of Paddle OCR. Made with ❤️ by Theos AI.

custom-ocr license-plate-recognition machine-learning ocr optical-character-recognition paddle paddleocr paddlepaddle python text-recognition

Last synced: 27 Jun 2024

https://github.com/epsylon/cintruder

Captcha Intruder (CIntrud3r) is an automatic pentesting tool to bypass captchas.

bruteforcer captcha cintruder ocr pentesting

Last synced: 26 Jun 2024

https://github.com/AnyListen/tools-ocr

Tree Hole OCR text recognition (a small cross-platform OCR tool)

cross-platform javafx mac ocr screenshot windows

Last synced: 26 Jun 2024

https://github.com/entropy2333/awesome-key-information-extraction

A curated list of papers about key information extraction.

kie ocr

Last synced: 25 Jun 2024

https://github.com/card-io/card.io-Android-source

The open-source code for the card.io-Android-SDK: provides fast, easy credit card scanning in mobile apps

android card-scanning credit-card ocr sdk

Last synced: 25 Jun 2024

https://github.com/aheze/OpenFind

An app to find text in real life.

app camera find hacktoberfest ios ocr photos realm swift swiftui uikit vision

Last synced: 25 Jun 2024

https://github.com/cloudy-sfu/GUI-for-paddlepaddle-OCR

The GUI for "paddlepaddle" OCR

ocr paddleocr paddlepaddle pyqt5 python

Last synced: 24 Jun 2024

https://github.com/WZBSocialScienceCenter/pdftabextract

A set of tools for extracting tables from PDF files helping to do data mining on (OCR-processed) scanned documents.

data-mining image-processing ocr pdf python tables

Last synced: 24 Jun 2024

https://github.com/hikopensource/DAVAR-Lab-OCR

OCR toolbox from Davar-Lab

dar ocr

Last synced: 24 Jun 2024

https://github.com/siyuan-note/siyuan

A privacy-first, self-hosted, fully open source personal knowledge management software, written in typescript and golang.

anki chatgpt electron evernote knowledge-base local-first markdown note-taking notebook notes-app notion obsidian ocr openai pdf pkm s3 self-hosted webdav

Last synced: 24 Jun 2024

https://github.com/altomator/ALTO-HTML

Conversion of ALTO files (including tags) to HTML

alto-xml ocr

Last synced: 24 Jun 2024

https://github.com/ARBML/Calliar

A dataset for online Arabic calligraphy. A collection of 2500 annotated calligraphic styles.

arabic calliar calligraphy ocr online-handwritten

Last synced: 24 Jun 2024

https://github.com/cv-small-snails/Awesome-Table-Recognition

A curated list of resources dedicated to table recognition

dataset ocr ocr-recognition papers papers-with-code table-recognition

Last synced: 22 Jun 2024

https://github.com/zelon88/HRCloud2

A full-featured home hosted Cloud Drive, Personal Assistant, App Launcher, File Converter, Streamer, Share Tool & More!

antivirus applauncher cloud-drive cloud-platform cloud-storage cms editor enterprise file-converter nextcloud ocr owncloud paas personal-assistants security self-hosted server share-tool streamer wordpress

Last synced: 19 Jun 2024

https://github.com/dmMaze/BallonsTranslator

Yet another computer-aided comic/manga translation tool powered by deep learning; supports one-click machine translation and simple image/text editing

anime auto-translation chinese-translation comics computer-aided-translation computer-vision deep-learning inpainting manga ocr pyqt pyqt6 pytorch qt qt6 scene-text-detection

Last synced: 16 Jun 2024

https://github.com/openrecall/openrecall

OpenRecall is a fully open-source, privacy-first alternative to proprietary solutions like Microsoft's Windows Recall. With OpenRecall, you can easily access your digital history, enhancing your memory and productivity without compromising your privacy.

ai alternative history macos ocr open-source privacy python recall search self-hosted semantic windows

Last synced: 16 Jun 2024

https://github.com/jyp-studio/Invoice_detection

This is an AI model for detecting and recognizing invoice information by yolov5 and OCR.

detection object-detection ocr pytesseract recognition yolo yolov5

Last synced: 15 Jun 2024

https://github.com/008karan/PAN_OCR

Building OCR using YOLO and Tesseract

deep-learning ocr tesseract yolo

Last synced: 15 Jun 2024

https://github.com/xuguodong1999/COCR

OCR/OCSR on handwritten ⏣/chemical structural formulas with YOLO & CRNN models.

chemical-formula chemistry handwriting-recognition ocr ocsr

Last synced: 15 Jun 2024

https://github.com/pannous/tensorflow-ocr

🖺 OCR using tensorflow with attention

ocr tensorflow

Last synced: 15 Jun 2024

https://github.com/breezedeus/Pix2Text

An Open-Source Python3 tool for recognizing layouts, tables, math formulas (LaTeX), and text in images, converting them into Markdown format. A free alternative to Mathpix, empowering seamless conversion of visual content into text-based representations. 80+ languages are supported.

image-to-markdown latex latex-pdf layout-analysis math-formula math-formula-recognition math-ocr mathpix ocr python pytorch table-ocr

Last synced: 14 Jun 2024

https://github.com/zhoubear/open-paperless

Scan, index, and archive all of your paper documents (acquired by Mayan EDMS)

documents groupware ocr office paperless pdf scanner

Last synced: 14 Jun 2024

https://github.com/YangDai2003/CopilotOCR-Android

Fast OCR, multi-language support, sleek design. Scan, edit, and share text effortlessly. Smart barcode and QR code scanning.

android android-ocr-application java ocr ocr-android ocr-text-reader

Last synced: 14 Jun 2024