Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/pymupdf/PyMuPDF
PyMuPDF is a high performance Python library for data extraction, analysis, conversion & manipulation of PDF (and other) documents.
data-science epub extract-data font mupdf ocr pdf pdf-documents pymupdf python table-extraction tesseract text-processing text-shaping xps
Last synced: 03 Jul 2024
https://github.com/lvgithub/blog
Day-to-day collection of technical notes (contributions welcome)
c chrome-extension http linux nodejs ocr python3 tools
Last synced: 02 Jul 2024
https://github.com/wzx54321/LockDemo
Fingerprint recognition, pattern recognition, and aliOCR recognition
demo fingerprint fingerprint-lock gesture-lock gesturelock ocr pattern pattern-lock
Last synced: 02 Jul 2024
https://github.com/RD17/ambar
:mag: Ambar: Document Search Engine
ambar ambar-search ocr pdf search search-engine search-in-text self-hosted
Last synced: 02 Jul 2024
https://github.com/datasciencecampus/readpyne
Toolkit for extracting relevant lines from receipts or similar image data.
dsc-projects extraction ocr receipts research
Last synced: 01 Jul 2024
https://github.com/TalkUHulk/ai.deploy.box
A toolbox for deep learning model deployment using C++ YoloX | YoloV7 | YoloV8 | Gan | OCR | MobileVit | Scrfd | MobileSAM | StableDiffusion
controlnet cpp face gan lora mnn mobilesam ncnn ocr onnx paddlelite scrfd stablediffusion tnn webassembly yolov7 yolov8 yolox
Last synced: 01 Jul 2024
https://github.com/autonise/CRAFT-Remade
Implementation of CRAFT Text Detection
craft detection ocr pytorch pytorch-implementation text-detection weak-supervision
Last synced: 30 Jun 2024
https://github.com/aim-uofa/AdelaiDet
AdelaiDet is an open source toolbox for multiple instance-level detection and recognition tasks.
abcnet adelaidet blendmask boxinst condinst densecl fcos instance-segmentation meinst object-detection ocr solo solov2 text-detection text-recognition
Last synced: 30 Jun 2024
https://github.com/Masao-Taketani/FOTS_OCR
TensorFlow Implementation of FOTS, Fast Oriented Text Spotting with a Unified Network.
computer-vision deep-learning image-recognition ocr scene-text-recognition tensorflow
Last synced: 30 Jun 2024
https://github.com/fcakyon/craft-text-detector
Packaged, Pytorch-based, easy to use, cross-platform version of the CRAFT text detector
actions anaconda computer-vision craft deep-learning document hacktoberfest linux macos neural-network ocr pypi python pytorch text text-detection vision windows workflow
Last synced: 30 Jun 2024
https://github.com/HaozhengLi/EAST_ICPR
Forked from argman/EAST for the ICPR MTWI 2018 CHALLENGE
detect detecting detection detector east icdar icdar2015 icpr icpr2018 mtwi mtwi2018 ocr tensorflow text text-detect text-detecting text-detection text-detector
Last synced: 30 Jun 2024
https://github.com/liuheng92/tensorflow_PSENet
This is a TensorFlow re-implementation of PSENet: Shape Robust Text Detection with Progressive Scale Expansion Network.
cpp ocr psenet python tensorflow text-detection
Last synced: 30 Jun 2024
https://github.com/yizt/keras-ctpn
Keras reproduction of the scene text detection network CTPN, from the paper "Detecting Text in Natural Image with Connectionist Text Proposal Network"; feel free to try it, follow the project, and report issues.
ctpn deep-learning keras ocr text-detection
Last synced: 30 Jun 2024
https://github.com/shinjayne/textboxes
Textboxes implementation with Tensorflow (python)
Last synced: 30 Jun 2024
https://github.com/YongWookHa/craft-text-detector
CRAFT text detector for high resolution image
craft high-resolution ocr pytorch pytorch-lightning text-detector
Last synced: 30 Jun 2024
https://github.com/songdejia/EAST
This is a pytorch re-implementation of EAST: An Efficient and Accurate Scene Text Detector.
deeplearning east icdar ocr pytorch textdetection
Last synced: 30 Jun 2024
https://github.com/HusseinYoussef/Arabic-OCR
OCR system for Arabic language that converts images of typed text to machine-encoded text.
arabic character-segmentation computer-vision dataset image-processing machine-learning neural-network ocr opencv-python scikit-learn segmentation
Last synced: 30 Jun 2024
https://github.com/clovaai/TedEval
TedEval: A Fair Evaluation Metric for Scene Text Detectors
evaluation icdar ocr ocr-detection ocr-evaluation scene-text-detectors tedeval text-detection text-detectors
Last synced: 30 Jun 2024
https://github.com/qurator-spk/sbb_binarization
Document Image Binarization
Last synced: 30 Jun 2024
https://github.com/deepinsight/insightocr
MXNet OCR implementation. Including text recognition and detection.
crnn mxnet ocr text-recognition
Last synced: 30 Jun 2024
https://github.com/yinchangchang/ocr_densenet
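First place in the first Xi'an Jiaotong University Artificial Intelligence Practice Competition (2018 AI Practice Competition, image text recognition); recognizes text in images using only DenseNet.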
densenet ocr ocr-recognition python pytorch
Last synced: 30 Jun 2024
https://github.com/watsonyanghx/CNN_LSTM_CTC_Tensorflow
CNN+LSTM+CTC based OCR implemented using tensorflow.
Last synced: 30 Jun 2024
https://github.com/maxim2266/go-ocr
A tool for extracting text from scanned documents (via OCR), with user-defined post-processing.
extract-images go ocr scanned-documents
Last synced: 30 Jun 2024
https://github.com/courao/ocr.pytorch
A pure PyTorch OCR project covering text detection and recognition
crnn ctpn ocr ocr-pytorch text-detection text-recognition
Last synced: 30 Jun 2024
https://github.com/open-mmlab/mmocr
OpenMMLab Text Detection, Recognition and Understanding Toolbox
abcnet abinet crnn dbnet deep-learning fcenet key-information-extraction maskrcnn ocr pan panet psenet pytorch sar sdmg-r segmentation-based-text-recognition spts svtr text-detection text-recognition
Last synced: 30 Jun 2024
https://github.com/BowieHsu/tensorflow_ocr
OCR detection implemented with TensorFlow v1.4
Last synced: 30 Jun 2024
https://github.com/LanguageMachines/PICCL
A set of workflows for corpus building through OCR, post-correction and normalisation
computational-linguistics corpus-linguistics corpus-tools folia nlp ocr workflow
Last synced: 30 Jun 2024
https://github.com/FLming/CRNN.tf2
Convolutional Recurrent Neural Network(CRNN) for End-to-End Text Recognition - TensorFlow 2
crnn ctc keras ocr scene-text-recognition tensorflow-lite tensorflow2 tf2
Last synced: 30 Jun 2024
https://github.com/weinman/cnn_lstm_ctc_ocr
Tensorflow-based CNN+LSTM trained with CTC-loss for OCR
convolutional-neural-networks ctc lstm ocr tensorflow text-recognition
Last synced: 30 Jun 2024
https://github.com/cseas/ocr-table
Extract tables from scanned image PDFs using Optical Character Recognition.
extract-tables ocr ocr-table optical-character-recognition pdfminer python scanned-image-pdfs shell tesseract
Last synced: 30 Jun 2024
https://github.com/Breta01/handwriting-ocr
OCR software for recognition of handwritten text
handwriting-ocr machine-learning ocr opencv python recognition tensorflow
Last synced: 30 Jun 2024
https://github.com/ExtractTable/ExtractTable-py
Python library to extract tabular data from images and scanned PDFs
extracttable image-table-recognition ocr pdf-table-extract table-extraction tabular-data
Last synced: 30 Jun 2024
https://github.com/githubharald/SimpleHTR
Handwritten Text Recognition (HTR) system implemented with TensorFlow.
deep-learning handwritten-text-recognition machine-learning ocr recurrent-neural-networks tensorflow
Last synced: 30 Jun 2024
https://github.com/githubharald/WordDetector
Detect handwritten words (classic image processing based method).
detector handwriting-recognition ocr segmentation text-detection
Last synced: 30 Jun 2024
https://github.com/qurator-spk/sbb_textline_detection
Detect textlines in document images
ocr qurator textline-segmentation
Last synced: 30 Jun 2024
https://github.com/bgshih/aster
Recognizing cropped text in natural images.
computer-vision ocr recognition scene-text
Last synced: 30 Jun 2024
https://github.com/DetectionTeamUCAS/R2CNN_Faster-RCNN_Tensorflow
Rotational region detection based on Faster-RCNN.
dota face faster-rcnn ocr r2cnn remote-sensing tensorflow
Last synced: 30 Jun 2024
https://github.com/beacandler/R2CNN
caffe re-implementation of R2CNN: Rotational Region CNN for Orientation Robust Scene Text Detection
caffe deep-learning ocr scene-text-detection
Last synced: 30 Jun 2024
https://github.com/HsiehYiChia/Scene-text-recognition
Scene text detection and recognition based on Extremal Regions (ER)
adaboost algorithm canny cascade-classifier chaincode classifier computer-vision detection image-processing lbp machine-learning mser non-maximum-suppression ocr opencv scene-text-detection scene-text-recognition spelling-checker svm text-recognition
Last synced: 30 Jun 2024
https://github.com/Media-Smart/vedastr
A scene text recognition toolbox based on PyTorch
ocr ocr-recognition pytorch scene-text-recognition text-recognition transformer
Last synced: 30 Jun 2024
https://github.com/jiangxiluning/MASTER-TF
MASTER
cv deep-learning ocr ocr-recognition scene-text-recognition transformer
Last synced: 30 Jun 2024
https://github.com/thomasjhuang/deep-learning-for-document-dewarping
An application of high resolution GANs to dewarp images of perturbed documents
document gan ocr ocr-recognition pix2pix
Last synced: 30 Jun 2024
https://github.com/gpadmaku1/card-reader
Card Reader Android Application. Scan business cards and directly save the details to your Google Contacts.
android-application business-card google-contacts google-libphonenumber google-vision-api ocr textrecognition
Last synced: 30 Jun 2024
https://github.com/kdzwinel/JS-OCR-demo
JavaScript optical character recognition demo
Last synced: 30 Jun 2024
https://github.com/bsoyka/advent-of-code-ocr
Convert Advent of Code ASCII art
advent-of-code advent-of-code-2016 advent-of-code-2019 hacktoberfest ocr
Last synced: 29 Jun 2024
https://github.com/Azornes/ocrTranslator
Convert captured images to text using BaiduOCR, GoogleOCR, WindowsOCR, tesseractOCR, RapidOCR or Capture2Text, and translate the resulting text using Google, ChatGPT, EdgeGPT, DeepL, and others. Desktop application with a GUI built on customtkinter.
baidu-ocr bing capture2text chatgpt deepl edgegpt google-ocr google-translate ocr python rapidocr tesseract-ocr translator windows-ocr
Last synced: 29 Jun 2024
https://github.com/kha-white/manga-ocr
Optical character recognition for Japanese text, with the main focus being Japanese manga
comics computer-vision deep-learning japanese manga ocr transformers
Last synced: 29 Jun 2024
https://github.com/HUYDGD/awesome-japanese
🎎 Japanese awesome lists about everything
awesome awesome-japan awesome-japanese awesome-list awesome-lists awesome-resources grammar huydgd huydgd-210s japanese japanese-kana japanese-language japanese-language-learners japanese-study learn learning list ocr tools vocabulary
Last synced: 29 Jun 2024
https://github.com/prabhakar267/image2text
:clipboard: Python wrapper to grab text from images and save as text files using Tesseract Engine
image2text ocr optical-character-recognition python-wrapper tesseract tesseract-engine tesseract-installation tesseract-ocr
Last synced: 27 Jun 2024
https://github.com/stafazzoli/FarsiOCR
An OCR application for Farsi/Persian documents.
farsi ocr python tesseract tesseract-ocr
Last synced: 27 Jun 2024
https://github.com/krviolent/subtitles_extract
Tool for extracting hard-coded (hardsub) Chinese subtitles from video files with 720p resolution
chinese chinese-translation easyocr machine-learning ocr python srt-subtitles subtitles video
Last synced: 27 Jun 2024
https://github.com/SWHL/RapidVideOCR
Extract video hard subtitles and automatically generate corresponding srt files.
ocr subtitle video videosubfinder
Last synced: 27 Jun 2024
https://github.com/elahe-dastan/first-grade
OCR for Iranian national ID card, etc.
cnn deep-learning machine-learning ocr pytorch
Last synced: 27 Jun 2024
https://github.com/Belval/TextRecognitionDataGenerator
A synthetic data generator for text recognition
data dataset fake ocr synthetic text text-recognition training-set-generator
Last synced: 27 Jun 2024
https://github.com/filyp/autocorrect
Spelling corrector in python
autocorrect autocorrection czech english languages levenshtein-distance multilanguage multilingual nlp ocr polish portuguese python russian spanish spellchecker spelling spelling-corrector turkish ukrainian
Last synced: 27 Jun 2024
https://github.com/AgentMaker/AgentOCR
An easy-to-use OCR project with multilingual support.
easy-deploy multilingual ocr onnx
Last synced: 27 Jun 2024
https://github.com/Kocarus/Manga-Translator-TesseractOCR
Automatically translates manga pages with Tesseract-OCR and Google Translate API for Python
google-translate-api manga ocr opencv-python pytesseract python27 tesseract-ocr translator
Last synced: 27 Jun 2024
https://github.com/hrishikeshrt/google_drive_ocr
Perform OCR using Google's Drive API v3
command-line-tool drive-api google-ocr multiprocessing ocr python
Last synced: 27 Jun 2024
https://github.com/Coolshanlan/HighlightTranslator
Highlight Translator helps you translate words quickly and accurately. Highlight, copy, or screenshot the content you want to translate anywhere on your computer (e.g., PDF, PPT, Word), and the translation is displayed automatically.
cambridge-dictionary cambridge-translate-api capture copy google-api google-translate google-translate-api highlight ocr python screenshot tesseract-ocr translate translator
Last synced: 27 Jun 2024
https://github.com/HamidRezaAttar/Persian-OCR-Streamlit
Persian OCR allows users to scan documents and extract text from scanned images.
ocr ocr-python ocr-recognition persian persian-ocr
Last synced: 27 Jun 2024
https://github.com/bgshih/crnn
Convolutional Recurrent Neural Network (CRNN) for image-based sequence recognition.
computer-vision machine-learning ocr sequence-recognition torch7
Last synced: 27 Jun 2024
https://github.com/argman/EAST
A tensorflow implementation of EAST text detector
deep-learning ocr tensorflow text-detection
Last synced: 27 Jun 2024
https://github.com/dengdan/seglink
An implementation of the SegLink algorithm from the paper "Detecting Oriented Text in Natural Images by Linking Segments"
ocr robust-reading text-detection
Last synced: 27 Jun 2024
https://github.com/MhLiao/TextBoxes_plusplus
TextBoxes++: A Single-Shot Oriented Scene Text Detector
ocr scene-text scene-text-detection scene-text-recognition
Last synced: 27 Jun 2024
https://github.com/mindee/react-mindee-js
Front-End Computer Vision SDK for React
computer-vision javascript ocr react reactjs sdk
Last synced: 27 Jun 2024
https://github.com/theos-ai/easy-paddle-ocr
This is a clean and easy-to-use implementation of Paddle OCR. Made with ❤️ by Theos AI.
custom-ocr license-plate-recognition machine-learning ocr optical-character-recognition paddle paddleocr paddlepaddle python text-recognition
Last synced: 27 Jun 2024
https://github.com/epsylon/cintruder
Captcha Intruder (CIntrud3r) is an automatic pentesting tool to bypass captchas.
bruteforcer captcha cintruder ocr pentesting
Last synced: 26 Jun 2024
https://github.com/AnyListen/tools-ocr
树洞 OCR (Treehole OCR) text recognition: a small cross-platform OCR tool
cross-platform javafx mac ocr screenshot windows
Last synced: 26 Jun 2024
https://github.com/kili-technology/awesome-datasets
A comprehensive list of annotated training datasets classified by use case.
annotation awesome-data-science awesome-datasets awesome-public-datasets corpora data dataset datasets document-processing entity-extraction entity-recognition ner nlp ocr open-datasets opendata opendatasets public-data public-dataset public-datasets
Last synced: 25 Jun 2024
https://github.com/entropy2333/awesome-key-information-extraction
A curated list of papers about key information extraction.
Last synced: 25 Jun 2024
https://github.com/card-io/card.io-Android-source
The open-source code for the card.io-Android-SDK: provides fast, easy credit card scanning in mobile apps
android card-scanning credit-card ocr sdk
Last synced: 25 Jun 2024
https://github.com/doo/scanbot-sdk-ios-spm
barcode document image-filter image-processing ios mrz ocr pdf qr-code scanner sdk
Last synced: 25 Jun 2024
https://github.com/cloudy-sfu/GUI-for-paddlepaddle-OCR
The GUI for "paddlepaddle" OCR
ocr paddleocr paddlepaddle pyqt5 python
Last synced: 24 Jun 2024
https://github.com/WZBSocialScienceCenter/pdftabextract
A set of tools for extracting tables from PDF files, intended to support data mining on (OCR-processed) scanned documents.
data-mining image-processing ocr pdf python tables
Last synced: 24 Jun 2024
https://github.com/qurator-spk/eynollah
Document Layout Analysis
document-layout-analysis ocr qurator segmentation textline-detection
Last synced: 24 Jun 2024
https://github.com/qurator-spk/dinglehopper
An OCR evaluation tool
alto alto-xml ocr ocr-d ocr-evaluation page page-xml qurator
Last synced: 24 Jun 2024
https://github.com/siyuan-note/siyuan
A privacy-first, self-hosted, fully open source personal knowledge management software, written in typescript and golang.
anki chatgpt electron evernote knowledge-base local-first markdown note-taking notebook notes-app notion obsidian ocr openai pdf pkm s3 self-hosted webdav
Last synced: 24 Jun 2024
https://github.com/altomator/EN-data_mining
Data Mining Historical Newspaper Metadata (METS/ALTO formats)
alto alto-xml basex data-mining digital-humanities digital-libraries digital-library metadata mets-xml ocr perl-script xml
Last synced: 24 Jun 2024
https://github.com/altomator/ALTO-HTML
Conversion of ALTO files (including tags) to HTML
Last synced: 24 Jun 2024
https://github.com/ARBML/Calliar
A dataset for online Arabic calligraphy. A collection of 2500 annotated calligraphic styles.
arabic calliar calligraphy ocr online-handwritten
Last synced: 24 Jun 2024
https://github.com/cv-small-snails/Awesome-Table-Recognition
A curated list of resources dedicated to table recognition
dataset ocr ocr-recognition papers papers-with-code table-recognition
Last synced: 22 Jun 2024
https://github.com/lizadaly/blackout
NaNoGenMo 2016 entry #2
blackout grammar nlp ocr tesseract-ocr tracery tracery-grammar
Last synced: 19 Jun 2024
https://github.com/zelon88/HRCloud2
A full-featured home hosted Cloud Drive, Personal Assistant, App Launcher, File Converter, Streamer, Share Tool & More!
antivirus applauncher cloud-drive cloud-platform cloud-storage cms editor enterprise file-converter nextcloud ocr owncloud paas personal-assistants security self-hosted server share-tool streamer wordpress
Last synced: 19 Jun 2024
https://github.com/ReceiptManager/receipt-manager-app
Receipt parser application written in dart.
android android-studio application flutter flutter-apps ocr ocr-recognition receipt receipt-parser receipt-scanner tesseract-ocr
Last synced: 18 Jun 2024
https://github.com/maxent-ai/ocrpy
OCR, Archive, Index and Search: Implementation agnostic OCR framework.
aws azure computer-vision cv deep-learning google-vision-api image-processing information-retrieval nlp ocr ocr-python python semantic-search tesseract-ocr transformers
Last synced: 17 Jun 2024
https://github.com/dmMaze/BallonsTranslator
Yet another computer-aided comic/manga translation tool powered by deep learning, with one-click machine translation and simple image/text editing
anime auto-translation chinese-translation comics computer-aided-translation computer-vision deep-learning inpainting manga ocr pyqt pyqt6 pytorch qt qt6 scene-text-detection
Last synced: 16 Jun 2024
https://github.com/openrecall/openrecall
OpenRecall is a fully open-source, privacy-first alternative to proprietary solutions like Microsoft's Windows Recall. With OpenRecall, you can easily access your digital history, enhancing your memory and productivity without compromising your privacy.
ai alternative history macos ocr open-source privacy python recall search self-hosted semantic windows
Last synced: 16 Jun 2024
https://github.com/jyp-studio/Invoice_detection
This is an AI model for detecting and recognizing invoice information using YOLOv5 and OCR.
detection object-detection ocr pytesseract recognition yolo yolov5
Last synced: 15 Jun 2024
https://github.com/008karan/PAN_OCR
Building OCR using YOLO and Tesseract
deep-learning ocr tesseract yolo
Last synced: 15 Jun 2024
https://github.com/xuguodong1999/COCR
OCR/OCSR on handwriting ⏣/chemical-structural-formulas with YOLO & CRNN models.
chemical-formula chemistry handwriting-recognition ocr ocsr
Last synced: 15 Jun 2024
https://github.com/SciPhi-AI/R2R
The framework for fast development and deployment of RAG backends.
artificial-intelligence chatbot data-pipelines deep-learning langchain large-language-models llama-index llm machine-learning ocr pdf question-answering retrieval retrieval-augmented-generation retrieval-systems search
Last synced: 15 Jun 2024
https://github.com/pannous/tensorflow-ocr
🖺 OCR using tensorflow with attention
Last synced: 15 Jun 2024
https://github.com/breezedeus/Pix2Text
An Open-Source Python3 tool for recognizing layouts, tables, math formulas (LaTeX), and text in images, converting them into Markdown format. A free alternative to Mathpix, empowering seamless conversion of visual content into text-based representations. 80+ languages are supported.
image-to-markdown latex latex-pdf layout-analysis math-formula math-formula-recognition math-ocr mathpix ocr python pytorch table-ocr
Last synced: 14 Jun 2024
https://github.com/jaeksoft/opensearchserver
Open-source Enterprise Grade Search Engine Software
crawler custom-search enterprise indexing java lucene ocr opensearchserver search search-engine synonyms webcrawler webcrawling
Last synced: 14 Jun 2024
https://github.com/YangDai2003/CopilotOCR-Android
Fast OCR, multi-language support, sleek design. Scan, edit, and share text effortlessly. Smart barcode and QR code scanning.
android android-ocr-application java ocr ocr-android ocr-text-reader
Last synced: 14 Jun 2024