Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

wanghaisheng-awesome-ocr

https://github.com/jiajunhua/wanghaisheng-awesome-ocr

Baidu API Store
Alibaba Cloud Marketplace
Tencent Cloud
Codes And Documents For OcrKing Api
Tesseract-OCR
Tesseract.js is a pure Javascript port of the popular Tesseract OCR engine.
tesseract is an R package providing bindings to Tesseract.
List of Tesseract add-ons including wrappers in different languages.
Ocular is a state-of-the-art historical OCR system.
sfhistory: Making a map of historical SF photos (library accompanying blog post 4)
ocropy: library accompanying paper 1, by Adnan Ul-Hasan
A small C++ implementation of LSTM networks, focused on OCR, by Adnan Ul-Hasan
End to end OCR system for Telugu. Based on Convolutional Neural Networks.
Telugu OCR framework using RNN, CTC in Theano & Python3.
Recurrent Neural Network and Long Short Term Memory (LSTM) with Connectionist Temporal Classification implemented in Theano. Includes a Toy training example.
implement CTC with keras? #383
mxnet and ocr
An OCR system based on Torch using LSTM/GRU-RNN and CTC, with reference to the work of rnnlib and clstm.
Pure JavaScript LSTM RNN implementation based on ocropus
'caffe-ocr - OCR with caffe deep learning framework' by pannous
An implementation of LSTM and CTC to recognize images without splitting
RNNSharp is a toolkit for deep recurrent neural networks that is widely used for many different kinds of tasks, such as sequence labeling. It is written in C# and based on .NET Framework 4.6 or above. RNNSharp supports many different types of RNNs, such as BPTT and LSTM RNN, forward and bi-directional RNNs, and RNN-CRF.
warp-ctc: A fast parallel implementation of CTC, on both CPU and GPU, by Baidu
Test mxnet with own trained model: use a trained network model to recognize digits, a small number of Chinese characters, and special characters (./ etc.), 210 classes in total
An expandable and scalable OCR pipeline
OpenOCR makes it simple to host your own OCR REST API.
OCRmyPDF uses Tesseract for OCR, and relies on its language packs.
OwncloudOCR uses tesseract OCR and OCRmyPDF for reading text from images and images in PDF files.
Nextcloud OCR (optical character recognition) processing for images and PDF with tesseract-ocr, OCRmyPDF and php-native message queueing for asynchronous processing. http://janis91.github.io/ocr/
Multi-label classification: end-to-end Chinese license plate recognition based on mxnet
Optical recognition of Chinese second-generation ID cards
SwiftOCR: Fast and simple OCR library written in Swift
Attention-OCR: Visual Attention based OCR
Added support for CTC in both Theano and Tensorflow along with image OCR example. #3436
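EasyPR is an open source Chinese license plate recognition system whose goal is to be a simple, efficient, and accurate license plate recognition library.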
Deep Embedded Clustering for OCR based on caffe
Deep Embedded Clustering for OCR based on MXNet
The minimum OCR server by Golang, and a tiny sample application of gosseract.
A comparison among different variants of the gradient descent algorithm. This script implements and visualizes the performance of the following algorithms, based on the MNIST hand-written digit recognition dataset:
A curated list of resources dedicated to scene text localization and recognition
Convolutional Recurrent Neural Network (CRNN) for image-based sequence recognition.
Implementation of the method proposed in the papers "TextProposals: a Text-specific Selective Search Algorithm for Word Spotting in the Wild" and "Object Proposals for Text Extraction in the Wild" (Gomez & Karatzas), 2016 and 2015 respectively.
Word Spotting and Recognition with Embedded Attributes http://www.cvc.uab.es/~almazan/index/projects/words-att/index.html
Part of eMOP: Franken+ tool for creating font training for Tesseract OCR engine from page images.
NOCR is an open source C++ software package for text recognition in natural scenes, based on OpenCV. The package consists of a library, console program and GUI program for text recognition.
An OpenCV-based OCR system that serves as a base for other projects. Uses Histogram of Oriented Gradients (HOG) to extract character features and Support Vector Machines as a classifier. It serves as a basis for other projects that require OCR functionality.
Recognize bib numbers from racing photos
Automatic License Plate Recognition library http://www.openalpr.com
Recognition of VIN codes on car windshields
Image Recognition for the Democracy Project with codes
Tools to be evaluated prior to integration into Newman
Text Recognition in Natural Images in Python
Scene text detection in natural images using tensorflow, with CRNN+CTC in keras/pytorch for variable-length Chinese OCR
A generative vision model that trains with high data efficiency and breaks text-based CAPTCHAs
STN-OCR: A single Neural Network for Text Detection and Text Recognition
Digit Segmentation and Recognition using OpenCV and MLP test
ctpn based on tensorflow
ctpn based on caffe
A Python/OpenCV-based scene detection program, using threshold/content analysis on a given video. http://pyscenedetect.readthedocs.org
Implementation of the seglink algorithm from the paper "Detecting Oriented Text in Natural Images by Linking Segments"
Arbitrary-Oriented Scene Text Detection via Rotation Proposals
Scene text detection in arbitrary orientations using rotated proposal boxes: Arbitrary-Oriented Scene Text Detection via Rotation Proposals
Seven Segment Optical Character Recognition
SVHN yolo-v2 digit detector
Reads Scene Text in Tilted orientation.
ocr, cnn+lstm (CTPN/CRNN) for image text detection
A stand alone character recognition micro-service with a RESTful API
Single Shot Text Detector with Regional Attention
gocr is a go based OCR module
GOCR is an optical character recognition program, released under the
UFOCR (User-Friendly OCR). It is a YAGF fork: https://github.com/andrei-b/YAGF Supported input formats: PDF, TIFF, JPEG, PNG, BMP, PBM, PGM, PPM, XBM, XPM.
Paper 1: Can we build language-independent OCR using LSTM networks, by Adnan Ul-Hasan
Adnan Ul-Hasan's PhD thesis
Applying OCR Technology for Receipt Recognition
An End-to-End Trainable Neural Network for Image-based Sequence Recognition and Its Application to Scene Text Recognition
Reading Scene Text in Deep Convolutional Sequences
What You Get Is What You See: A Visual Markup Decompiler
Recursive Recurrent Nets with Attention Modeling for OCR in the Wild
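ICML 2016: "Unsupervised Deep Embedding for Clustering Analysis". Uses a DNN to map the data space to a latent feature space for clustering; the objective minimizes the KL divergence between the soft assignments and an auxiliary distribution, optimized iteratively. The idea is similar to t-SNE, except that a DNN is used here.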
SEE: Towards Semi-Supervised End-to-End Scene Text Recognition
A generative vision model that trains with high data efficiency and breaks text-based CAPTCHAs
EXTENDING THE PAGE SEGMENTATION ALGORITHMS OF THE OCROPUS DOCUMENTATION LAYOUT ANALYSIS SYSTEM
Text Recognition in Scene Image and Video Frame using Color Channel Selection
Scene Text Detection via Holistic, Multi-Channel Prediction
Getting started with the Tesseract-OCR engine
Blog post 1: Training an Ocropus OCR model
Blog post 2: Extracting text from an image using Ocropus
Blog post 3: Working with Ground Truth
Blog post 4: Finding blocks of text in an image using Python, OpenCV and numpy
Applying OCR Technology for Receipt Recognition
Writing a Fuzzy Receipt Parser in Python
Number plate recognition with Tensorflow
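End-to-end recognition without character segmentation in license plate recognition
End-to-end OCR: a CNN-based implementation
Tencent OCR: automatic recognition technology, seeking the true face of text
CAPTCHA recognition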
Bank check OCR with OpenCV and Python (Part I)
Common Sense, Cortex, and CAPTCHA
Project Naptha: highlight, copy, search, edit and translate text in any image
ABBYY
OCR.space is a service of a9t9 software GmbH. The goal of OCR.space is to bring fresh ideas, methods and products to the OCR community.
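
The "Unsupervised Deep Embedding for Clustering Analysis" entry above describes clustering by minimizing the KL divergence between soft assignments and an auxiliary distribution. The following is a minimal sketch of that objective in plain Python, not the code of any listed repository. It assumes the points have already been embedded (the DNN encoder is omitted), and the function names, the sample points, and the centroids are illustrative only.

```python
import math

def soft_assign(z, centroids, alpha=1.0):
    """Soft assignment q[i][j] of embedded point i to cluster j,
    using a Student's t kernel on squared distance (alpha = degrees of freedom)."""
    q = []
    for zi in z:
        row = []
        for mu in centroids:
            d2 = sum((a - b) ** 2 for a, b in zip(zi, mu))
            row.append((1.0 + d2 / alpha) ** (-(alpha + 1.0) / 2.0))
        total = sum(row)
        q.append([v / total for v in row])
    return q

def target_distribution(q):
    """Auxiliary distribution p[i][j]: square q, divide by the soft
    cluster frequency, then renormalize each row."""
    k = len(q[0])
    freq = [sum(row[j] for row in q) for j in range(k)]
    p = []
    for row in q:
        w = [row[j] ** 2 / freq[j] for j in range(k)]
        total = sum(w)
        p.append([v / total for v in w])
    return p

def kl_divergence(p, q):
    """KL(P || Q) summed over all points and clusters."""
    return sum(
        pij * math.log(pij / qij)
        for prow, qrow in zip(p, q)
        for pij, qij in zip(prow, qrow)
        if pij > 0.0
    )

# Hypothetical embedded points (two obvious groups) and initial centroids.
z = [[0.0, 0.1], [0.2, 0.0], [3.0, 3.1], [2.9, 3.0]]
centroids = [[0.0, 0.0], [3.0, 3.0]]

q = soft_assign(z, centroids)
p = target_distribution(q)
loss = kl_divergence(p, q)
```

In a full training loop the loss would be minimized with respect to both the encoder weights and the centroids, with the auxiliary distribution recomputed periodically; that part is left out of this sketch.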

Programming Languages

Python (15), C++ (9), Jupyter Notebook (6), JavaScript (3), Go (2), Shell (2), Lua (2), C (1), C# (1), Cuda (1)

Keywords

ocr (10), text-detection (4), tesseract-ocr (4), machine-learning (3), computer-vision (3), deep-learning (3), robust-reading (2), lstm (2), tesseract (2), text-recognition (2), opencv (2), python (2), end-to-end (2), semi-supervised-learning (2), api-server (1), api (1), tesseract-js (1), ocr-processing (1), nextcloud-ocr (1), curl (1), docker (1), go (1), heroku (1), ocr-server (1), natural-images (1), ocr-engine (1), r (1), r-package (1), rstats (1), captcha (1), ctc (1), ctc-loss (1), gru (1), neural-network (1), recurrent-neural-networks (1), rnn (1), rnn-ctc (1), speech-recognition (1), speech-to-text (1), theano (1), nextcloud (1), nextcloud-app (1), yolov2 (1), ocr-service (1), tesseract-ocr-api (1), iccv-17 (1), scene-text (1), sstd (1), golang-library (1), chainer (1)