Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/bgshih/aster
Recognizing cropped text in natural images.
computer-vision ocr recognition scene-text
- Host: GitHub
- URL: https://github.com/bgshih/aster
- Owner: bgshih
- License: mit
- Created: 2017-12-03T08:50:02.000Z (almost 7 years ago)
- Default Branch: master
- Last Pushed: 2023-03-25T00:44:11.000Z (over 1 year ago)
- Last Synced: 2024-11-03T10:32:06.503Z (10 days ago)
- Topics: computer-vision, ocr, recognition, scene-text
- Language: Python
- Size: 358 KB
- Stars: 726
- Watchers: 21
- Forks: 195
- Open Issues: 82
Metadata Files:
- Readme: README.md
- License: LICENSE
# ASTER: Attentional Scene Text Recognizer with Flexible Rectification
ASTER is an accurate scene text recognizer with a flexible rectification mechanism. The research paper can be found [here](https://ieeexplore.ieee.org/abstract/document/8395027/).
![ASTER Overview](overview.png)
The implementation of ASTER reuses code from the [TensorFlow Object Detection API](https://github.com/tensorflow/models/tree/master/research/object_detection).
## Update
**[07/13/2019] A PyTorch [port](https://github.com/ayumiymk/aster.pytorch) has been made by [@ayumiymk](https://github.com/ayumiymk).**

## Correction (10/22/2018)
We identified a bug in the code that caused only part of the SVT images to be tested, which inflated the reported results. The bug has been fixed in commit [a7e8613](https://github.com/bgshih/aster/commit/a7e8613d6308e5a7aacb1237dfa0286d73cef342). Below are the corrected numbers on SVT. The results are still state-of-the-art, so the conclusions are not affected.
- SVT (50): ASTER: 97.4%; ASTER-A: 96.3%; ASTER-B: 96.1%
- SVT (None): ASTER: 89.5%; ASTER-A: 80.2%; ASTER-B: 81.6%

## Prerequisites
ASTER was developed and tested with **TensorFlow r1.4**. Higher versions may not work.
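A small check along the following lines can flag an untested TensorFlow version before anything else fails. This is only a sketch: `is_tested_tf_version` and `installed_tf_version` are hypothetical helpers that are not part of ASTER, and the check assumes that "r1.4" covers any 1.4.x release.

```python
def is_tested_tf_version(version, tested=(1, 4)):
    """Return True if `version` (e.g. "1.4.1") has the tested major.minor pair."""
    parts = version.split(".")
    try:
        major_minor = (int(parts[0]), int(parts[1]))
    except (IndexError, ValueError):
        # Version strings that cannot be parsed are treated as untested.
        return False
    return major_minor == tested


def installed_tf_version():
    """Return the installed TensorFlow version string, or None if it is absent."""
    try:
        import tensorflow as tf  # imported lazily so the check works without it
    except ImportError:
        return None
    return tf.__version__
```

Calling `is_tested_tf_version(installed_tf_version() or "")` at the top of a script would then report whether the environment matches the version ASTER was tested with.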
ASTER requires [Protocol Buffers](https://github.com/google/protobuf) (version >= 2.6). In addition, on Ubuntu 16.04:
```
sudo apt install cmake libcupti-dev
pip3 install --user protobuf tqdm numpy editdistance
```

## Installation
1. Go to `c_ops/` and run `build.sh` to build the custom operators
2. Execute `protoc aster/protos/*.proto --python_out=.` to build the protobuf files
3. Add `/path/to/aster` to `PYTHONPATH`, or set this variable for every run

## Demo
A demo program is located at `aster/demo.py`, accompanied by pretrained model files available on our [release page](https://github.com/bgshih/aster/releases). Download `model-demo.zip` and extract it under `aster/experiments/demo/` before running the demo.
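Before running the demo, it can help to confirm that the protobuf files were generated and that the model files were extracted. The sketch below is an assumption-laden illustration, not part of ASTER: `missing_generated_modules` and `demo_files_present` are hypothetical helpers, and both directories are passed in explicitly because their location depends on where the repository was cloned. It relies only on the fact that `protoc --python_out` writes a `<name>_pb2.py` file for each `<name>.proto`.

```python
from pathlib import Path


def missing_generated_modules(protos_dir):
    """List the `*_pb2.py` files that protoc should have produced but are absent."""
    protos_dir = Path(protos_dir)
    missing = []
    for proto in sorted(protos_dir.glob("*.proto")):
        # protoc names the Python module after the .proto file, with a _pb2 suffix.
        generated = protos_dir / (proto.stem.replace("-", "_") + "_pb2.py")
        if not generated.is_file():
            missing.append(generated.name)
    return missing


def demo_files_present(demo_dir):
    """Return True if the demo directory exists and contains at least one entry."""
    demo_dir = Path(demo_dir)
    return demo_dir.is_dir() and any(demo_dir.iterdir())
```

An empty list from `missing_generated_modules` suggests the `protoc` step completed, and `demo_files_present` returning `True` suggests `model-demo.zip` was extracted into the expected directory. Neither check validates the contents of the files.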
To run the demo, simply execute:
```
python3 aster/demo.py
```
This will output the recognition result of the demo image and the rectified image.
## Training and on-the-fly evaluation
Data preparation scripts for several popular scene text datasets are located under `aster/tools`. See their source code for usage.
To run the example training, execute:
```
python3 aster/train.py \
--exp_dir experiments/demo \
--num_clones 2
```
Change the configuration in `experiments/aster/trainval.prototxt` to configure your own training process.
During training, you can run a separate program that repeatedly evaluates the produced checkpoints:
```
python3 aster/eval.py \
--exp_dir experiments/demo
```
Evaluation configuration is also in `trainval.prototxt`.
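The two commands above can also be driven from a single Python script. The following is a minimal sketch under stated assumptions: `build_commands` and `run_training_with_eval` are hypothetical helpers that are not shipped with ASTER, the script is assumed to be started from the same working directory as the commands above, and the evaluator is assumed to keep running until it is stopped.

```python
import subprocess
import sys


def build_commands(exp_dir="experiments/demo", num_clones=2):
    """Build the argument lists for the training and evaluation commands."""
    train_cmd = [sys.executable, "aster/train.py",
                 "--exp_dir", exp_dir,
                 "--num_clones", str(num_clones)]
    eval_cmd = [sys.executable, "aster/eval.py", "--exp_dir", exp_dir]
    return train_cmd, eval_cmd


def run_training_with_eval(exp_dir="experiments/demo", num_clones=2):
    """Run training in the foreground with the evaluator as a side process."""
    train_cmd, eval_cmd = build_commands(exp_dir, num_clones)
    evaluator = subprocess.Popen(eval_cmd)  # evaluates checkpoints as they appear
    try:
        return subprocess.run(train_cmd).returncode
    finally:
        # Stop the evaluator once training ends (assumes it does not exit itself).
        evaluator.terminate()
        evaluator.wait()
```

Keeping the command construction in its own function makes it easy to inspect or log the exact arguments before any process is started.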
## Citation
If you find this project helpful for your research, please cite the following papers:
```
@article{bshi2018aster,
author = {Baoguang Shi and
Mingkun Yang and
Xinggang Wang and
Pengyuan Lyu and
Cong Yao and
Xiang Bai},
title = {ASTER: An Attentional Scene Text Recognizer with Flexible Rectification},
journal = {IEEE Transactions on Pattern Analysis and Machine Intelligence},
volume = {},
number = {},
pages = {1-1},
year = {2018},
}

@inproceedings{ShiWLYB16,
author = {Baoguang Shi and
Xinggang Wang and
Pengyuan Lyu and
Cong Yao and
Xiang Bai},
title = {Robust Scene Text Recognition with Automatic Rectification},
booktitle = {2016 {IEEE} Conference on Computer Vision and Pattern Recognition,
{CVPR} 2016, Las Vegas, NV, USA, June 27-30, 2016},
pages = {4168--4176},
year = {2016}
}
```

IMPORTANT NOTICE: Although this software is licensed under MIT, our intention is to make it free for academic research purposes. If you are going to use it in a product, we suggest you [contact us]([email protected]) regarding possible patent issues.