Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/Megvii-CSG/MegReader
A research project for text detection and recognition using PyTorch 1.2.
ctc deep-learning ocr pytorch text-detection text-detection-recognition text-recognition
- Host: GitHub
- URL: https://github.com/Megvii-CSG/MegReader
- Owner: Megvii-CSG
- Archived: true
- Created: 2019-09-28T01:09:23.000Z (about 5 years ago)
- Default Branch: master
- Last Pushed: 2019-12-24T03:56:39.000Z (almost 5 years ago)
- Last Synced: 2024-08-01T13:32:10.712Z (3 months ago)
- Topics: ctc, deep-learning, ocr, pytorch, text-detection, text-detection-recognition, text-recognition
- Language: Python
- Homepage:
- Size: 306 KB
- Stars: 348
- Watchers: 27
- Forks: 67
- Open Issues: 11
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
# MegReader
A project for research in text detection and recognition using PyTorch 1.2.

This project originated from the research repo of the CSG-Algorithm team of Megvii (https://megvii.com), which relies heavily on closed-source libraries.
We are gradually transferring models into this repo; the released implementations are listed in [Progress](#progress).

## Highlights
- Implementations of representative text detection and recognition methods.
- An effective framework for conducting experiments: experiments are configured with YAML files, which makes them convenient to set up and run.
- Thorough logging features which make it easy to follow and analyze experimental results.
- CPU/GPU compatible for training and inference.
- Distributed training support.

## Install
### Requirements
`pip install -r requirements.txt`
- Python 3.7
- PyTorch 1.2 and CUDA 10.0
- gcc 5.5 (important for compiling)

### Compile CUDA ops (if needed)
```
cd PATH_TO_OPS
python setup.py build_ext --inplace
```
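Before compiling, it can help to confirm that the local toolchain matches the requirements listed above. The following is a minimal sketch, not part of MegReader: the helper name `check_build_environment` is hypothetical, and it only reports versions without enforcing them.

```python
import importlib.util
import shutil
import subprocess
import sys


def check_build_environment():
    """Collect the versions relevant to compiling the CUDA ops (hypothetical helper)."""
    report = {"python": "{}.{}".format(*sys.version_info[:2])}

    # gcc is needed to build the extensions; record its version if it is on PATH.
    gcc = shutil.which("gcc")
    if gcc is None:
        report["gcc"] = None
    else:
        result = subprocess.run(
            [gcc, "-dumpversion"],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
        )
        report["gcc"] = result.stdout.strip() or None

    # Import torch only if it is installed, so the check itself never fails.
    if importlib.util.find_spec("torch") is None:
        report["torch"] = None
        report["cuda"] = None
    else:
        import torch

        report["torch"] = torch.__version__
        report["cuda"] = torch.version.cuda  # None for CPU-only builds
    return report


if __name__ == "__main__":
    for name, version in check_build_environment().items():
        print("{}: {}".format(name, version))
```

Compare the printed values against Python 3.7, PyTorch 1.2, CUDA 10.0, and gcc 5.5 before running `setup.py`.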
Ops that may need to be compiled:
- DeformableConvV2 `assets/ops/dcn`
- CTC2DLoss `ops/ctc_2d`

### Configuration (optional)
Edit configurations in `config.py`.
## Training
See detailed options: `python3 train.py --help`
## Datasets
We provide a data loading implementation for annotations packed as JSON, for a quick start.
Data in LMDB format is now supported as well.
For usage, refer to the [demo](experiments/recognition/crnn-lmdb.yaml).
The datasets used in our recognition experiments can be downloaded from [OneDrive](https://megvii-my.sharepoint.cn/:f:/g/personal/wanzhaoyi_megvii_com/EjkcrpmiW6hJrUKY-0fEBRABvNMtYniUPfWLVptMmy9-6w?e=bJaYFo). A conversion [script](scripts/json_to_lmdb.py) is provided to convert JSON-format data to LMDB.

### Non-distributed
`python3 train.py PATH_TO_EXPERIMENT.yaml --validate --visualize --name NAME_OF_EXPERIMENT`
Below are the configurations of some of the released recognition models:
- CRNN: `experiments/recognition/crnn.yaml`
- 2D CTC: `experiments/recognition/res50-ppm-2d-ctc.yaml`
- Attention Decoder: `experiments/recognition/fpn50-attention-decoder.yaml`

### Distributed (recommended for multi-GPU training)
`python3 -m torch.distributed.launch --nproc_per_node=NUM_GPUS train.py PATH_TO_EXPERIMENT.yaml -d --validate`
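For context, `torch.distributed.launch` starts one process per GPU, appends a `--local_rank` argument to each script invocation, and sets `MASTER_ADDR`, `MASTER_PORT`, `WORLD_SIZE`, and `RANK` in the environment. The sketch below shows how a training script could read those values. It is an illustration only, not MegReader's actual `train.py`, and the function name `read_launch_settings` is hypothetical.

```python
import argparse
import os


def read_launch_settings(argv=None, environ=None):
    """Read the values torch.distributed.launch hands to each worker (hypothetical helper)."""
    environ = os.environ if environ is None else environ

    parser = argparse.ArgumentParser()
    # The launcher appends --local_rank=<index> to the script arguments.
    parser.add_argument("--local_rank", type=int, default=0)
    # Ignore the script's other arguments (experiment file, -d, --validate, ...).
    args, _unknown = parser.parse_known_args(argv)

    return {
        "local_rank": args.local_rank,
        # Fall back to single-process defaults when run without the launcher.
        "rank": int(environ.get("RANK", 0)),
        "world_size": int(environ.get("WORLD_SIZE", 1)),
        "master_addr": environ.get("MASTER_ADDR", "127.0.0.1"),
        "master_port": int(environ.get("MASTER_PORT", 29500)),
    }


if __name__ == "__main__":
    print(read_launch_settings())
```

A script would typically use `local_rank` to pick its GPU and pass the remaining values to the process group initialization.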
## Evaluating
See detailed options: `python3 eval.py --help`.
Keep-ratio testing is recommended: `python3 eval.py PATH_TO_EXPERIMENT.yaml --resize_mode keep_ratio`
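To illustrate the general idea behind keep-ratio resizing, the sketch below scales an image to a fixed height and derives the width from the original aspect ratio, rather than stretching every image to one fixed size. This is an assumption about what such a mode usually does, not MegReader's implementation; the function name `keep_ratio_size` and the default height of 32 are hypothetical.

```python
def keep_ratio_size(width, height, target_height=32):
    """Return (new_width, new_height) that preserves the aspect ratio.

    Hypothetical helper; target_height=32 is an illustrative default only.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    # Scale the width by the same factor applied to the height.
    new_width = max(1, round(width * target_height / height))
    return new_width, target_height


if __name__ == "__main__":
    # A 100x50 crop keeps its 2:1 ratio when scaled to height 32.
    print(keep_ratio_size(100, 50))
```

Wide text lines therefore stay wide after resizing, which avoids squeezing long words into a narrow input.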
### Model zoo
Trained models are coming soon.

## Progress
### Recognition Methods
- [x] 2D CTC
- [x] CRNN
- [x] Attention Decoder
- [ ] Rectification

### Detection Methods
- [x] Text Snake
- [x] EAST

### End-to-end
- [ ] Mask Text Spotter

## Contributing
[Contributing.md](CONTRIBUTING.md)