https://github.com/jyotidabass/document_text_recognition
- Host: GitHub
- URL: https://github.com/jyotidabass/document_text_recognition
- Owner: jyotidabass
- License: apache-2.0
- Created: 2022-03-09T11:02:36.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2022-03-09T11:12:43.000Z (over 2 years ago)
- Last Synced: 2024-06-28T11:32:55.251Z (5 months ago)
- Language: Python
- Size: 195 KB
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: changelog.rst
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
README
# Text recognition
This sample training script trains a text recognition model with docTR.
## Setup
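First, install `doctr` (with pip, for instance). The commands below perform an editable install of the current directory, so run them from the root of a local clone of the repository: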
```shell
pip install -e . --upgrade
pip install -r references/requirements.txt
```
## Usage
You can start your training in TensorFlow:
```shell
python references/recognition/train_tensorflow.py crnn_vgg16_bn --train_path path/to/your/train_set --val_path path/to/your/val_set --epochs 5
```
or PyTorch:
```shell
python references/recognition/train_pytorch.py crnn_vgg16_bn --train_path path/to/your/train_set --val_path path/to/your/val_set --epochs 5 --device 0
```
## Data format
You need to provide both `train_path` and `val_path` arguments to start training.
Each of these paths must point to a folder containing two elements, an `images` directory and a `labels.json` file:
```shell
├── images
│   ├── img_1.jpg
│   ├── img_2.jpg
│   ├── img_3.jpg
│   └── ...
└── labels.json
```
The JSON file must map each image file name to its word label, given as a string.
The order of entries in the JSON file does not matter.
```json
{
    "img_1.jpg": "I",
    "img_2.jpg": "am",
    "img_3.jpg": "a",
    "img_4.jpg": "Jedi",
    "img_5.jpg": "!",
    ...
}
```
## Advanced options
Feel free to inspect the script's many options to customize training to your own needs:
```shell
python references/recognition/train_pytorch.py --help
```
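Before launching a training run, it can help to confirm that a dataset folder matches the layout described in the Data format section. The following is a minimal sketch, not part of docTR: `check_dataset` is a hypothetical helper name, and treating an image without a label entry as a problem is an assumption you may want to relax.

```python
import json
import sys
from pathlib import Path


def check_dataset(root):
    """Return a list of problems found in a dataset folder (empty if none)."""
    root = Path(root)
    images_dir = root / "images"
    labels_file = root / "labels.json"
    problems = []

    # The folder must contain exactly these two elements.
    if not images_dir.is_dir():
        problems.append(f"missing folder: {images_dir}")
    if not labels_file.is_file():
        problems.append(f"missing file: {labels_file}")
    if problems:
        return problems

    with labels_file.open(encoding="utf-8") as f:
        labels = json.load(f)
    if not isinstance(labels, dict):
        return ["labels.json must contain a JSON object"]

    # Every entry needs a string label and a matching image file.
    for name, word in labels.items():
        if not isinstance(word, str):
            problems.append(f"label for {name} is not a string")
        if not (images_dir / name).is_file():
            problems.append(f"no image file for label entry {name}")

    # Assumption: images without a label entry are reported as well.
    labelled = set(labels)
    for img in sorted(images_dir.iterdir()):
        if img.is_file() and img.name not in labelled:
            problems.append(f"no label entry for image {img.name}")
    return problems


if __name__ == "__main__":
    # Usage: python check_dataset.py path/to/your/train_set path/to/your/val_set
    for path in sys.argv[1:]:
        issues = check_dataset(path)
        print(path, "OK" if not issues else "has problems:")
        for issue in issues:
            print("  " + issue)
```

Running it on both `train_path` and `val_path` surfaces missing files or non-string labels before any training time is spent.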