# Bi-STET
This is the repository for 'Bidirectional Scene Text Recognition with a Single Decoder' by Maurits Bleeker and Maarten de Rijke [[pdf](https://arxiv.org/pdf/1912.03656.pdf)].
The base source code for this project comes from: http://nlp.seas.harvard.edu/2018/04/03/attention.html
I have tried to keep the code as general as possible; however, some elements of the pipeline are specific to the environment I worked with.
## Model weights and reproducibility
To reproduce the results of the paper, please use the final model parameters, available here:
https://drive.google.com/file/d/1OwJ3iVpRhnjIZyOi7aOQIeLv7N1DHZkC/view?usp=sharing
The folder data_utils/ contains all the scripts used to generate the train and test sets for this paper.
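As a quick sanity check after downloading, the parameter file can be inspected with ```torch.load```. This is only a sketch: the file name below is an assumption, and the exact contents of the checkpoint (plain state dict or a wrapper dict) depend on how it was saved; the loading code in this repository is the authoritative reference.
```python
import torch

# Minimal sketch, not the repository's own loading code.
# "bi_stet_final.pth" is an assumed name for the file downloaded from the
# Google Drive link above; rename it to match the actual download.
checkpoint = torch.load("bi_stet_final.pth", map_location="cpu")

# If the checkpoint is a plain state dict, its keys are parameter names.
if isinstance(checkpoint, dict):
    print(f"Loaded {len(checkpoint)} top-level entries")
    for name in list(checkpoint)[:5]:
        print(name)
```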
# Python and package versions
* Python 3.7
* Pillow 5.4.1
* nltk 3.4.5
* numpy 1.17.1
* scipy 1.2.0
* seaborn 0.9.0
* tensorboard-logger 0.1.0
* tensorboardX 1.7
* torch 1.1.0.post2
* torchvision 0.2.1
* transformers 2.1.1

# Run
To run the code, simply run ```main.py``` and set all configurations in ```Config.py```. The configurations used to reproduce the paper's results are already set in ```Config.py```.
# Training
There are two options to load the training/test data:
- From disk. This can be done by using the annotation file(s).
- From a pickle file. The pickle file should contain a Python dict in the following format (a creation sketch is given after this list):
  ```
  {
      image_id : {
          'data' : 'binary image string',
          'label' : 'word'
      }
  }
  ```
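The scripts in data_utils/ generate these pickles for the datasets used in the paper. Purely as an illustration of the expected structure, here is a minimal sketch that packs a folder of images and an annotation file into this format; the function name ```build_pickle``` and the example paths are hypothetical and not part of this repository.
```python
import pickle
from pathlib import Path


def build_pickle(image_root, annotation_file, out_path):
    """Pack images and labels into the dict format shown above.

    Illustrative sketch: `build_pickle` and its arguments are not part of
    this repository; the real pipeline lives in data_utils/.
    """
    dataset = {}
    root = Path(image_root)
    with open(annotation_file, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            rel_path, label = line.split(maxsplit=1)
            with open(root / rel_path, "rb") as img_file:
                dataset[rel_path] = {          # the relative path doubles as image_id
                    "data": img_file.read(),   # binary image string
                    "label": label,            # ground-truth word
                }
    with open(out_path, "wb") as f:
        pickle.dump(dataset, f)


if __name__ == "__main__":
    build_pickle("data/IIITK", "data/IIITK/annotation.txt", "iiitk_train.pkl")
```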
## Test and train annotations
The annotation text files are formatted as 'path/to/image.jpg annotation'. The path to the image is always relative to a root folder.
Example root folder: User/Documents/Project/data/IIITK/
In User/Documents/Project/data/IIITK/, we have an annotation.txt and the images.
An example of the annotation file:
```
test/1002_1.png private
```
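Below is a minimal sketch of how such an annotation file can be read, assuming the 'relative path + label' line format above; the function name ```read_annotations``` and the use of Pillow for decoding are illustrative, not taken from this repository's loader.
```python
from pathlib import Path

from PIL import Image


def read_annotations(root, annotation_file="annotation.txt"):
    """Yield (image, label) pairs from an annotation file under `root`.

    Illustrative sketch: each line is expected to look like
    'test/1002_1.png private', with the image path relative to `root`.
    """
    root = Path(root)
    with open(root / annotation_file, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            rel_path, label = line.split(maxsplit=1)
            image = Image.open(root / rel_path).convert("RGB")
            yield image, label


# Example usage with the root folder from above:
# for image, label in read_annotations("User/Documents/Project/data/IIITK"):
#     print(image.size, label)
```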
# Data processing
All the files needed to process the originally provided training datasets are given in data_utils/.
# Reference
If you found this code useful, please cite the following paper:
```
@article{bleeker2019bidirectional,
  title={Bidirectional Scene Text Recognition with a Single Decoder},
  author={Bleeker, Maurits and de Rijke, Maarten},
  journal={arXiv preprint arXiv:1912.03656},
  year={2019}
}
```