Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/rakeshvar/rnn_ctc
Recurrent Neural Network and Long Short Term Memory (LSTM) with Connectionist Temporal Classification implemented in Theano. Includes a Toy training example.
https://github.com/rakeshvar/rnn_ctc
captcha ctc ctc-loss deep-learning gru lstm neural-network ocr python recurrent-neural-networks rnn rnn-ctc speech-recognition speech-to-text theano
Last synced: 3 months ago
JSON representation
Recurrent Neural Network and Long Short Term Memory (LSTM) with Connectionist Temporal Classification implemented in Theano. Includes a Toy training example.
- Host: GitHub
- URL: https://github.com/rakeshvar/rnn_ctc
- Owner: rakeshvar
- License: apache-2.0
- Created: 2014-12-12T06:35:10.000Z (about 10 years ago)
- Default Branch: master
- Last Pushed: 2016-07-26T04:48:11.000Z (over 8 years ago)
- Last Synced: 2024-08-03T04:05:45.119Z (6 months ago)
- Topics: captcha, ctc, ctc-loss, deep-learning, gru, lstm, neural-network, ocr, python, recurrent-neural-networks, rnn, rnn-ctc, speech-recognition, speech-to-text, theano
- Language: Python
- Homepage:
- Size: 604 KB
- Stars: 220
- Watchers: 27
- Forks: 82
- Open Issues: 1
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-ocr - Recurrent Neural Network and Long Short Term Memory (LSTM) with Connectionist Temporal Classification implemented in Theano. Includes a Toy training example.
README
# RNN CTC
Recurrent Neural Network with Connectionist Temporal Classifical implemented
in Theano. Includes toy training examples.## Use
The goal of this problem is to train a Neural Network (with recurrent connections) to learn to read
sequences. As a part of the training we show it a series of such sequences (tablets of text in
our examples) and also tell it what the tablet contains (the labels of the written characters).
## MethodologyWe need to keep feeding our RNN the samples of text in two forms (written and labelled). If you
have your own written samples you can train our system the **offline** way. If you have a
*scribe* that can generate samples as you go, you can train one sample at a time,
the **online** way.## Specifying parameters
You will need to specify a lot of parameters. Here is a overview. The file `configs/default.ast`
has all the parameters specified (as a python dictionary), so compare that with these instructions.* Data Generation (cf. `configs/alphabets.ast`)
* Scribe (The class that generates the samples)
* `alphabet`: 'ascii_alphabet' (0-9a-zA-Z etc.) or 'hindu_alphabet' (0-9 hindu numerals)
* `noise`: Amount of noise in the image
* `vbuffer`, `hbuffer`: horizontal and vertical buffers
* `avg_seq_len`: Average length of the tablet
* `varying_len`: (bool) Make the length random
* `nchars_per_sample`: This will make each tablet have the same number of characters. This
over-rides `avg_seq_len`.
* `num_samples`* Training (cf. `configs/default.ast`)
* `num_epochs`
* Offline case: Goes over the same data `num_epochs` times.
* Online case: Each epoch has different data, resulting in generating a total of
`num_epochs * num_samples` unique data samples!
* `train_on_fraction`
* Offline case: Fraction of samples that are used as training data
* Neural Network (cf. `configs/midlayer.ast` and `configs/optimizers.ast`)
* `use_log_space`: Perform calculations via the logarithms of probabilities.
* `mid_layer`: The middle layer to be used. See the `nnet/layers` module for all the options you have.
* `mid_layer_args`: The arguments needed for the middle layer. Depends on the `mid_layer`.
See the constructor of the corresponding `mid_layer` class.
* `optimizer`: The optimization algorithm to be used. `sgd`, `adagrad`, `rmsprop`,
`adadelta` etc.
* `optimzier_args`: The arguments that the optimizer needs. See the corresponding function in
the file `nnet/updates.py`.
Note: This should **not** contain the learning rate.
* `learning_rate_args`:
* `initial_rate`: Initial learning rate.
* `anneal`:
* `constant`: Learning rate will be kept constant
* `inverse`: Will decay as the inverse of the epoch.
* `inverse_sqrt`: Will decay as the inverse of the square root of the epoch.
* `epochs_to_half`: Rate at which the learning_rate is annealed. Higher number means
slower rate.## Usage
### Offline Training
For this you need to generate data first and then train it using `train_offline.py`.##### Generate Data
You can use *hindu numerals* or the entire *ascii* set, specified via an ast file.```sh
python3 gen_data.py [config=configs/default.ast]*
```##### Train Network
You can train on the generated pickle file as:```sh
python3 train_offline.py data.pkl [config=configs/default.ast]*
```### Online Training
You can generate and train simultaneously as:```sh
python3 train_online.py [config=configs/default.ast]*
```## Examples
All the programs mentioned above can take multiple config files, later files override former ones.
`configs/default.ast` is loaded by default.### Offline
```sh
# First generate the ast files based on given examples then...
python3 gen_data.py hindu_avg_len_60.py configs/hindu.ast configs/len_60.ast
python3 train_offline.py hindu_3chars.py configs/adagrad.ast configs/bilstm.ast configs/ilr.01.ast
```### Online
```sh
python3 train_online.py configs/hindu.ast configs/adagrad.ast configs/bilstm.ast configs/ilr.01.ast
```### Working Example
```sh
# Offline
python3 gen_data.py hindu3.py configs/working_eg.ast
python3 train_offline.py hindu3.py configs/working_eg.ast
# Online
python3 train_online.py configs/working_eg.ast
```#Offline
## Sample Output
```
# Using data from scribe.py hindu
Shown : 0 2 2 5
Seen : 0 2 2 5
Images (Shown & Seen) :0¦ ¦
1¦ ██ ██ ¦
2¦ █ ██ ████ ¦
3¦ █ █ █ ¦
4¦ ██ █ █ ███ ¦
5¦ █ █████████ █ ¦
6¦ █ █ █ █ ¦
7¦ ██ ███ ¦
0¦░░░░░░░░░█░░░░░░░░░░░░░░░░░░¦
1¦░░░░░░░░░░░░░░░░░░░░░░░░░░░░¦
2¦░░░░░░░░░░░░░█░░░█░░░░░░░░░░¦
3¦░░░░░░░░░░░░░░░░░░░░░░░░░░░░¦
4¦░░░░░░░░░░░░░░░░░░░░░░░░░░░░¦
5¦░░░░░░░░░░░░░░░░░░░█▓░░░░░░░¦
6¦█████████░███░███░█░▒███████¦```
## References
* Graves, Alex. **Supervised Sequence Labelling with Recurrent Neural Networks.** Chapters 2, 3, 7 and 9.
* Available at [Springer](http://www.springer.com/engineering/computational+intelligence+and+complexity/book/978-3-642-24796-5)
* [University Edition](http://link.springer.com/book/10.1007%2F978-3-642-24797-2) via. Springer Link.
* Free [Preprint](http://www.cs.toronto.edu/~graves/preprint.pdf)## Credits
* Theano implementation of CTC by [Shawn Tan](https://github.com/shawntan/theano-ctc/)
* Updates.py from [Lasagne](http://lasagne.readthedocs.org/en/latest/modules/updates.html)## Dependencies
* Numpy
* TheanoCan easily port to python2 by adding lines like these where necessary. In the interest of the
future generations, we highly recommend you do not do that.
``` python
from __future__ import print_function
```