{"id":13655826,"url":"https://github.com/rakeshvar/rnn_ctc","last_synced_at":"2025-04-23T17:30:40.920Z","repository":{"id":24503043,"uuid":"27908518","full_name":"rakeshvar/rnn_ctc","owner":"rakeshvar","description":"Recurrent Neural Network and Long Short Term Memory (LSTM) with Connectionist Temporal Classification implemented in Theano. Includes a Toy training example.","archived":false,"fork":false,"pushed_at":"2016-07-26T04:48:11.000Z","size":618,"stargazers_count":220,"open_issues_count":1,"forks_count":82,"subscribers_count":27,"default_branch":"master","last_synced_at":"2024-08-03T04:05:45.119Z","etag":null,"topics":["captcha","ctc","ctc-loss","deep-learning","gru","lstm","neural-network","ocr","python","recurrent-neural-networks","rnn","rnn-ctc","speech-recognition","speech-to-text","theano"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/rakeshvar.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2014-12-12T06:35:10.000Z","updated_at":"2024-01-04T15:58:26.000Z","dependencies_parsed_at":"2022-08-23T00:00:52.407Z","dependency_job_id":null,"html_url":"https://github.com/rakeshvar/rnn_ctc","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rakeshvar%2Frnn_ctc","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rakeshvar%2Frnn_ctc/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rakeshvar%2Frnn_ctc/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositorie
s/rakeshvar%2Frnn_ctc/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/rakeshvar","download_url":"https://codeload.github.com/rakeshvar/rnn_ctc/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":223931580,"owners_count":17227256,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["captcha","ctc","ctc-loss","deep-learning","gru","lstm","neural-network","ocr","python","recurrent-neural-networks","rnn","rnn-ctc","speech-recognition","speech-to-text","theano"],"created_at":"2024-08-02T04:00:38.302Z","updated_at":"2024-11-10T08:30:36.461Z","avatar_url":"https://github.com/rakeshvar.png","language":"Python","funding_links":[],"categories":["Librarys"],"sub_categories":[],"readme":"# RNN CTC\n\nRecurrent Neural Network with Connectionist Temporal Classification implemented \nin Theano. Includes toy training examples.\n\n## Use\n\nThe goal of this problem is to train a Neural Network (with recurrent connections) to learn to read \nsequences. As a part of the training we show it a series of such sequences (tablets of text in \nour examples) and also tell it what the tablet contains (the labels of the written characters). \n \n## Methodology\n\nWe need to keep feeding our RNN the samples of text in two forms (written and labelled). If you \nhave your own written samples you can train our system the **offline** way. If you have a \n*scribe* that can generate samples as you go, you can train one sample at a time, \nthe **online** way. 
\n\n## Specifying parameters\n\nYou will need to specify a lot of parameters. Here is an overview. The file `configs/default.ast` \nhas all the parameters specified (as a Python dictionary), so compare that with these instructions.\n\n* Data Generation (cf. `configs/alphabets.ast`)\n    * Scribe (The class that generates the samples)\n        * `alphabet`: 'ascii_alphabet' (0-9a-zA-Z etc.) or 'hindu_alphabet' (0-9 hindu numerals)\n        * `noise`: Amount of noise in the image\n        * `vbuffer`, `hbuffer`: vertical and horizontal buffers\n        * `avg_seq_len`: Average length of the tablet  \n        * `varying_len`: (bool) Make the length random\n        * `nchars_per_sample`: This will make each tablet have the same number of characters. This \n        overrides `avg_seq_len`.\n    * `num_samples`\n\n* Training (cf. `configs/default.ast`)\n    * `num_epochs`\n        * Offline case: Goes over the same data `num_epochs` times.\n        * Online case: Each epoch has different data, resulting in a total of \n        `num_epochs * num_samples` unique data samples!\n    * `train_on_fraction`\n        * Offline case: Fraction of samples that are used as training data\n        \n* Neural Network (cf. `configs/midlayer.ast` and `configs/optimizers.ast`)\n    * `use_log_space`: Perform calculations via the logarithms of probabilities.\n    * `mid_layer`: The middle layer to be used. See the `nnet/layers` module for all the options you have.\n    * `mid_layer_args`: The arguments needed for the middle layer. Depends on the `mid_layer`. \n    See the constructor of the corresponding `mid_layer` class. \n    * `optimizer`: The optimization algorithm to be used. `sgd`, `adagrad`, `rmsprop`, \n    `adadelta` etc. \n    * `optimzier_args`: The arguments that the optimizer needs. See the corresponding function in\n     the file `nnet/updates.py`. 
\n        Note: This should **not** contain the learning rate.\n    * `learning_rate_args`: \n        * `initial_rate`: Initial learning rate.\n        * `anneal`: \n            * `constant`: Learning rate will be kept constant\n            * `inverse`: Will decay as the inverse of the epoch.\n            * `inverse_sqrt`: Will decay as the inverse of the square root of the epoch.\n        * `epochs_to_half`: Rate at which the learning_rate is annealed. Higher number means \n        slower rate.\n\n## Usage\n\n### Offline Training\n  \nFor this you need to generate data first and then train it using `train_offline.py`. \n\n##### Generate Data\nYou can use *hindu numerals* or the entire *ascii* set, specified via an ast file.\n\n```sh\npython3 gen_data.py \u003coutput_name.pkl\u003e [config=configs/default.ast]*\n```\n\n##### Train  Network\nYou can train on the generated pickle file as:\n\n```sh\npython3 train_offline.py data.pkl [config=configs/default.ast]*\n```\n\n### Online Training\nYou can generate and train simultaneously as:\n\n```sh\npython3 train_online.py [config=configs/default.ast]*\n```\n\n## Examples\n\nAll the programs mentioned above can take multiple config files, later files override former ones.\n `configs/default.ast` is loaded by default.  
\n\n### Offline\n```sh\n# First generate the ast files based on given examples then...\npython3 gen_data.py hindu_avg_len_60.py configs/hindu.ast configs/len_60.ast\npython3 train_offline.py hindu_3chars.py configs/adagrad.ast configs/bilstm.ast configs/ilr.01.ast\n```\n\n### Online\n```sh\npython3 train_online.py configs/hindu.ast configs/adagrad.ast configs/bilstm.ast configs/ilr.01.ast\n```\n\n### Working Example\n```sh\n# Offline\npython3 gen_data.py hindu3.py configs/working_eg.ast\npython3 train_offline.py hindu3.py configs/working_eg.ast\n# Online\npython3 train_online.py configs/working_eg.ast\n```\n\n## Sample Output\n```\n# Using data from scribe.py hindu\nShown : 0 2 2 5 \nSeen  : 0 2 2 5 \nImages (Shown \u0026 Seen) : \n\n 0¦                            ¦\n 1¦          ██  ██            ¦\n 2¦         █  ██  ████        ¦\n 3¦           █   █ █          ¦\n 4¦      ██  █   █  ███        ¦\n 5¦     █  █████████  █        ¦\n 6¦     █  █        █ █        ¦\n 7¦      ██         ███        ¦\n \n 0¦░░░░░░░░░█░░░░░░░░░░░░░░░░░░¦\n 1¦░░░░░░░░░░░░░░░░░░░░░░░░░░░░¦\n 2¦░░░░░░░░░░░░░█░░░█░░░░░░░░░░¦\n 3¦░░░░░░░░░░░░░░░░░░░░░░░░░░░░¦\n 4¦░░░░░░░░░░░░░░░░░░░░░░░░░░░░¦\n 5¦░░░░░░░░░░░░░░░░░░░█▓░░░░░░░¦\n 6¦█████████░███░███░█░▒███████¦\n\n```\n## References\n* Graves, Alex. **Supervised Sequence Labelling with Recurrent Neural Networks.** Chapters 2, 3, 7 and 9.\n * Available at [Springer](http://www.springer.com/engineering/computational+intelligence+and+complexity/book/978-3-642-24796-5)\n * [University Edition](http://link.springer.com/book/10.1007%2F978-3-642-24797-2) via 
Springer Link.\n * Free [Preprint](http://www.cs.toronto.edu/~graves/preprint.pdf)\n\n## Credits\n* Theano implementation of CTC by [Shawn Tan](https://github.com/shawntan/theano-ctc/)\n* Updates.py from [Lasagne](http://lasagne.readthedocs.org/en/latest/modules/updates.html)\n\n## Dependencies\n* NumPy\n* Theano\n\nThe code can easily be ported to Python 2 by adding lines like this one where necessary. In the interest of \nfuture generations, we highly recommend you do not do that.\n``` python\nfrom __future__ import print_function\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frakeshvar%2Frnn_ctc","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frakeshvar%2Frnn_ctc","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frakeshvar%2Frnn_ctc/lists"}