https://github.com/shambac/project-sem-6
Handwritten text recognition with LSTM RNN using IAM dataset
https://github.com/shambac/project-sem-6
handwritten-text-recognition tensorflow
Last synced: 2 months ago
JSON representation
Handwritten text recognition with LSTM RNN using IAM dataset
- Host: GitHub
- URL: https://github.com/shambac/project-sem-6
- Owner: ShambaC
- Created: 2023-03-11T05:54:59.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2023-03-13T17:44:16.000Z (over 3 years ago)
- Last Synced: 2024-12-24T11:52:20.883Z (over 1 year ago)
- Topics: handwritten-text-recognition, tensorflow
- Language: Python
- Homepage:
- Size: 17.6 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# Project SEM 6
Handwritten text recognition
## What ?
Handwritten text recognition using Long Short Term Memory implementation of RNN. Honestly I don't know what's happening here.
The [IAM Handwriting Database](https://fki.tic.heia-fr.ch/databases/iam-handwriting-database) is used to train the network. Specifically the words dataset.
## How to ?
Clone the repo and then do the following.
### Get the IAM Handwriting Dataset :
- Make an [account](https://fki.tic.heia-fr.ch/register) on the database website.
- Download the [words archive](https://fki.tic.heia-fr.ch/DBs/iamDB/data/words.tgz)
- Extract the contents to the following directory structure on the root of this project : `Datasets/words`
- So the resulting folder structure should be :
```
Datasets
|_words
|_a01
|_a01-000u
|_images.png
|_a01-000x
|_...
|_a02
|_...
```
- Now you need the labels for these images.
- Download the [ascii archive](https://fki.tic.heia-fr.ch/DBs/iamDB/data/ascii.tgz)
- Extract only the words.txt file to the `Datasets` folder.
### Install the dependencies
- install the requirements.
- Do in terminal : `pip install -r requirements.txt`
- The required packages are :
- numpy : 1.22.4
- opencv_python : 4.6.0.66
- pandas : 1.3.2
- Pillow : 9.4.0
- tensorflow : 2.10.0
- tqdm : 4.62.3
### Edit the configurations
- Open the `config.py` file.
- Make whatever changes you want.
- Mainly you might want to make changes to the following variables :
- batch_size
- learning_rate
- train_epochs
- train_workers
- validation_split
### Train the model
- Run the `train.py` script to train the model.
- No need for editing anything in this file.
- I trained the model with the following parameters :
- learning rate : 0.0005
- validation split : 0.9
- batch size = 16
- epochs = 100
- I trained it with my GTX 1650. It used 2132 MB of GPU memory. Usage was around 8-10 %. Took around 140-150 seconds each epoch. Took about 4 hours for all 100 epochs.
- CPU usage was around 50%. My CPU is Ryzen 7 3750H.
- RAM usage was around 2 GBs.
- You can visualise the training using tensorboard. Run `tensorboard --logdir = path_to_logs` in terminal to start the server.
- The logs are located at the following folder : `Models/Handwriting_recognition/{timestamp}/logs`
- After training, a bunch of files are generated. The only important files are `model.meow` and `configs.meow`.
### Run the model
- You can run the `inferenceModel.py` to check with only one image from the dataset itself.
- OR use the Paint GUI made with tkinter by running the `tkRecog.py`.
# IN BOTH CASES MAKE SURE TO EDIT THE unixTime VARIABLE WITH YOUR MODEL'S FOLDER
# Thank You
Totally not copied code.