https://github.com/jedrzejboczar/tf-character-recognition

Character recognition from images using Tensorflow

Host: GitHub
URL: https://github.com/jedrzejboczar/tf-character-recognition
Owner: jedrzejboczar
License: mit
Created: 2019-09-10T11:36:30.000Z (over 5 years ago)
Default Branch: master
Last Pushed: 2021-05-21T15:50:25.000Z (over 3 years ago)
Last Synced: 2023-03-05T12:05:29.297Z (almost 2 years ago)
Language: Python
Size: 285 KB
Stars: 2
Watchers: 2
Forks: 1
Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE

# tf-character-recognition

This project is an exploration of the TensorFlow library (v1).
It has been used to train Convolutional Neural Networks (CNNs) to recognise characters (digits, lower- and uppercase letters)
from images.
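
Digits plus lower- and uppercase letters give 62 classes in total. A minimal sketch of a label mapping, assuming the classes are ordered digits, then uppercase, then lowercase; the ordering actually used by this project's loaders may differ, and the names `CLASSES`, `char_to_index` and `index_to_char` are hypothetical:

```python
import string

# Assumed class ordering: digits, uppercase letters, lowercase letters.
CLASSES = string.digits + string.ascii_uppercase + string.ascii_lowercase


def char_to_index(char):
    """Return the integer class label for a single character."""
    return CLASSES.index(char)


def index_to_char(index):
    """Return the character for an integer class label."""
    return CLASSES[index]
```

A network trained on these classes would then need an output layer with `len(CLASSES)` units.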

As for the database, the [Chars74K](http://www.ee.surrey.ac.uk/CVSSP/demos/chars74k/) dataset has been used.
It contains 3 types of images: natural photos, handwritten characters (on a tablet) and images synthesised from fonts.
To download the dataset, use the script at `database/chars74k/prepare_database.py`.
This will download all the necessary files, extract them and perform the initial preparation of the image files for future use.
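
The download-and-extract step can be sketched roughly as below. This is not the code from `prepare_database.py`; it is a generic illustration that assumes the archives are gzip-compressed tarballs, and `fetch_and_extract` and its parameters are hypothetical names:

```python
import tarfile
import urllib.request
from pathlib import Path


def fetch_and_extract(url, dest_dir):
    """Download a .tgz archive (unless already cached) and unpack it into dest_dir."""
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)

    # Name the cached archive after the last component of the URL.
    archive_path = dest_dir / url.rsplit("/", 1)[-1]
    if not archive_path.exists():
        urllib.request.urlretrieve(url, str(archive_path))

    # Unpack next to the downloaded archive.
    with tarfile.open(archive_path, "r:gz") as archive:
        archive.extractall(dest_dir)
    return dest_dir
```

Keeping the archive next to the extracted files means a second run skips the download, which helps with large image sets.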

## Current state

I wrote this some time ago. Since then TensorFlow has had several updates, including the release of v2.
The requirements.txt file was generated with `pipreqs` some time after development,
so it doesn't reflect the package versions used at development time.
As a result there are multiple deprecation warnings and some things do not work at all.
This is a big TODO, but at the moment I don't have time to fix it.

## Project organisation

```
.
├── cnn_model.py # definitions of tensorflow models
├── cv2_show.py # some utilities for viewing images dataset
├── database/ # all the datasets used and their loaders
├── data.py # generalization over datasets and utilities for image distortions, etc.
├── gui.py # PyQt GUI for testing recognition on handwritten images
├── log.py # logging
├── models/ # here tensorflow models are stored (this is gitignored)
└── run.py # main launcher
```

The main script of interest is `run.py` (but other scripts may also provide mains for some special usage).
The models are stored in the `models/` directory, but it is gitignored, so a model has to be trained first before anything can be done.
Some pretrained models that work quite well can be found in the releases, which avoids the time-consuming training process.

## Examples

See help:
```
python run.py -h
```

Probably the most interesting thing that is also straightforward to use is the GUI:
```
python run.py -G
```

Some screenshots:

Another quite straightforward option is the visualization of a walk in latent space:
```
python run.py -DW
```

Download this [video](media/latent-space-walk.mp4) for an example result of a latent space walk.
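
One common way to walk a latent space is to interpolate linearly between two latent vectors and decode or visualise each intermediate point. Whether `run.py -DW` does exactly this is not stated here, so treat the following as an illustrative sketch only; `latent_walk` is a hypothetical helper, not part of this project:

```python
def latent_walk(start, end, steps):
    """Yield `steps` points moving linearly from `start` to `end` (both included)."""
    if steps < 2:
        raise ValueError("steps must be at least 2")
    for i in range(steps):
        t = i / (steps - 1)  # interpolation factor going from 0.0 to 1.0
        yield [(1 - t) * a + t * b for a, b in zip(start, end)]


# Example: three points between two 2-dimensional latent vectors.
points = list(latent_walk([0.0, 0.0], [1.0, 2.0], 3))
```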