
# To Recognize Hand-written Alphabets

Tony Chi, Raymond Ji, Barry Jiang, Max Wang

## Abstract

Our project's aim is to program a computer to identify hand-written letters via matrix operations. Each letter image contains 28x28 pixels, which we flatten into a vector. By multiplying this vector with successive weight matrices, the pixels are passed through a three-layer 1024x512x128 fully-connected deep neural network. Finally, we employ the gradient descent method so that the computer can predict the most probable letter.

## Introduction

Have you ever wanted to transform a paper document into a digital one, but had to type it in character by character? People have long sought an easy way to solve this problem, and we offer a solution here. The purpose of our project is to recognize hand-written letters, so that the computer can identify the characters automatically, without any manual input.

## Progress

#### 1. Getting Started

To begin with, we need [Python 3.6](https://www.python.org/downloads/release/python-365/) installed. Then, we need to import several libraries that will be used.

```Python
import tflearn
import keras
import pandas as pd
import numpy as np
from matplotlib import pyplot as plt
```

*Note: on a brand-new Python installation, you first need to install these libraries using pip:*

```bash
> pip install numpy pandas tensorflow tflearn keras matplotlib
```

#### 2. Gather Dataset

The [A_Z Handwritten Data](https://www.kaggle.com/sachinpatel21/az-handwritten-alphabets-in-csv-format) dataset contains uppercase handwritten letter images (A-Z), each `28x28` pixels, with the letter centered in a `20x20` pixel box. There are 372,451 images in total, or approximately 14,325 images per letter.

Since the decompressed dataset is a CSV file, we can use numpy's `loadtxt()` function to load it.

```Python
train_file = 'data/a_z_handwritten.csv'
num_classes = 26

raw_data = np.loadtxt(train_file, delimiter=',')
```
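
As a quick sanity check, each row of the loaded array holds the label in column 0 followed by the 784 pixel values (the exact counts below assume the full Kaggle file):

```Python
print(raw_data.shape)         # expected: (372451, 785) -> label + 784 pixels per row
print(int(raw_data[0, 0]))    # label of the first row, in the range 0-25 (0 = 'A')
print(raw_data[:, 1:].max())  # grayscale pixel values range from 0 to 255
```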

The images look like this:

![img](https://images-cdn.shimo.im/ZSqYuXFvgEUAcHnc/po.png!thumbnail)
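
A preview grid like the one above can be generated with matplotlib; this is a minimal sketch (the original figure was not necessarily produced this way):

```Python
# Show the first 10 rows of the raw data as 28x28 grayscale images,
# each titled with its decoded letter.
fig, axes = plt.subplots(1, 10, figsize=(10, 1.5))
for ax, row in zip(axes, raw_data[:10]):
    ax.imshow(row[1:].reshape(28, 28), cmap=plt.cm.gray)
    ax.set_title(chr(int(row[0]) + 65))
    ax.axis('off')
plt.show()
```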

#### 3. Preprocess Data

In order to achieve maximum efficiency, we need to preprocess the data before we feed it into the neural network.

```Python
def preprocess_data(raw):
    np.random.shuffle(raw)                                    # randomize row order
    y_label = raw[:, 0]                                       # column 0 holds the label
    out_y = keras.utils.to_categorical(y_label, num_classes)  # one-hot encode the labels
    out_x = raw[:, 1:785]                                     # remaining 784 columns are pixels
    return out_x, out_y

X, Y = preprocess_data(raw_data)

testX, testY = X[-1000:], Y[-1000:]
X, Y = X[:80000], Y[:80000]
```

In the code above, we first use the `shuffle()` method to randomize the data (the original file is sorted from A to Z, which is not machine-friendly). Then, we extract the label, the number indicating which letter the row represents, from the raw data. `keras.utils.to_categorical()` converts that number, ranging from 0 to 25, into a one-hot array.

The image pixels are extracted from the remaining columns, and the function returns the processed X and Y. Finally, we use the last 1000 images of the dataset as the test set, and the first 80000 rows (limited by our 16 GB of RAM) as the training set.
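
For instance, `to_categorical()` turns label 2 (the letter 'C') into a length-26 one-hot vector:

```Python
keras.utils.to_categorical([2], 26)
# -> array([[0., 0., 1., 0., 0., ..., 0.]])  (a 1 at index 2, zeros elsewhere)
```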

#### 4. Build the Network

```Python
def build_model():
    net = tflearn.input_data(shape=[None, 784])  # input layer: 784 pixels
    net = tflearn.fully_connected(net, 1024, activation='relu', regularizer='L2', weight_decay=0.001)  # dense1
    net = tflearn.dropout(net, 0.9)  # dropout1 (keep probability 0.9)
    net = tflearn.fully_connected(net, 512, activation='relu', regularizer='L2', weight_decay=0.001)  # dense2
    net = tflearn.dropout(net, 0.9)  # dropout2
    net = tflearn.fully_connected(net, 128, activation='relu', regularizer='L2', weight_decay=0.001)  # dense3
    net = tflearn.dropout(net, 0.9)  # dropout3
    softmax = tflearn.fully_connected(net, 26, activation='softmax')  # output layer: 26 letters
    # Regression using SGD with learning rate decay
    sgd = tflearn.SGD(learning_rate=0.04, lr_decay=0.98, decay_step=1000)
    net = tflearn.regression(softmax, optimizer=sgd, metric=tflearn.metrics.Accuracy(), loss='categorical_crossentropy')
    model = tflearn.DNN(net, tensorboard_verbose=0)
    return model
```

This is a three-layer 1024x512x128 fully-connected deep neural network.

The image below shows what the neural network looks like when visualized:

![img](https://images-cdn.shimo.im/IsUus4sbdVENnWX9/525338247769891090.jpg!thumbnail)
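
For a sense of scale, the trainable parameter count (weights plus biases for each fully-connected layer) can be computed as follows:

```Python
layers = [784, 1024, 512, 128, 26]
params = sum(n_in * n_out + n_out for n_in, n_out in zip(layers, layers[1:]))
print(params)  # 1397658 -> about 1.4 million trainable parameters
```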

#### 5. Train the Model

```Python
model = build_model()
model.fit(X, Y, n_epoch=4, validation_set=(testX, testY), show_metric=True)
```

This step is surprisingly simple.

First, we build the model. Then, we use the `fit()` method to train the network.
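
Once training finishes, the weights can be saved to disk and reloaded later with tflearn's `save()` and `load()` methods (the filename here is our own choice):

```Python
model.save('alphabet_model.tfl')  # persist the trained weights
# later, after rebuilding the same architecture:
# model.load('alphabet_model.tfl')
```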

#### 6. Evaluation

To test the accuracy, we predict and display a random image from the test dataset.

```Python
import random

idx = random.randint(0, len(testX) - 1)  # pick a random test image
preds = model.predict([testX[idx]])

print(chr(np.argmax(testY[idx]) + 65))   # ground-truth letter
print(chr(np.argmax(preds[0]) + 65))     # predicted letter

plt.imshow(testX[idx].reshape(28, 28), cmap=plt.cm.gray)
plt.pause(2)
plt.close()
```
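
Beyond spot-checking single images, the accuracy over the entire held-out set can be measured with tflearn's built-in `evaluate()` method:

```Python
score = model.evaluate(testX, testY)
print('Test accuracy: %.4f' % score[0])  # fraction of correctly classified letters
```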

## Result

```
Training Step: 12500 | total loss: 0.15304 | time: 61.852s
| SGD | epoch: 004 | loss: 0.15304 - acc: 0.9735 | val_loss: 0.03337 - val_acc: 0.9910 -- iter: 200000/200000
```

After 4 epochs of training, we achieve an accuracy of 0.9910 on the validation dataset, images that the neural network has never seen before. This is a surprising outcome for such a simple network structure.

## Reference

Patel, S. (2018). A-Z Handwritten Alphabets in .csv format. Kaggle. Retrieved from https://www.kaggle.com/sachinpatel21/az-handwritten-alphabets-in-csv-format