https://github.com/kamalarnav04/handwrittendigitrecognizer
A neural network-based handwritten digit recognition system built with TensorFlow/Keras that can classify handwritten digits (0-9) from the MNIST dataset with 97.8% validation accuracy.
https://github.com/kamalarnav04/handwrittendigitrecognizer
python tensorflow
Last synced: 3 months ago
JSON representation
A neural network-based handwritten digit recognition system built with TensorFlow/Keras that can classify handwritten digits (0-9) from the MNIST dataset with 97.8% validation accuracy.
- Host: GitHub
- URL: https://github.com/kamalarnav04/handwrittendigitrecognizer
- Owner: kamalarnav04
- Created: 2025-06-16T16:03:36.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-06-16T16:16:21.000Z (about 1 year ago)
- Last Synced: 2025-06-26T03:04:33.201Z (about 1 year ago)
- Topics: python, tensorflow
- Language: Jupyter Notebook
- Homepage:
- Size: 11 MB
- Stars: 1
- Watchers: 0
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# Handwritten Digit Recognizer 🔢
A neural network-based handwritten digit recognition system built with TensorFlow/Keras that can classify handwritten digits (0-9) from the MNIST dataset with **97.8% validation accuracy**.


## 🎯 Project Overview
This project implements a feedforward neural network to recognize handwritten digits using the famous MNIST dataset. The model can process both the original MNIST dataset and custom PNG images uploaded by users.
### Key Features
- **High Accuracy**: Achieves 97.8% validation accuracy
- **Fast Training**: Trains in under 15 seconds (5 epochs)
- **Custom Image Support**: Can predict digits from user-uploaded PNG images
- **Lightweight Architecture**: Simple yet effective 3-layer neural network
- **Easy to Use**: Interactive prediction interface
## 📊 Model Performance
| Metric | Training | Validation |
|--------|----------|------------|
| **Accuracy** | 98.55% | 97.80% |
| **Loss** | 0.0442 | 0.0854 |
| **Training Time** | ~12 seconds | - |
### Training Progress
```
Epoch 1/5: loss: 0.2573 - accuracy: 0.9249 - val_accuracy: 0.9665
Epoch 2/5: loss: 0.1086 - accuracy: 0.9663 - val_accuracy: 0.9762
Epoch 3/5: loss: 0.0766 - accuracy: 0.9763 - val_accuracy: 0.9745
Epoch 4/5: loss: 0.0572 - accuracy: 0.9816 - val_accuracy: 0.9760
Epoch 5/5: loss: 0.0442 - accuracy: 0.9855 - val_accuracy: 0.9780
```
## 🏗️ Model Architecture
The neural network consists of:
```
Input Layer: 784 neurons (28×28 flattened image)
Hidden Layer 1: 128 neurons (ReLU activation)
Hidden Layer 2: 64 neurons (ReLU activation)
Output Layer: 10 neurons (Softmax activation)
```
**Total Parameters**: ~101,770 trainable parameters
### Architecture Highlights
- **Activation Functions**: ReLU for hidden layers, Softmax for output
- **Optimizer**: Adam with learning rate 0.001
- **Loss Function**: Sparse Categorical Crossentropy
- **Regularization**: 10% validation split for monitoring
## 📁 Dataset
The project uses the **MNIST Dataset** with the following files:
- `train-images-idx3-ubyte`: 60,000 training images (28×28 pixels)
- `train-labels-idx1-ubyte`: 60,000 training labels
- `t10k-images-idx3-ubyte`: 10,000 test images
- `t10k-labels-idx1-ubyte`: 10,000 test labels
### Data Preprocessing
1. **Normalization**: Pixel values scaled from [0, 255] to [0, 1]
2. **Reshaping**: Images flattened from 28×28 to 784-dimensional vectors
3. **Type Conversion**: Images converted to float32 for efficient computation
## 📝 Usage
### Training the Model
The model trains automatically when you run all cells in the notebook. Training takes approximately 12 seconds on modern hardware.
### Making Predictions on Custom Images
```python
# Example usage for custom PNG images
file_path = "path/to/your/digit_image.png"
predicted_class, confidence = predict_from_png(file_path)
print(f"Predicted digit: {predicted_class}")
print(f"Confidence: {confidence:.4f}")
```
### Image Requirements for Custom Predictions
- **Format**: PNG
- **Content**: Single handwritten digit on light background
- **Automatic Processing**: The function handles resizing to 28×28 and grayscale conversion
## 🔬 Technical Details
### Data Loading
Custom functions to read MNIST's binary format:
- `load_images()`: Reads image data from idx3-ubyte files
- `load_labels()`: Reads label data from idx1-ubyte files
### Model Configuration
```python
model = tf.keras.Sequential([
layers.Dense(128, activation='relu', input_shape=(784,)),
layers.Dense(64, activation='relu'),
layers.Dense(10, activation='softmax')
])
model.compile(
optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
loss='sparse_categorical_crossentropy',
metrics=['accuracy']
)
```
### Image Preprocessing Pipeline
```python
def predict_from_png(file_path):
img = Image.open(file_path).convert('L') # Grayscale
img = img.resize((28, 28)) # Resize to MNIST dimensions
img_array = np.array(img) / 255.0 # Normalize
img_array = img_array.reshape(1, 784) # Flatten
return model.predict(img_array)
```
## 📈 Results Analysis
### Strengths
- ✅ **High Accuracy**: 97.8% validation accuracy
- ✅ **Fast Training**: Converges quickly in 5 epochs
- ✅ **Robust Performance**: Consistent across training and validation
- ✅ **Practical Application**: Works with real-world PNG images
## 🛠️ File Structure
```
HandwrittenDigitRecognizer/
├── model.ipynb # Main Jupyter notebook
├── README.md # This file
├── train-images-idx3-ubyte # Training images
├── train-labels-idx1-ubyte # Training labels
├── t10k-images-idx3-ubyte # Test images
├── t10k-labels-idx1-ubyte # Test labels
```
## 📚 References
- [MNIST Database](http://yann.lecun.com/exdb/mnist/)
- [TensorFlow Documentation](https://www.tensorflow.org/)
- [Keras Sequential Model Guide](https://keras.io/guides/sequential_model/)
---