https://github.com/utkarsh-284/deep-learning
This repository contains the documents and the main project file needed to build a DNN from scratch using NumPy.
- Host: GitHub
- URL: https://github.com/utkarsh-284/deep-learning
- Owner: utkarsh-284
- Created: 2025-06-08T12:35:06.000Z (8 months ago)
- Default Branch: main
- Last Pushed: 2025-06-08T13:33:12.000Z (8 months ago)
- Last Synced: 2025-06-19T11:50:11.451Z (8 months ago)
- Topics: data-science, deep-learning, deep-neural-networks, machine-learning, mnist, mnist-handwriting-recognition, neural-networks, numpy, python3
- Language: Jupyter Notebook
- Size: 5.54 MB
- Stars: 1
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Handwritten Digit Recognition with NumPy: Building a Deep Neural Network from Scratch

## Overview
This project implements a **3-layer Deep Neural Network (DNN)** using only NumPy to classify handwritten digits from the MNIST dataset. By building all components from scratch—including forward/backward propagation, activation functions, and gradient descent—we demonstrate the foundational mechanics of deep learning without high-level frameworks like TensorFlow or PyTorch.
**Key Features**:
- Pure NumPy implementation (no DL frameworks)
- He initialization for efficient learning
- Mini-batch gradient descent optimization
- ReLU activation in hidden layers
- Softmax output layer with cross-entropy loss
- Achieves **95% accuracy** on the validation set
## Project Structure
```
.
├── DNN_using_NumPy.ipynb # Main Jupyter notebook with implementation
├── Neural Networks and Deep Learning (Notes).pdf # Notes on the mathematics behind neural networks
├── train.csv # MNIST training data (CSV format)
├── README.md # This documentation
└── requirements.txt # Python dependencies
```
## How to Run
1. **Install Dependencies**:
```bash
pip install -r requirements.txt
```
Requirements: NumPy, pandas, scikit-learn
2. **Download Data**:
- Place `train.csv` in the project directory (available on [Kaggle](https://www.kaggle.com/competitions/digit-recognizer/data)); a short loading sketch follows this list
3. **Execute the Notebook**:
```bash
jupyter notebook DNN_using_NumPy.ipynb
```
4. **Training Process**:
- 3-layer network (128 → 64 → 10 units)
- 20 epochs with batch size 64
- Learning rate: 0.01
- Loss decreases from 1.12 to 0.13
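
The Kaggle `train.csv` stores one digit per row: a `label` column followed by 784 pixel-intensity columns. A minimal loading sketch (column names per the Kaggle file; the notebook's exact preprocessing may differ):
```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

df = pd.read_csv("train.csv")
y = df["label"].to_numpy()
X = df.drop(columns="label").to_numpy(dtype=np.float32) / 255.0  # scale pixels to [0, 1]

# hold out a validation split for the accuracy reported below
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=0)
```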
## Results
| Epoch | Loss | Validation Accuracy |
|-------|----------|---------------------|
| 1 | 1.1287 | - |
| 10 | 0.2068 | - |
| **20**| **0.1366**| **95.07%** |
## Key Components
### 1. Network Architecture
```
Input (784) → Hidden 1 (128, ReLU) → Hidden 2 (64, ReLU) → Output (10, Softmax)
```
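
As an illustration, He initialization for this 784 → 128 → 64 → 10 stack could look like the sketch below; the notebook's `init_params()` may differ in its exact signature.
```python
import numpy as np

def init_params(layer_sizes=(784, 128, 64, 10), seed=0):
    # He initialization: weights ~ N(0, 2 / fan_in), biases start at zero
    rng = np.random.default_rng(seed)
    params = {}
    for l, (fan_in, fan_out) in enumerate(zip(layer_sizes, layer_sizes[1:]), start=1):
        params[f"W{l}"] = rng.normal(0.0, np.sqrt(2.0 / fan_in), (fan_in, fan_out))
        params[f"b{l}"] = np.zeros(fan_out)
    return params
```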
### 2. Core Functions
- **Initialization**: He weight initialization (`init_params()`)
- **Forward Propagation**: `forward_prop()` (sketched below)
- **Activation Functions**: ReLU and Softmax
- **Loss Calculation**: Cross-entropy (`cross_entropy()`)
- **Backpropagation**: `backward_prop()`
- **Parameter Update**: Gradient descent (`update_params()`)
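
To make the list concrete, here is a rough sketch of how the activation, loss, and forward-pass pieces might fit together (function names follow the list above; exact signatures in the notebook may differ):
```python
import numpy as np

def relu(z):
    return np.maximum(0, z)

def softmax(z):
    # subtract the row-wise max for numerical stability
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(probs, y_onehot):
    # mean negative log-likelihood of the true classes
    eps = 1e-12
    return -np.mean(np.sum(y_onehot * np.log(probs + eps), axis=1))

def forward_prop(X, params):
    # two ReLU hidden layers followed by a softmax output
    A1 = relu(X @ params["W1"] + params["b1"])
    A2 = relu(A1 @ params["W2"] + params["b2"])
    A3 = softmax(A2 @ params["W3"] + params["b3"])
    return A1, A2, A3
```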
### 3. Training
Mini-batch gradient descent with shuffling:
```python
for epoch in range(epochs):
    perm = np.random.permutation(n)        # reshuffle sample order each epoch
    for i in range(0, n, batch_size):
        batch = perm[i:i + batch_size]     # indices for this mini-batch
        # Forward pass, loss calculation, backprop, update
```
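
One step worth spelling out: with a softmax output and cross-entropy loss, the gradient at the output layer reduces to `A3 - Y` (predicted probabilities minus one-hot labels), which keeps `backward_prop()` tractable by hand. The update itself is plain gradient descent; a minimal sketch, assuming `grads` is keyed `dW1`, `db1`, and so on:
```python
def update_params(params, grads, lr=0.01):
    # vanilla gradient descent: step each parameter against its gradient
    for key in params:
        params[key] -= lr * grads["d" + key]
    return params
```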
## Future Improvements
1. **Hyperparameter Tuning**:
- Implement learning rate scheduling (a simple example follows this list)
- Experiment with Adam optimizer
2. **Regularization Techniques**:
- Add L2 regularization
- Implement dropout layers
3. **Architecture Enhancements**:
- Add batch normalization
- Extend to convolutional layers (CNN)
4. **Deployment**:
- Create web interface for real-time predictions
- Optimize for mobile devices using ONNX
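
For the scheduling idea in item 1, inverse-time decay is one simple option (an illustration only, not part of the current notebook):
```python
def decayed_lr(lr0=0.01, decay=0.05, epoch=0):
    # inverse-time decay: shrink the step size as training progresses
    return lr0 / (1.0 + decay * epoch)
```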
## Dependencies
- Python 3.7+
- NumPy 1.21+
- pandas 1.3+
- scikit-learn 1.0+
## Contributor
**Utkarsh Bhardwaj**
[LinkedIn](https://www.linkedin.com/in/utkarsh284/)
[GitHub](https://github.com/utkarsh-284)
**Contact**: ubhardwaj284@gmail.com
**Publish Date**: 8th June, 2025
---
**Why This Project Stands Out**: By implementing every neural network component from scratch, we demystify deep learning fundamentals while achieving competitive performance. This serves as both an educational resource and a foundation for more complex architectures.