https://github.com/csanri/simplenn
Simple Neural Network for MNIST Classification
machine-learning-algorithms neural-network numpy python
- Host: GitHub
- URL: https://github.com/csanri/simplenn
- Owner: csanri
- License: mit
- Created: 2025-03-20T20:28:07.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-03-20T21:42:23.000Z (over 1 year ago)
- Last Synced: 2025-07-25T15:54:16.485Z (11 months ago)
- Topics: machine-learning-algorithms, neural-network, numpy, python
- Language: Python
- Homepage:
- Size: 8.79 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Simple Neural Network for MNIST Classification
A from-scratch implementation of a 2-layer neural network using NumPy for handwritten digit recognition on the MNIST dataset. This project demonstrates fundamental deep learning concepts and achieves **~95% accuracy** on the test set.
## Key Features
- Pure NumPy implementation (no deep learning frameworks)
- Two-layer neural network architecture
- ReLU activation in hidden layer
- Softmax output layer
- Cross-entropy loss function
- Stochastic Gradient Descent (SGD) with backpropagation
- Vectorized operations for efficient computation
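The README names the loss and the optimizer but does not write them out, so here is a minimal sketch of mean cross-entropy and a plain SGD update. It is not the repository's code: the names `cross_entropy` and `sgd_step`, the `eps` clip, the learning rate, and the one-column-per-sample layout are all assumptions made for illustration.

```python
import numpy as np

def cross_entropy(probs, y_onehot, eps=1e-12):
    """Mean cross-entropy over a batch.

    probs, y_onehot: arrays of shape (classes, m), one column per sample.
    """
    m = probs.shape[1]
    # Clip so that log(0) cannot occur if a probability underflows.
    return -np.sum(y_onehot * np.log(np.clip(probs, eps, 1.0))) / m

def sgd_step(params, grads, lr=0.1):
    """Vanilla SGD, in place: p <- p - lr * g for every parameter."""
    for key in params:
        params[key] -= lr * grads[key]

# Sanity check: uniform predictions over 10 classes give a loss of log(10).
probs = np.full((10, 4), 0.1)
y = np.eye(10)[:, :4]          # four one-hot labels (classes 0..3)
loss = cross_entropy(probs, y)
print(loss, np.log(10))
```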
## Neural Network Architecture
### Forward Pass Equations
**Layer 1 (Input → Hidden):**
```math
\begin{aligned}
\mathbf{l}_1 &= \mathbf{W}_1 \mathbf{x} + \mathbf{b}_1 \\
\mathbf{y}_1 &= \text{ReLU}(\mathbf{l}_1)
\end{aligned}
```
**Layer 2 (Hidden → Output):**
```math
\begin{aligned}
\mathbf{l}_2 &= \mathbf{W}_2 \mathbf{y}_1 + \mathbf{b}_2 \\
\mathbf{y}_2 &= \text{softmax}(\mathbf{l}_2)
\end{aligned}
```
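The two forward-pass layers above can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the repository's implementation: the hidden size of 64, the weight initialization scale, the function names, and the layout with one sample per column are all hypothetical.

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def softmax(z):
    # Subtract the column-wise max before exponentiating for stability.
    e = np.exp(z - z.max(axis=0, keepdims=True))
    return e / e.sum(axis=0, keepdims=True)

def forward(x, W1, b1, W2, b2):
    """x has shape (784, m); returns (l1, y1, l2, y2)."""
    l1 = W1 @ x + b1       # (hidden, m), bias broadcast across columns
    y1 = relu(l1)
    l2 = W2 @ y1 + b2      # (10, m)
    y2 = softmax(l2)       # each column is a probability distribution
    return l1, y1, l2, y2

rng = np.random.default_rng(0)
hidden = 64                                  # assumed hidden-layer width
W1 = rng.normal(0.0, 0.01, (hidden, 784))
b1 = np.zeros((hidden, 1))
W2 = rng.normal(0.0, 0.01, (10, hidden))
b2 = np.zeros((10, 1))

x = rng.random((784, 5))                     # 5 fake flattened 28x28 images
l1, y1, l2, y2 = forward(x, W1, b1, W2, b2)
print(y2.shape)                              # (10, 5)
```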
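### Backward Pass Equations
In the gradients below, $m$ is the number of samples in a batch. The quantities $\mathbf{x}$, $\mathbf{y}_1$, $\mathbf{y}_2$, and $\mathbf{y}_{\text{true}}$ are read as matrices with one sample per column, and each $\sum$ runs over those columns.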
**Output Layer Gradients:**
```math
\begin{aligned}
\frac{\partial \mathcal{L}}{\partial \mathbf{W}_2} &= \frac{1}{m} (\mathbf{y}_2 - \mathbf{y}_{\text{true}}) \mathbf{y}_1^\top \\
\frac{\partial \mathcal{L}}{\partial \mathbf{b}_2} &= \frac{1}{m} \sum (\mathbf{y}_2 - \mathbf{y}_{\text{true}})
\end{aligned}
```
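The factor $(\mathbf{y}_2 - \mathbf{y}_{\text{true}})$ comes from combining softmax with cross-entropy. A short derivation for a single sample, assuming $\mathbf{y}_{\text{true}}$ is one-hot and the loss is $\mathcal{L} = -\sum_c y_{\text{true},c} \log y_{2,c}$ (the README names the loss but does not write it out), with $\delta_{ck}$ the Kronecker delta:

```math
\begin{aligned}
\frac{\partial y_{2,c}}{\partial l_{2,k}} &= y_{2,c}\,(\delta_{ck} - y_{2,k}) \\
\frac{\partial \mathcal{L}}{\partial l_{2,k}} &= -\sum_{c} \frac{y_{\text{true},c}}{y_{2,c}}\, y_{2,c}\,(\delta_{ck} - y_{2,k}) = y_{2,k} \sum_{c} y_{\text{true},c} - y_{\text{true},k} = y_{2,k} - y_{\text{true},k}
\end{aligned}
```

The last step uses $\sum_c y_{\text{true},c} = 1$, which holds for one-hot labels.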
**Hidden Layer Gradients:**
```math
\begin{aligned}
\frac{\partial \mathcal{L}}{\partial \mathbf{W}_1} &= \frac{1}{m} \left(\mathbf{W}_2^\top (\mathbf{y}_2 - \mathbf{y}_{\text{true}}) \odot \text{ReLU}'(\mathbf{l}_1)\right) \mathbf{x}^\top \\
\frac{\partial \mathcal{L}}{\partial \mathbf{b}_1} &= \frac{1}{m} \sum \left(\mathbf{W}_2^\top (\mathbf{y}_2 - \mathbf{y}_{\text{true}}) \odot \text{ReLU}'(\mathbf{l}_1)\right)
\end{aligned}
```
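One way to gain confidence in the gradient formulas above is to code them directly and compare a single entry against a finite-difference estimate. The sketch below does that on a tiny random problem. The function name `loss_and_grads`, the layer sizes, and the one-sample-per-column layout are assumptions for illustration and are not taken from the repository.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max(axis=0, keepdims=True))
    return e / e.sum(axis=0, keepdims=True)

def loss_and_grads(x, y_true, W1, b1, W2, b2):
    m = x.shape[1]
    # Forward pass
    l1 = W1 @ x + b1
    y1 = np.maximum(0.0, l1)
    y2 = softmax(W2 @ y1 + b2)
    loss = -np.sum(y_true * np.log(y2)) / m
    # Backward pass, following the equations term by term
    d2 = y2 - y_true                           # (classes, m)
    dW2 = d2 @ y1.T / m
    db2 = d2.sum(axis=1, keepdims=True) / m
    d1 = (W2.T @ d2) * (l1 > 0)                # elementwise ReLU'(l1)
    dW1 = d1 @ x.T / m
    db1 = d1.sum(axis=1, keepdims=True) / m
    return loss, dW1, db1, dW2, db2

rng = np.random.default_rng(0)
n_in, n_hidden, n_out, m = 6, 5, 3, 4          # tiny sizes, chosen arbitrarily
x = rng.normal(size=(n_in, m))
y_true = np.eye(n_out)[:, rng.integers(0, n_out, size=m)]  # one-hot columns
W1 = rng.normal(scale=0.5, size=(n_hidden, n_in))
b1 = np.zeros((n_hidden, 1))
W2 = rng.normal(scale=0.5, size=(n_out, n_hidden))
b2 = np.zeros((n_out, 1))

loss, dW1, db1, dW2, db2 = loss_and_grads(x, y_true, W1, b1, W2, b2)

# Central finite difference on one entry of W2.
eps = 1e-6
W2_plus, W2_minus = W2.copy(), W2.copy()
W2_plus[0, 0] += eps
W2_minus[0, 0] -= eps
numeric_dW2 = (loss_and_grads(x, y_true, W1, b1, W2_plus, b2)[0]
               - loss_and_grads(x, y_true, W1, b1, W2_minus, b2)[0]) / (2 * eps)
print(abs(numeric_dW2 - dW2[0, 0]))            # should be very small
```

The same check can be repeated for entries of `W1`, `b1`, and `b2`; a large mismatch usually points to a transposed matrix or a missing `1/m` factor.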
**ReLU Derivative:**
```math
\text{ReLU}'(x) = \begin{cases}
1 & \text{if } x > 0 \\
0 & \text{otherwise}
\end{cases}
```
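In NumPy, ReLU and its derivative are each a single vectorized expression. A minimal sketch, with the function names `relu` and `relu_grad` chosen here for illustration:

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def relu_grad(x):
    # 1 where x > 0, 0 elsewhere (including exactly x == 0).
    return (x > 0).astype(x.dtype)

z = np.array([-2.0, 0.0, 3.0])
print(relu(z))        # [0. 0. 3.]
print(relu_grad(z))   # [0. 0. 1.]
```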
**Softmax:**
```math
\text{softmax}(\mathbf{z})_i = \frac{\exp(z_i)}{\sum_{c=1}^C \exp(z_c)}
```
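Evaluating the softmax formula literally can overflow when logits are large, because `exp` of a big number exceeds the floating-point range. A common remedy is to subtract the maximum logit first, which leaves the result unchanged mathematically. The sketch below assumes logits are stored with one sample per column; whether the repository uses this trick is not stated in the README.

```python
import numpy as np

def softmax(z):
    """Column-wise softmax; z has shape (classes, m)."""
    shifted = z - z.max(axis=0, keepdims=True)   # largest entry becomes 0
    e = np.exp(shifted)
    return e / e.sum(axis=0, keepdims=True)

# Logits this large would overflow np.exp without the shift.
z = np.array([[1000.0], [1001.0], [1002.0]])
p = softmax(z)
print(p.ravel(), p.sum())
```

Because the shift cancels in the ratio, `softmax(z)` and `softmax(z - 1000)` agree.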
## Installation
1. Clone repository:
```bash
git clone https://github.com/csanri/SimpleNN
cd SimpleNN
```
2. Install requirements:
```bash
pip install -r requirements.txt
```
3. Run the script:
```bash
python main.py
```