https://github.com/dragonxdev/pytorch-mnist-nn

A neural network designed to convert handwritten digits into their numerical value.
https://github.com/dragonxdev/pytorch-mnist-nn

machine-learning mnist neural-network pytorch

Last synced: 11 months ago
JSON representation

A neural network designed to convert handwritten digits into their numerical value.

Host: GitHub
URL: https://github.com/dragonxdev/pytorch-mnist-nn
Owner: DragonXDev
Created: 2024-05-03T02:52:29.000Z (about 2 years ago)
Default Branch: main
Last Pushed: 2024-05-09T17:14:36.000Z (about 2 years ago)
Last Synced: 2025-02-25T00:35:23.818Z (over 1 year ago)
Topics: machine-learning, mnist, neural-network, pytorch
Language: Python
Homepage:
Size: 24.3 MB
Stars: 2
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

Awesome Lists containing this project

README

          # MNIST PyTorch Neural Network

A MNIST CNN developed to recognize handwritten digits; grayscale input channel of 1 accepted, ```1x28x28``` input.

![image](https://github.com/DragonXDev/pytorch-mnist-nn/assets/99859617/f77a5f9d-09bd-4624-bb14-cf64ca919dd3)

## Usage & Information

The CNN is trained in batches of ```32``` from the DB. Each of the 32 filters is a 2D ```3x3``` and extracted from the ```1x28x28```

A Visual of an MNIST CNN

![image](https://github.com/DragonXDev/pytorch-mnist-nn/assets/99859617/0f4a9be1-07e3-4632-9048-8a741a7fdeef)

To train the CNN, 

```bash

poetry run python src/train_model.py

```

This will contact the MNIST DB & run 10 epochs to train the model. The appropriate state after the model is finished executing will be created

as the ```digital_model.pt``` file.

Once completed, run the model with the binary generated using

```bash

poetry run python src/main.py

```

## Optimizers

For the purpose of MNIST, the Adam optimizer with ```lr=1e-3``` performs best

![image](https://github.com/DragonXDev/pytorch-mnist-nn/assets/99859617/8ac61337-e8e0-45aa-9383-35fb368731fa)

### Adam

Starting with Stochastic Gradient Descent

```math

g = \frac{1}{m} \nabla_{\theta}\sum_{i}L(f(x^{(i)};\theta),y^{(i)})

```

Where $\theta$ is the models' params, $g$ is the negative direction of the gradient, $m$ is the size of the mini-batches of data, $f(x^{(i)}; \theta)$ is the neural network, $x^{(i)}$ is the training data, $y^{(i)}$ are the training labels, and $L()$ is the loss function.

The Adam optimizer redefines SGD's params as such:

```math

m = \beta_{1}m + (1-\beta_{1})g

```

```math

s = \beta_{2}s + (1-\beta_{2})g^{T}g

```

```math

\theta = \theta - \epsilon_{k}\cdot\frac{m}{\sqrt{s+eps}}

```

For this project, $\beta_{1} = 0.9$, $\beta_{2} = 0.999$, and $eps$ (learning rate) $=$ 1e-3.

### Nesterov Momentum $\nabla$

Nesterov's Momentum Acceleration $\nabla$ (NAG) performs significantly worse than Adam.

```math

\nu = \alpha\nu - \epsilon\nabla_{\theta}(\frac{1}{m}\sum_{i}L(f(x^{(i)};\theta+\alpha\cdot\nu),y^{(i)})

```

```math

\theta = \theta + \nu

```

The loss / cost after ```Epoch 10``` ended at ~ ```0.01253``` on average.

## Contributors

I am the sole contributor of this project.

## License

This project is licensed under the MIT License - see the [LICENSE.md](LICENSE.md)

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/dragonxdev/pytorch-mnist-nn

Awesome Lists containing this project

README