Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/grantgasser/neural-network-mnist-keras
Simple neural network for the MNIST dataset
- Host: GitHub
- URL: https://github.com/grantgasser/neural-network-mnist-keras
- Owner: grantgasser
- Created: 2018-06-26T01:27:59.000Z (over 6 years ago)
- Default Branch: master
- Last Pushed: 2019-01-19T05:19:41.000Z (almost 6 years ago)
- Last Synced: 2024-10-28T16:59:53.641Z (3 months ago)
- Topics: jupyter-notebook, keras, mnist, neural-network, python
- Language: Jupyter Notebook
- Size: 325 KB
- Stars: 1
- Watchers: 2
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Simple Neural Network
### Description
This notebook uses Keras to implement a neural network for the MNIST dataset of handwritten digit images. The goal is to have the model correctly classify as many images as possible. It shows that using a library such as Keras can lead to respectable results (at least compared to the early days of neural nets) in a short amount of time.

### Parameters and network features
Even without using cross-validation to tune hyperparameters such as the regularization constant, the number of layers, the number of units in each layer, and the many other considerations that come with neural nets, the network with 3 hidden layers works quite well (~90% test accuracy). The input dimension is 784 (28x28 pixel images).
The activation function for each hidden layer is the ReLU, max(z, 0), where z = x'w, x is the input (or the vector of outputs from the previous layer), and w is the vector of weights for that layer. The output layer is passed through a softmax to get a probability distribution for each example, and the model predicts the class with the highest probability. We use the Adam optimizer; results are similar to stochastic gradient descent (SGD) in this case. The loss is the categorical cross-entropy of the output layer's distribution.
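The forward pass described above can be sketched in plain NumPy. The layer sizes below are illustrative assumptions (one hidden layer of 32 units rather than the notebook's 3 hidden layers), chosen only to show the z = x'w, ReLU, and softmax steps:

```python
import numpy as np

def relu(z):
    # ReLU(z) = max(z, 0), applied elementwise
    return np.maximum(z, 0)

def softmax(z):
    # Subtract the max before exponentiating for numerical stability
    e = np.exp(z - np.max(z))
    return e / e.sum()

# Hypothetical dimensions: 784 inputs (a flattened 28x28 image),
# 32 hidden units, 10 output classes (digits 0-9).
rng = np.random.default_rng(0)
x = rng.random(784)                  # stand-in for one flattened image
W1 = rng.normal(0, 0.05, (784, 32))  # hidden-layer weights
W2 = rng.normal(0, 0.05, (32, 10))   # output-layer weights

h = relu(x @ W1)       # z = x'w for the hidden layer, then ReLU
p = softmax(h @ W2)    # probability distribution over the 10 classes
pred = int(np.argmax(p))  # predict the class with the highest probability

print(p.sum())  # the softmax output sums to 1
print(pred)
```

In Keras, `Dense(..., activation="relu")` and a final `Dense(10, activation="softmax")` layer perform these same computations, with the weights learned by the optimizer instead of drawn at random.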
#### Notes
The images help you see a common error. On this run, the model often confuses 6 and 5; 8 and 5 also seem to be a common mistake. Re-training the model on your machine may produce a model that makes different errors, in which case the last few cells of code may not apply.

See the [Jupyter Notebook documentation](https://jupyter.org/) for more information.
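One way to surface such confusions systematically is a confusion matrix over the test set. A minimal NumPy sketch, assuming integer label arrays `y_true` and `y_pred` (the names and toy data here are illustrative, not from the notebook):

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes=10):
    # cm[i, j] counts examples whose true label is i but were predicted as j
    cm = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    return cm

# Toy labels standing in for real test-set predictions
y_true = np.array([6, 6, 5, 8, 5, 6])
y_pred = np.array([5, 6, 5, 5, 5, 5])

cm = confusion_matrix(y_true, y_pred)
print(cm[6, 5])  # how often a true 6 was predicted as 5 -> 2
```

Large off-diagonal entries (such as `cm[6, 5]` or `cm[8, 5]`) point to the digit pairs the model mixes up most often.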
See the [Keras Documentation](https://keras.io/) for more information.