Multiclass Classification
https://github.com/grantgasser/task3
- Host: GitHub
- URL: https://github.com/grantgasser/task3
- Owner: grantgasser
- Created: 2018-04-23T12:27:16.000Z (almost 8 years ago)
- Default Branch: master
- Last Pushed: 2018-10-01T19:42:11.000Z (over 7 years ago)
- Last Synced: 2025-02-08T18:14:35.128Z (12 months ago)
- Topics: keras, multiclass-classification, numpy, pandas, tensorflow
- Language: Jupyter Notebook
- Homepage:
- Size: 18.3 MB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Task 3 Multiclass Classification
### Description
This program builds and trains a deep neural network using [Keras](https://keras.io/).
### Data
The training and test data come from two HDF5 (`.h5`) files. There are 100 predictors (X variables) and an output label Y taking one of five values (0, 1, 2, 3, or 4).
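A minimal sketch of loading such data with `h5py` is shown below; the file names and the dataset keys inside the `.h5` files are assumptions, not taken from the repository.

```python
import h5py
import numpy as np

# Assumed file names and dataset keys; adjust to the actual .h5 layout.
with h5py.File("train.h5", "r") as f:
    X_train = np.array(f["x"])   # shape: (n_samples, 100)
    y_train = np.array(f["y"])   # integer labels in {0, 1, 2, 3, 4}

with h5py.File("test.h5", "r") as f:
    X_test = np.array(f["x"])
```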
### Implementation
The program first standardizes the predictors so that each of x_1 through x_100 has mean approximately 0, which speeds up training. Next, a feed-forward neural network is set up with hidden layers of sizes 70, 30, 50, and 20, each using the ReLU activation. The input layer has 100 units (one per predictor) and the output layer has 5 units (one per class), with a softmax activation so the outputs form a probability distribution over the classes (see the sketch after the list below). Some more model parameters:
* Optimizer: Adam
* Loss function: Categorical Cross Entropy
* Metric: Accuracy
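The following sketch reflects the layer sizes and compile settings listed above; the import path (`tensorflow.keras`), variable names, and the exact standardization code are assumptions, since an older standalone `keras` install would import from `keras.models` instead.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Standardize each of the 100 predictors: zero mean (and unit variance).
mean = X_train.mean(axis=0)
std = X_train.std(axis=0)
X_train = (X_train - mean) / std
X_test = (X_test - mean) / std   # reuse the training statistics on the test set

model = Sequential([
    Dense(70, activation="relu", input_shape=(100,)),
    Dense(30, activation="relu"),
    Dense(50, activation="relu"),
    Dense(20, activation="relu"),
    Dense(5, activation="softmax"),   # probability distribution over the 5 classes
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
```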
#### Side notes
The labels must be one-hot encoded before being passed to the Keras `fit` function. One can use the `to_categorical` function from the Keras utils module.
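Continuing from the sketch above, one-hot encoding and training might look like this; the epoch and batch-size values are illustrative, not taken from the repository.

```python
from tensorflow.keras.utils import to_categorical

y_train_onehot = to_categorical(y_train, num_classes=5)
model.fit(X_train, y_train_onehot, epochs=50, batch_size=32)  # settings illustrative
```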
The predictions are written to a CSV file, which was uploaded to be evaluated on an unseen test set. The model's accuracy on that test set was **90.6%**, which surpassed the "hard" baseline provided by the creators of the competition.
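A hedged sketch of producing that CSV is below; the column names and file name are placeholders, since the README does not specify the submission format.

```python
import numpy as np
import pandas as pd

probs = model.predict(X_test)        # shape: (n_samples, 5), one probability per class
preds = np.argmax(probs, axis=1)     # most likely class per sample

# Column names and file name are assumptions about the submission format.
pd.DataFrame({"Id": np.arange(len(preds)), "y": preds}).to_csv(
    "predictions.csv", index=False)
```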