https://github.com/titu1994/kaggle

Kaggle competition library. Uses Python 3.4.1 with almost all known python libraries for Machine Learning

# Kaggle

A few Python scripts that perform well in Kaggle competitions.

# Competitions
## MNIST

Contains scripts using XGBoost, PCA + SVM, or Convolutional Neural Networks (CNNs).

- CNNs tend to perform best. The VGG-like net is the most accurate but takes the longest to train.
- SqueezeNet is fast to train but does not perform as well as VGG.
- The best script I currently possess is the DCCNN MNIST architecture, but because it was trained on the entire MNIST dataset, it overfits this Kaggle dataset and gives 100% accuracy (original accuracy = 99.77%).

## Titanic

A starter dataset for learning the basics of machine learning, such as data cleaning, preprocessing, and feature construction.

- Best performing model is XGBoost
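
As an illustration of the cleaning and feature construction steps mentioned above, here is a minimal sketch in plain Python. It is not the script from this repository. The column names `Age`, `Sex`, `SibSp` and `Parch` follow the Kaggle Titanic CSV, while the derived `IsFemale` and `FamilySize` features, the median imputation, and the sample rows are assumptions made for the example.

```python
from statistics import median


def clean_passengers(rows):
    """Fill missing ages and derive simple numeric features from raw rows."""
    # Median of the ages that are present, used to fill the gaps.
    known_ages = [float(r["Age"]) for r in rows if r["Age"] not in ("", None)]
    fill_age = median(known_ages)

    cleaned = []
    for r in rows:
        age = float(r["Age"]) if r["Age"] not in ("", None) else fill_age
        cleaned.append({
            "Age": age,
            # Encode the categorical Sex column as a 0/1 flag.
            "IsFemale": 1 if r["Sex"] == "female" else 0,
            # Hypothetical constructed feature: siblings/spouses + parents/children + self.
            "FamilySize": int(r["SibSp"]) + int(r["Parch"]) + 1,
        })
    return cleaned


# Made-up rows shaped like the CSV (every value is read as a string).
sample_rows = [
    {"Age": "22", "Sex": "male", "SibSp": "1", "Parch": "0"},
    {"Age": "", "Sex": "female", "SibSp": "0", "Parch": "2"},
    {"Age": "38", "Sex": "female", "SibSp": "1", "Parch": "0"},
]
features = clean_passengers(sample_rows)
```

The resulting list of numeric dictionaries can then be converted to whatever matrix format the chosen model expects.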

## Bike Sharing Demand

A time series dataset that is useful for learning how to manipulate time series data and train models on it.

- Best performing model is an ensemble of two XGBoost models, each of which learns one of the two time-factored outputs. Their predictions are then merged into a single output.
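
The two-model idea can be sketched roughly as follows. This is only an illustration under several assumptions: the two outputs are taken to be the dataset's `casual` and `registered` rental counts, the merge is a plain sum, and a per-hour mean predictor (the hypothetical `HourlyMeanModel`) stands in for XGBoost so that the example runs without extra libraries. The numbers are made up.

```python
from collections import defaultdict


class HourlyMeanModel:
    """Stand-in regressor (not XGBoost): predicts the training mean per hour."""

    def fit(self, hours, targets):
        sums = defaultdict(float)
        counts = defaultdict(int)
        for hour, target in zip(hours, targets):
            sums[hour] += target
            counts[hour] += 1
        self.means_ = {hour: sums[hour] / counts[hour] for hour in sums}
        # Fallback for hours never seen during training.
        self.default_ = sum(targets) / len(targets)
        return self

    def predict(self, hours):
        return [self.means_.get(hour, self.default_) for hour in hours]


# Toy training data: hour of day and the two separate targets.
hours = [8, 8, 17, 17]
casual = [2, 4, 10, 14]
registered = [30, 34, 50, 54]

# One model per output.
casual_model = HourlyMeanModel().fit(hours, casual)
registered_model = HourlyMeanModel().fit(hours, registered)


def predict_total(query_hours):
    """Merge the two predictions into a single output (assumed here to be a sum)."""
    casual_pred = casual_model.predict(query_hours)
    registered_pred = registered_model.predict(query_hours)
    return [c + r for c, r in zip(casual_pred, registered_pred)]


total = predict_total([8, 17, 3])
```

Training the outputs separately lets each model pick up its own time pattern before the predictions are combined.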

## BNP

Bank BNP Paribas contest.

- Tried various combinations of stacking and neural nets. The best was a stack of neural nets and XGBoost with Logistic Regression as the final layer.
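
A rough sketch of the final stacking layer is shown below. It is not the contest code. The base-model probabilities are made-up numbers standing in for held-out predictions from a neural net and an XGBoost model, and the logistic regression is a small gradient-descent implementation written here (hypothetical helpers `fit_logistic` and `predict_proba`) instead of a library call.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(features, labels, lr=0.5, epochs=2000):
    """Fit logistic regression by batch gradient descent; returns (weights, bias)."""
    n_samples = len(features)
    n_features = len(features[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in zip(features, labels):
            # Prediction error for this row under the current parameters.
            err = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            for j, xi in enumerate(x):
                grad_w[j] += err * xi
            grad_b += err
        weights = [w - lr * g / n_samples for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n_samples
    return weights, bias


def predict_proba(weights, bias, x):
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


# Each row: [neural net probability, XGBoost probability] for one held-out sample.
base_preds = [
    [0.9, 0.8], [0.8, 0.6], [0.7, 0.9], [0.6, 0.7],
    [0.3, 0.4], [0.2, 0.1], [0.4, 0.2], [0.1, 0.3],
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

# The meta-learner sees only the base models' outputs, not the raw features.
weights, bias = fit_logistic(base_preds, labels)
stacked = [predict_proba(weights, bias, x) for x in base_preds]
```

In a real stack the base predictions fed to the final layer should come from out-of-fold or held-out data, otherwise the final layer learns to trust base models that have memorised the training set.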