https://github.com/peterchain/titanic

Script for the Titanic dataset for evaluating which passengers survived

kaggle machine-learning pandas-dataframe python3 scikit-learn


# titanic
A Python script that predicts which passengers survived in Kaggle's Titanic dataset.

## Dependencies

Requires the following Python 3 libraries:

- pandas
- scikit-learn

## How to run

Run the Python script. It needs two .csv files in the same directory: a train set and a test set in the Titanic dataset format.
The script prints several model evaluations and writes a results.csv file in Kaggle's submission format.
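
As a rough illustration of the input and output described above, here is a minimal sketch. The file names `train.csv` and `test.csv` and the helper names `load_datasets` and `write_submission` are assumptions made for this example, not names taken from the script. The two output columns, `PassengerId` and `Survived`, are the ones Kaggle's Titanic competition expects.

```python
import pandas as pd


def load_datasets(train_path="train.csv", test_path="test.csv"):
    """Read the two CSV files. The default file names are assumptions."""
    return pd.read_csv(train_path), pd.read_csv(test_path)


def write_submission(passenger_ids, predictions, path="results.csv"):
    """Write one row per test passenger: PassengerId and the 0/1 prediction."""
    submission = pd.DataFrame(
        {
            "PassengerId": list(passenger_ids),
            "Survived": [int(p) for p in predictions],
        }
    )
    # index=False keeps the file to exactly the two expected columns.
    submission.to_csv(path, index=False)
    return submission
```

With a fitted model, a call might look like `write_submission(test["PassengerId"], model.predict(features))`, where `model` and `features` are placeholders for whatever the script builds.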

## What does it do

The script fills in some missing values in both the train and test sets, normalizes a few features (scaling them to the 0 to 1 range) and discretizes others (for example, converting Age to an integer).
It then trains several boosted ensembles of CART decision trees (AdaBoost and gradient boosting) with different numbers of estimators to find the best-performing model.
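
The steps above could be sketched as follows. This is an illustration under stated assumptions rather than the script's actual code: the `Age` and `Fare` columns come from Kaggle's dataset, but median imputation, the choice of `Fare` as the scaled feature, the estimator counts, the use of cross-validation and the helper names `preprocess` and `compare_models` are all hypothetical.

```python
import pandas as pd
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import MinMaxScaler


def preprocess(train, test):
    """Fill gaps, scale Fare to [0, 1] and turn Age into an integer."""
    train, test = train.copy(), test.copy()
    for col in ("Age", "Fare"):
        # Assumption: impute with the training median, reused for the test set.
        median = train[col].median()
        train[col] = train[col].fillna(median)
        test[col] = test[col].fillna(median)
    # Fit the scaler on the train set only. Test values outside the train
    # range would land slightly outside [0, 1].
    scaler = MinMaxScaler()
    train["Fare"] = scaler.fit_transform(train[["Fare"]]).ravel()
    test["Fare"] = scaler.transform(test[["Fare"]]).ravel()
    # Truncate Age to whole years.
    train["Age"] = train["Age"].astype(int)
    test["Age"] = test["Age"].astype(int)
    return train, test


def compare_models(X, y, estimator_counts=(50, 100, 200), cv=5):
    """Cross-validate both ensembles at each size and return the best one."""
    candidates = (
        ("AdaBoost", AdaBoostClassifier),
        ("GradientBoosting", GradientBoostingClassifier),
    )
    scores = {}
    for name, model_class in candidates:
        for n in estimator_counts:
            model = model_class(n_estimators=n, random_state=0)
            # Mean accuracy across the cross-validation folds.
            scores[(name, n)] = cross_val_score(model, X, y, cv=cv).mean()
    best = max(scores, key=scores.get)
    return best, scores
```

Both classifiers build on decision trees by default in scikit-learn, which matches the CART description. `compare_models` returns the `(name, n_estimators)` pair with the highest mean accuracy together with the full score table, so the other candidates can still be inspected.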