https://github.com/peterchain/titanic
Script for the Titanic dataset for evaluating which passengers survived
kaggle machine-learning pandas-dataframe python3 scikit-learn
- Host: GitHub
- URL: https://github.com/peterchain/titanic
- Owner: PeterChain
- Created: 2020-12-04T22:48:52.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2020-12-04T23:00:18.000Z (over 5 years ago)
- Last Synced: 2025-08-30T04:24:35.176Z (10 months ago)
- Topics: kaggle, machine-learning, pandas-dataframe, python3, scikit-learn
- Language: Python
- Homepage:
- Size: 33.2 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# titanic
Script for the Titanic dataset for evaluating which passengers survived
## Dependencies
Requires the following Python 3 libraries:
- pandas
- scikit-learn
## How to run
Run the Python script. It expects two .csv files in the same directory: a train set and a test set in the Titanic dataset format.
The script prints several model evaluations and writes a results.csv file in Kaggle's submission format.
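
For reference, Kaggle's Titanic submission format is a two-column CSV with a `PassengerId,Survived` header. The snippet below is a minimal sketch of writing such a file with pandas, not the script's actual code; the helper name `write_submission` and the sample ids and predictions are hypothetical.

```python
import io

import pandas as pd


def write_submission(passenger_ids, predictions, target):
    """Write predictions as a two-column Kaggle-style CSV (hypothetical helper)."""
    submission = pd.DataFrame(
        {"PassengerId": list(passenger_ids), "Survived": list(predictions)}
    )
    # index=False keeps the pandas row index out of the file
    submission.to_csv(target, index=False)
    return submission


# Illustrative values only; a real run would use the test set ids and model output.
buffer = io.StringIO()
submission = write_submission([892, 893, 894], [0, 1, 0], buffer)
print(buffer.getvalue())
```

Passing a path such as `"results.csv"` instead of the in-memory buffer writes the file to disk.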
## What does it do
The script fills in some missing values in both the train and test sets, normalizes a few columns (scaling values to the 0 to 1 range), and converts others to categories (for example, Age to an integer).
It then runs several boosted ensembles of CART decision trees (AdaBoost and gradient boosting) with different numbers of estimators to find the best fit.
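
The preprocessing steps described above might look roughly like the sketch below. It is an illustration under assumptions, not the script's code: the choice of the `Age` and `Fare` columns, median imputation, and the `preprocess` helper are all assumed here, and the tiny data frames are made up.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler


def preprocess(train: pd.DataFrame, test: pd.DataFrame):
    """Impute, scale and cast columns (hypothetical helper, assumed columns)."""
    train, test = train.copy(), test.copy()

    # Assumption: fill gaps with the training-set median of each column.
    for column in ("Age", "Fare"):
        median = train[column].median()
        train[column] = train[column].fillna(median)
        test[column] = test[column].fillna(median)

    # Convert Age to whole years.
    train["Age"] = train["Age"].round().astype(int)
    test["Age"] = test["Age"].round().astype(int)

    # Scale Fare to [0, 1] using the range seen in the training set.
    scaler = MinMaxScaler()
    train["Fare"] = scaler.fit_transform(train[["Fare"]]).ravel()
    test["Fare"] = scaler.transform(test[["Fare"]]).ravel()
    return train, test


# Made-up rows for illustration only.
train_raw = pd.DataFrame({"Age": [22.0, None, 38.0, 4.0], "Fare": [7.25, 71.28, None, 16.7]})
test_raw = pd.DataFrame({"Age": [None, 30.0], "Fare": [10.0, None]})
train_clean, test_clean = preprocess(train_raw, test_raw)
print(train_clean)
print(test_clean)
```

Fitting the scaler on the train set only means test values can fall slightly outside the 0 to 1 range if they exceed the training range; whether the original script handles this the same way is not stated.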
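
A comparison of this kind can be sketched with scikit-learn's `AdaBoostClassifier` and `GradientBoostingClassifier`. The example below is only an illustration: it uses synthetic data from `make_classification` instead of the Titanic files, and the estimator counts (25, 50, 100) and 3-fold cross-validation are assumptions, not values taken from the script.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, GradientBoostingClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for the preprocessed Titanic features and labels.
X, y = make_classification(n_samples=200, n_features=6, random_state=0)

candidates = [
    ("AdaBoost", AdaBoostClassifier),
    ("GradientBoosting", GradientBoostingClassifier),
]

results = {}
for name, model_class in candidates:
    for n_estimators in (25, 50, 100):  # assumed grid of estimator counts
        model = model_class(n_estimators=n_estimators, random_state=0)
        # Mean accuracy over 3 cross-validation folds
        scores = cross_val_score(model, X, y, cv=3, scoring="accuracy")
        results[(name, n_estimators)] = scores.mean()

for (name, n_estimators), accuracy in results.items():
    print(f"{name} with {n_estimators} estimators: {accuracy:.3f}")

best = max(results, key=results.get)
print("Best configuration:", best)
```

Both classifiers use decision trees as base learners by default, which matches the CART description above. Cross-validated accuracy is one reasonable way to pick between configurations; the script may evaluate its models differently.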