https://github.com/louisguitton/titanic-kaggle
R code for the "Hello World" Kaggle competition about the titanic dataset
- Host: GitHub
- URL: https://github.com/louisguitton/titanic-kaggle
- Owner: louisguitton
- Created: 2015-11-16T23:53:40.000Z (almost 10 years ago)
- Default Branch: master
- Last Pushed: 2015-11-22T11:15:30.000Z (almost 10 years ago)
- Last Synced: 2025-03-29T06:12:13.475Z (6 months ago)
- Language: R
- Homepage:
- Size: 6.88 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.mdown
README
This repository contains the code I used to participate in the "Hello World" Kaggle competition on the Titanic dataset:
https://www.kaggle.com/c/titanic

I chose to start with R.
Decision Trees & Feature Engineering
====
As advised by Kaggle, I first went through the DataCamp tutorial on machine learning:
https://www.datacamp.com/courses/kaggle-tutorial-on-machine-learing-the-sinking-of-the-titanic
It introduces:
- decision trees with rpart
- feature engineering
- overfitting
- random forests

The simple decision tree with feature engineering was my best entry so far.
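A minimal sketch of that approach (not the repo's actual code): a tiny hand-made data frame standing in for Kaggle's train.csv, one engineered feature, and an rpart classification tree.

```r
library(rpart)

# Synthetic stand-in for train.csv, using the competition's column names
train <- data.frame(
  Survived = c(0, 1, 1, 0, 1, 0, 0, 1),
  Pclass   = c(3, 1, 2, 3, 1, 3, 2, 1),
  Sex      = c("male", "female", "female", "male",
               "female", "male", "male", "female"),
  Age      = c(22, 38, 26, 35, 27, 54, 20, 14),
  SibSp    = c(1, 1, 0, 0, 0, 0, 0, 1),
  Parch    = c(0, 0, 0, 0, 2, 0, 0, 0)
)

# Feature engineering: size of the family travelling with the passenger
train$FamilySize <- train$SibSp + train$Parch + 1

# Fit a classification tree; method = "class" treats Survived as a label
fit <- rpart(Survived ~ Pclass + Sex + Age + FamilySize,
             data = train, method = "class",
             control = rpart.control(minsplit = 2))

# Predicted classes for the training rows
pred <- predict(fit, train, type = "class")
print(table(pred, train$Survived))
```

On the real train.csv you would read the data with `read.csv` and predict on Kaggle's test set instead of the training rows.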
General Linear Models & Complete Approach with R
====
I was looking to improve my score.
I went through this tutorial https://github.com/wehrley/wehrley.github.io/blob/master/SOUPTONUTS.md
It introduces:
- advanced feature engineering
- logistic regression
- adaptive boosting
- random forest
- support vector machines

Conditional Forest
====
At that point I had learned what I was looking for, but I still wanted to see what the next step was.
This script shows the difference between randomForest and cforest:
https://www.kaggle.com/uioreanu/titanic/randomforest-cforest-method
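A rough sketch of that comparison (not the linked script itself): fit the same formula with randomForest and with party::cforest on synthetic data. It assumes the randomForest and party packages are installed; the column names mimic train.csv but the values are made up.

```r
library(randomForest)
library(party)

set.seed(1)
n <- 60
train <- data.frame(
  Survived = factor(rbinom(n, 1, 0.4)),
  Pclass   = sample(1:3, n, replace = TRUE),
  Sex      = factor(sample(c("male", "female"), n, replace = TRUE)),
  Age      = round(runif(n, 1, 80))
)

# Classic random forest: bootstrap samples + CART-style splits
rf <- randomForest(Survived ~ Pclass + Sex + Age, data = train, ntree = 100)

# Conditional inference forest: split variables chosen via permutation
# tests, which reduces the bias toward variables with many split points
cf <- cforest(Survived ~ Pclass + Sex + Age, data = train,
              controls = cforest_unbiased(ntree = 100))

rf_pred <- predict(rf, train)
cf_pred <- predict(cf, newdata = train)
print(head(data.frame(rf = rf_pred, cf = cf_pred)))
```

The practical upshot is that cforest's predictions can differ from randomForest's when predictors vary a lot in scale or number of categories.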