https://github.com/mrankitgupta/titanic-survival-prediction-93-xgboost

Titanic Survival Prediction Project (93% Accuracy) 🛳️ The goal of this notebook is to correctly predict whether a passenger survived the Titanic shipwreck, using several machine learning models and hyperparameter tuning.


Host: GitHub
URL: https://github.com/mrankitgupta/titanic-survival-prediction-93-xgboost
Owner: mrankitgupta
License: mit
Created: 2023-01-08T09:24:12.000Z (over 2 years ago)
Default Branch: main
Last Pushed: 2023-01-08T13:02:37.000Z (over 2 years ago)
Last Synced: 2025-01-17T18:57:04.012Z (5 months ago)
Topics: classification, data-analysis, data-science, data-visualization, gradient-boosting, kaggle-competition, linear-regression, logistic-regression, machine-learning, machine-learning-algorithms, ml, ml-models, nlp, prediction, predictive-modeling, random-forest, titanic, titanic-kaggle, titanic-survival-prediction, xgboost
Language: Jupyter Notebook
Homepage:
Size: 482 KB
Stars: 1
Watchers: 2
Forks: 1
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

README

Titanic - Machine Learning from Disaster | (Accuracy: 93%) XGBoost 🛳️

Titanic Survival Prediction: Machine Learning Model 🛳️

ML Models used: XGBoost, Random Forest, Logistic Regression
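
As a rough illustration of how such models might be compared, the sketch below cross-validates Logistic Regression and Random Forest on synthetic data. The data, the parameter values, and the `cv_scores` name are assumptions made for the example and are not taken from the notebook. `XGBClassifier` from the `xgboost` package follows the same scikit-learn estimator interface and could be added to the dictionary in the same way.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for preprocessed Titanic features (assumption: not real data).
X, y = make_classification(
    n_samples=300, n_features=8, n_informative=4, random_state=0
)

# Candidate models; the settings are illustrative, not the notebook's.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# 5-fold cross-validated accuracy for each model.
cv_scores = {}
for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
    cv_scores[name] = scores.mean()
    print(f"{name}: {scores.mean():.3f} (+/- {scores.std():.3f})")
```

Cross-validation gives a more stable estimate than a single train/validation split, which matters on a dataset as small as the Titanic training set.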

The goal of this notebook is to correctly predict whether a passenger survived the Titanic shipwreck, using several machine learning models and hyperparameter tuning.
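
Hyperparameter tuning can be sketched with scikit-learn's `GridSearchCV`. This is a minimal example on synthetic data. The parameter grid and the choice of Random Forest are assumptions for illustration, not the grid used in the notebook.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for the preprocessed training data (assumption).
X, y = make_classification(
    n_samples=200, n_features=8, n_informative=4, random_state=0
)

# Hypothetical grid: every combination is evaluated with 3-fold cross-validation.
param_grid = {"n_estimators": [50, 100], "max_depth": [3, 5, None]}

search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=3,
    scoring="accuracy",
)
search.fit(X, y)

print(search.best_params_, round(search.best_score_, 3))
```

After fitting, `search.best_estimator_` holds the model refit on all of `X` with the best parameter combination.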

### **Prerequisites:**

[Data Analyst Roadmap](https://github.com/mrankitgupta/Data-Analyst-Roadmap) ⌛

[Python Lessons](https://github.com/mrankitgupta/PythonLessons) 📑

[Python Libraries for Data Science](https://github.com/mrankitgupta/PythonLibraries) 🗂️

### **Overview**

1. **Understand the shape of the data (Histograms, box plots, etc.)**

1. **Data Cleaning**

1. **Data Exploration**

1. **Feature Engineering**

1. **Data Preprocessing for Model**

1. **Basic Model Building**

1. **Model Tuning**

1. **Ensemble Model Building**

1. **Results**
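
The cleaning, feature engineering, and preprocessing steps above could look roughly like the following pandas sketch. The rows are made up, using the Kaggle column names. `FamilySize` is a hypothetical engineered feature, and the fill strategies are assumptions rather than the notebook's actual choices.

```python
import pandas as pd

# Tiny made-up sample in the Kaggle column layout (not real passenger records).
df = pd.DataFrame({
    "Pclass": [3, 1, 3, 2],
    "Sex": ["male", "female", "female", "male"],
    "Age": [22.0, 38.0, None, 35.0],
    "SibSp": [1, 1, 0, 0],
    "Parch": [0, 0, 0, 2],
    "Fare": [7.25, 71.28, 7.92, 26.0],
    "Embarked": ["S", "C", None, "S"],
})

# Data cleaning: fill missing Age with the median and Embarked with the mode.
df["Age"] = df["Age"].fillna(df["Age"].median())
df["Embarked"] = df["Embarked"].fillna(df["Embarked"].mode()[0])

# Feature engineering (hypothetical feature): family size including the passenger.
df["FamilySize"] = df["SibSp"] + df["Parch"] + 1

# Preprocessing: one-hot encode the categorical columns for the model.
features = pd.get_dummies(df, columns=["Sex", "Embarked"])
print(features.columns.tolist())
```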

### **About the Project** 🛳️

Competition sites like Kaggle define the problem to solve or the questions to ask, and provide datasets for training a data science model and testing its results against a test dataset. The problem definition for the Titanic survival competition is [described here at Kaggle](https://www.kaggle.com/c/titanic).

Given a training set of passengers labeled as having survived or not survived the Titanic disaster, can our model determine whether the passengers in a test dataset, which does not contain the survival information, survived?
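
In code, this setup amounts to fitting on the labeled training set and predicting for the unlabeled test set. The sketch below uses made-up rows and a hypothetical pre-encoded `IsFemale` column. Only the `PassengerId` and `Survived` columns of the result follow the Kaggle submission format.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Made-up training rows: features plus the known Survived label.
train = pd.DataFrame({
    "PassengerId": [1, 2, 3, 4, 5, 6],
    "Pclass": [3, 1, 3, 1, 3, 2],
    "IsFemale": [0, 1, 1, 1, 0, 0],  # hypothetical encoded feature
    "Survived": [0, 1, 1, 1, 0, 0],
})

# Made-up test rows: same features, no Survived column.
test = pd.DataFrame({
    "PassengerId": [7, 8],
    "Pclass": [3, 1],
    "IsFemale": [0, 1],
})

feature_cols = ["Pclass", "IsFemale"]
model = LogisticRegression()
model.fit(train[feature_cols], train["Survived"])

# One predicted label per test passenger, in the submission layout.
submission = pd.DataFrame({
    "PassengerId": test["PassengerId"],
    "Survived": model.predict(test[feature_cols]),
})
print(submission)
```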

We may also want to develop some early understanding about the domain of our problem. This is described on the [Kaggle competition description page here](https://www.kaggle.com/c/titanic). Here are the highlights to note.

* On April 15, 1912, during her maiden voyage, the Titanic sank after colliding with an iceberg, killing 1502 out of 2224 passengers and crew. That translates to a survival rate of about 32%.
* One of the reasons that the shipwreck led to such loss of life was that there were not enough lifeboats for the passengers and crew.
* Although there was some element of luck involved in surviving the sinking, some groups of people were more likely to survive than others, such as women, children, and the upper class.
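
The survival rate quoted above follows directly from the two figures given. A short check, with variable names chosen for the example:

```python
# Figures from the competition description quoted above.
total_aboard = 2224
deaths = 1502

survivors = total_aboard - deaths
survival_rate = survivors / total_aboard

print(f"{survivors} survivors, survival rate {survival_rate:.1%}")
```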

## **Workflow stages**

The competition solution workflow goes through seven stages described in the Data Science Solutions book.

1. Question or problem definition.
1. Acquire training and testing data.
1. Wrangle, prepare, cleanse the data.
1. Analyze, identify patterns, and explore the data.
1. Model, predict and solve the problem.
1. Visualize, report, and present the problem solving steps and final solution.
1. Supply or submit the results.
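
For the analysis stage, a common first step is to compare survival rates across groups. The sketch below uses a handful of made-up rows, so the resulting rates are not real statistics. The column names follow the Kaggle dataset.

```python
import pandas as pd

# Made-up rows for illustration only (not real passenger records).
sample = pd.DataFrame({
    "Sex": ["female", "female", "male", "male", "male", "female"],
    "Pclass": [1, 3, 3, 1, 3, 2],
    "Survived": [1, 1, 0, 1, 0, 1],
})

# The mean of a 0/1 column is the share of survivors within each group.
rate_by_sex = sample.groupby("Sex")["Survived"].mean()
rate_by_class = sample.groupby("Pclass")["Survived"].mean()

print(rate_by_sex)
print(rate_by_class)
```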

## Technologies used ⚙️

* Python

* Statistics

* Jupyter

##### Python Libraries:

* Pandas | NumPy | Matplotlib | Seaborn

* Scikit-Learn | XGBoost

## Project - Titanic Survival Prediction: Machine Learning Model 🛳️

### **Kaggle Project Link: [Titanic Survival Prediction](https://www.kaggle.com/mrankitgupta/titanic-survival-prediction-93-xgboost)** 🛳️ 🔗

### Datasets

Kaggle Titanic Datasets: [Titanic Train](https://www.kaggle.com/competitions/titanic/data?select=train.csv) & [Titanic Test](https://www.kaggle.com/competitions/titanic/data?select=test.csv)
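
Once downloaded, the two files can be read with pandas. The helper below is a hypothetical sketch. `load_titanic` is not part of the notebook, and the demo feeds it in-memory CSV text with a few illustrative rows so that it runs without the real files.

```python
import io

import pandas as pd


def load_titanic(train_source, test_source):
    """Read the train and test CSVs from file paths or file-like objects."""
    train = pd.read_csv(train_source)
    test = pd.read_csv(test_source)
    return train, test


# Demo with in-memory CSV text; with the real data, pass the paths to
# train.csv and test.csv instead.
train_csv = io.StringIO(
    "PassengerId,Survived,Pclass,Sex,Age\n"
    "1,0,3,male,22\n"
    "2,1,1,female,38\n"
)
test_csv = io.StringIO(
    "PassengerId,Pclass,Sex,Age\n"
    "892,3,male,34.5\n"
)

train, test = load_titanic(train_csv, test_csv)
print(train.shape, test.shape)
```

The test file has no `Survived` column, which is the value the model is expected to predict.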

## Related Projects❓ 👨‍💻 🛰️

[Spotify Data Analysis using Python](https://github.com/mrankitgupta/Spotify-Data-Analysis-using-Python) 📊

[Data Analyst Roadmap](https://github.com/mrankitgupta/Data-Analyst-Roadmap) ⌛

[Statistics for Data Science using Python](https://github.com/mrankitgupta/Statistics-for-Data-Science-using-Python) 📊

[Sales Insights - Data Analysis using Tableau & SQL](https://github.com/mrankitgupta/Sales-Insights-Data-Analysis-using-Tableau-and-SQL) 📊

[Kaggle - Pandas Solved Exercises](https://github.com/mrankitgupta/Kaggle-Pandas-Solved-Exercises) 📊

[Python Lessons](https://github.com/mrankitgupta/PythonLessons) 📑

[Python Libraries for Data Science](https://github.com/mrankitgupta/PythonLibraries) 🗂️

### Liked my Contributions❓ Follow Me👉 [Kaggle](https://www.kaggle.com/MrAnkitGupta) and [GitHub](https://github.com/MrAnkitGupta)

[Nominate Me for GitHub Stars](https://stars.github.com/nominate/) ⭐ ✨

## For any queries/doubts 🔗 👇

### [Ankit Gupta](https://bio.link/AnkitGupta)