https://github.com/al-ghaly/titanic-machine-learning

A collection of machine learning classification models that predict whether a Titanic passenger is most likely to die

classification-algorithm data-science machine-learning machine-learning-algorithms


# Titanic-Machine-Learning
## By MOHAMED ALGHALY

### Applied Models:
1. **Logistic Regression**
2. **Decision Tree**
3. **Random Forest**
4. **Support Vector Machine (SVM)**
5. **AdaBoost**
6. **Gradient Boosting**
7. **Naive Bayes**
8. **K-Nearest Neighbor (KNN)**
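
As a rough illustration of how the eight model families listed above could be compared side by side, here is a minimal scikit-learn sketch. It is not the notebook's code: the synthetic data from `make_classification` is only a stand-in for the cleaned Titanic features, `GaussianNB` is an assumed choice of Naive Bayes variant, and every hyperparameter shown is an assumption.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    AdaBoostClassifier,
    GradientBoostingClassifier,
    RandomForestClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the cleaned Titanic features (NOT the real data).
X, y = make_classification(n_samples=300, n_features=6, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# One estimator per model family; all settings here are assumptions.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "SVM": SVC(),
    "AdaBoost": AdaBoostClassifier(random_state=0),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
    "Naive Bayes": GaussianNB(),
    "KNN": KNeighborsClassifier(n_neighbors=5),
}

# Fit each model and record its accuracy on the held-out split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_val, y_val)

# Print the models from highest to lowest validation accuracy.
for name, acc in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {acc:.3f}")
```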

---

# I made the data preprocessing dynamic to allow flexible modeling
## I implemented a `transform` function to clean the data
### Pass any parameter to override the default data cleaning, as follows:
![Screenshot (248)](https://github.com/al-ghaly/Titanic-Machine-Learning/assets/61648960/f6587744-2c35-4c0c-a9c4-eca2330e3083)
![Screenshot (249)](https://github.com/al-ghaly/Titanic-Machine-Learning/assets/61648960/d4777614-8d75-4c64-b2f0-6fb6a91302aa)
![Screenshot (250)](https://github.com/al-ghaly/Titanic-Machine-Learning/assets/61648960/124b936d-6615-495c-97d1-ad758ab45ee8)
* ### method: how to handle missing values
* ### inplace: whether to modify the given dataframe in place or return a cleaned copy
* ### drop_features: whether to drop useless features
* ### combine_rel & remove: how to handle multicollinearity
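
The parameters above could fit together roughly as in the following pandas sketch. The real signature is only shown in the screenshots, so everything here is an assumption: the default values, the accepted `method` strings (`"median"`, `"mean"`, `"drop"`), the list of features treated as useless, and the idea that `combine_rel` merges `SibSp` and `Parch` into a hypothetical `Relatives` column while `remove` drops the two originals. Unlike pandas' own `inplace` methods, this sketch returns the frame in both cases. The call at the end cleans a tiny made-up sample with median imputation and leaves the original frame untouched.

```python
import pandas as pd


def transform(df, method="median", inplace=False, drop_features=True,
              combine_rel=True, remove=True):
    """Hypothetical cleaning helper; the notebook's real version may differ."""
    # inplace: work on the caller's dataframe, or on a copy of it.
    data = df if inplace else df.copy()

    # method: how to handle missing values (assumed options).
    if method == "drop":
        data.dropna(subset=["Age", "Embarked"], inplace=True)
    else:
        fill = data["Age"].median() if method == "median" else data["Age"].mean()
        data["Age"] = data["Age"].fillna(fill)
        data["Embarked"] = data["Embarked"].fillna(data["Embarked"].mode()[0])

    # drop_features: remove columns assumed to carry little signal.
    if drop_features:
        data.drop(columns=["PassengerId", "Name", "Ticket", "Cabin"],
                  errors="ignore", inplace=True)

    # combine_rel & remove: SibSp and Parch are assumed to be the collinear
    # pair; combine them into one feature and optionally drop the originals.
    if combine_rel:
        data["Relatives"] = data["SibSp"] + data["Parch"]
        if remove:
            data.drop(columns=["SibSp", "Parch"], inplace=True)

    return data


# Tiny made-up sample using the Kaggle Titanic column names.
sample = pd.DataFrame({
    "PassengerId": [1, 2, 3, 4],
    "Name": ["A", "B", "C", "D"],
    "Age": [22.0, None, 30.0, 40.0],
    "SibSp": [1, 0, 0, 2],
    "Parch": [0, 0, 1, 1],
    "Ticket": ["t1", "t2", "t3", "t4"],
    "Cabin": [None, "C85", None, None],
    "Embarked": ["S", None, "S", "C"],
})

# Override one default and keep the rest; `sample` itself is not modified.
cleaned = transform(sample, method="median")
print(cleaned)
```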
---
## Attached Files
* ### train.csv:
* The dataset to model (Labeled)
* ### test.csv:
* The dataset to test your model on (Unlabeled)
* ### Titanic.ipynb:
* The Jupyter Notebook for the project
* ### Titanic.html:
* The project's report
* ### Predictions.csv:
* The predictions the model made on the unlabeled test data
* ### scenarios.png:
* The possible scenarios to clean the data