Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/soumyaco/spaceship-titanic
Famous Kaggle competition solution notebook with step by step guide.
data-analysis-python data-science kaggle-competition machine-learning python3 spaceship-titanic
Last synced: about 1 month ago
- Host: GitHub
- URL: https://github.com/soumyaco/spaceship-titanic
- Owner: SoumyaCO
- License: mit
- Created: 2023-08-18T17:02:22.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2023-08-20T16:07:09.000Z (over 1 year ago)
- Last Synced: 2024-10-25T11:50:01.537Z (3 months ago)
- Topics: data-analysis-python, data-science, kaggle-competition, machine-learning, python3, spaceship-titanic
- Language: Jupyter Notebook
- Homepage: https://www.kaggle.com/code/soumyadipbhat/spaceship-titanic-solved-notebook-detailed
- Size: 2.45 MB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Spaceship-Titanic Kaggle Competition Notebook
## Famous Kaggle Competition Solution Notebook with step by step guide
This repository contains the solved notebook of the competition.
This will help beginners to understand the workflow of a machine learning problem.
Machine learning is not just about learning the algorithms; it consists of several crucial steps:
> * Data Preparation
> * Data Scaling
> * Dimensionality Reduction
> * Model selection
> * Training and testing the model
> * Hyperparameter tuning [*Grid Search*, *Random Search*]
> * Evaluation [`confusion_matrix`, `f1_score`, `precision`]
> * Deployment [out of the scope of this repository]

### Problem Description
You can read the whole problem description on Kaggle: [Kaggle | Spaceship Titanic Competition](https://www.kaggle.com/competitions/spaceship-titanic/overview)
Given a set of features (passenger details), we have to predict whether a passenger was transported or not.
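
Since the task comes down to predicting a yes/no target from passenger features, the workflow steps listed earlier can be sketched end to end. The following is a minimal illustration using scikit-learn, not the notebook's actual code: the data is synthetic (generated with `make_classification` as a stand-in for the prepared competition data), and the choice of `LogisticRegression` and the parameter grid are assumptions made only for this example.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix, f1_score, precision_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the prepared data (assumption: the real notebook
# works with the Kaggle CSV files instead).
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# Scaling -> dimensionality reduction -> model, chained so that every step
# is fitted on the training folds only during cross-validation.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("pca", PCA()),
    ("model", LogisticRegression(max_iter=1000)),
])

# Hyperparameter tuning with a small, purely illustrative grid.
param_grid = {
    "pca__n_components": [3, 5, 8],
    "model__C": [0.1, 1.0, 10.0],
}
search = GridSearchCV(pipeline, param_grid, cv=5, scoring="f1")
search.fit(X_train, y_train)

# Evaluation on the held-out split.
y_pred = search.predict(X_test)
cm = confusion_matrix(y_test, y_pred)
f1 = f1_score(y_test, y_pred)
precision = precision_score(y_test, y_pred)

print("Best parameters:", search.best_params_)
print("Confusion matrix:")
print(cm)
print(f"F1: {f1:.3f}, precision: {precision:.3f}")
```

Swapping `GridSearchCV` for `RandomizedSearchCV` would give the random-search variant of the tuning step.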
This is a binary classification problem that requires advanced feature engineering skills.

**Challenges:**
* The data contains categorical features, which hurt performance if we just apply `LabelEncoder` to them.
* The variance of the features is very uneven, so scaling the data is necessary.
* Irrelevant columns such as `PassengerId` have to be removed.
* Other challenges: intermediate to advanced feature engineering skills are needed.

### Edit and Experiment with the Notebook
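
A possible starting point for experiments is the set of preprocessing challenges noted above. The sketch below is an illustration under stated assumptions, not the notebook's own code: the small DataFrame is invented for the example, and the column names other than `PassengerId` are modelled on the competition data. It drops the identifier column, one-hot encodes a categorical feature instead of label-encoding it, and standardizes the numeric features.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Tiny made-up sample (assumption: values are invented for illustration).
df = pd.DataFrame({
    "PassengerId": ["0001_01", "0002_01", "0003_01", "0004_01"],
    "HomePlanet": ["Earth", "Europa", "Mars", "Earth"],
    "Age": [24.0, 58.0, 33.0, 16.0],
    "RoomService": [0.0, 43.0, 1200.0, 303.0],
})

# An identifier carries no predictive signal, so it is removed first.
features = df.drop(columns=["PassengerId"])

preprocess = ColumnTransformer(
    [
        # One-hot encoding avoids implying an order between categories,
        # which integer label codes would do.
        ("onehot", OneHotEncoder(handle_unknown="ignore"), ["HomePlanet"]),
        # Standardization puts features with very different variances
        # (Age vs. RoomService) on a comparable scale.
        ("scale", StandardScaler(), ["Age", "RoomService"]),
    ],
    sparse_threshold=0,  # always return a dense array
)

X = preprocess.fit_transform(features)
print(X.shape)
```

The fitted `preprocess` object can be placed as the first step of a scikit-learn `Pipeline` so that the same transformations are applied to the test data.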
* Click on the notebook `spaceship-titanic.ipynb` and then click the `open in kaggle` button.
* Or, if you want to open it in Google Colab, click the `open in colab` button at the top of the notebook or at the top of this README file.

### Warning
When opening the notebook in Colab, the dataset has to be downloaded and then uploaded to Google Colab manually.

### Welcome Contributors
* Found a mistake?
* Improved the accuracy of the model?
* Any suggestions related to the notebook or the workflow?
* Any other type of contribution will be appreciated.

In the notebook I've provided detailed code and explanations of the concepts. If you like it, please give it a star.

My Profiles:
> * [LinkedIn](https://www.linkedin.com/in/soumyadip-bhattacharjya-993974234/)
> * [Kaggle](https://www.kaggle.com/soumyadipbhat)