https://github.com/Fedesgh/Asteorid_RandomForest_Classifier
Classifier model trained with unbalanced dataset ready for deployment
imblearn pandas pickle seaborn sklearn
Last synced: 9 months ago
- Host: GitHub
- URL: https://github.com/Fedesgh/Asteorid_RandomForest_Classifier
- Owner: Fedesgh
- License: apache-2.0
- Created: 2024-08-31T17:24:46.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2024-10-24T22:44:37.000Z (over 1 year ago)
- Last Synced: 2024-11-22T05:40:25.261Z (over 1 year ago)
- Topics: imblearn, pandas, pickle, seaborn, sklearn
- Language: Jupyter Notebook
- Homepage:
- Size: 13.8 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
## The project
The goal of the project is to build a classifier with **high recall** and pickle it so that it is ready for deployment.
The dataset was downloaded from Kaggle: https://www.kaggle.com/datasets/ivansher/nasa-nearest-earth-objects-1910-2024
We need a predictive model that can detect hazardous asteroids. Given the nature of the problem, we are interested in **recall**: in other words, we prefer false alarms over failing to detect a dangerous asteroid.
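
To make the trade-off concrete, here is a minimal sketch of how recall and precision are computed from confusion-matrix counts. The two sets of counts are hypothetical and chosen only for illustration; they are not results from this project.

```python
def recall(true_positives: int, false_negatives: int) -> float:
    """Fraction of truly hazardous asteroids that the model flags."""
    actual_positives = true_positives + false_negatives
    if actual_positives == 0:
        raise ValueError("recall is undefined without positive samples")
    return true_positives / actual_positives


def precision(true_positives: int, false_positives: int) -> float:
    """Fraction of flagged asteroids that are truly hazardous."""
    predicted_positives = true_positives + false_positives
    if predicted_positives == 0:
        raise ValueError("precision is undefined without positive predictions")
    return true_positives / predicted_positives


# Hypothetical counts (assumption, illustration only).
# Model A raises many false alarms but misses few hazardous asteroids.
model_a = {"tp": 90, "fn": 10, "fp": 60}
# Model B raises few false alarms but misses many hazardous asteroids.
model_b = {"tp": 60, "fn": 40, "fp": 5}

for name, counts in (("A", model_a), ("B", model_b)):
    r = recall(counts["tp"], counts["fn"])
    p = precision(counts["tp"], counts["fp"])
    print(f"model {name}: recall={r:.2f} precision={p:.2f}")
```

Under the preference stated above, model A would be the better choice despite its lower precision, because it misses fewer hazardous asteroids.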
## Models
Using **GridSearchCV**, we train several models: **KNeighbors**, **SVC**, **RandomForest**, and **VotingClassifier**.
We also use **SMOTETomek** because the data is imbalanced: **only a fraction of 0.127 (about 12.7%) of the rows are hazardous asteroids**, out of 338166 rows in total.

## The best model
Our best model is **clf4**, which is a **RandomForest** with a recall of 0.70 and a ROC AUC of 0.82.
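
As a rough sketch of the deployment step, a fitted model can be pickled, loaded back, and then scored on recall and ROC AUC. The synthetic data, the model settings, and the file name `model.pkl` are assumptions for illustration, so the printed scores will not match the figures reported above.

```python
import pickle
import tempfile
from pathlib import Path

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import recall_score, roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the asteroid data (assumption).
X, y = make_classification(
    n_samples=1000,
    n_features=5,
    n_informative=3,
    n_redundant=0,
    weights=[0.87, 0.13],
    random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)

model = RandomForestClassifier(n_estimators=50, random_state=0)
model.fit(X_train, y_train)

# Serialize the fitted model; the file name is hypothetical.
model_path = Path(tempfile.mkdtemp()) / "model.pkl"
with open(model_path, "wb") as f:
    pickle.dump(model, f)

# Load it back, as a deployment service would.
with open(model_path, "rb") as f:
    loaded = pickle.load(f)

# Recall uses hard predictions; ROC AUC uses the positive-class probability.
recall = recall_score(y_test, loaded.predict(X_test))
roc_auc = roc_auc_score(y_test, loaded.predict_proba(X_test)[:, 1])
print(f"recall={recall:.2f} roc_auc={roc_auc:.2f}")
```

Pickle files should only be loaded from trusted sources, and the model should be loaded with the same scikit-learn version that was used to save it.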
