Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/odeyiany2/ecx-4.0-21-days-data-science-challenge
- Host: GitHub
- URL: https://github.com/odeyiany2/ecx-4.0-21-days-data-science-challenge
- Owner: Odeyiany2
- Created: 2024-04-15T17:37:41.000Z (7 months ago)
- Default Branch: main
- Last Pushed: 2024-05-02T07:08:46.000Z (7 months ago)
- Last Synced: 2024-05-02T13:25:30.962Z (7 months ago)
- Language: Jupyter Notebook
- Size: 3.45 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
![pic](https://github.com/Odeyiany2/ECX-4.0-21-Days-Data-Science-Challenge/blob/main/dataset-cover.jpg)
# ECX 4.0 21 Days Data Science Challenge: Iris Classification
This challenge, organized by the Engineering Career Expo Unilag, tested our data science skills over 21 days.
In this challenge, I worked with the Iris dataset and built a model that predicts the species of an Iris flower from its measurements.

## Tools
The following tools were used for different areas of the project:
* Python Libraries:
- `Pandas`: for data analysis and manipulation
- `Seaborn`: a matplotlib-based library that provides a high-level interface for data visualization
- `matplotlib`: for data visualization
- `Joblib`: for saving our model for deployment
* Scikit Learn (Python Machine Learning Library):
- `GridSearchCV` and `RandomizedSearchCV`: for hyperparameter tuning
- `StandardScaler`: for standardization of numeric features
- `LabelEncoder`: for encoding our categorical features
- `RandomForestClassifier`, `SVC`, `LogisticRegression`, `DecisionTreeClassifier`: ML algorithms for classification problems
* Evaluation Metrics:
- `Accuracy Score`: the number of correctly predicted samples over the total number of samples
- `Precision`: the ratio of correctly predicted positives (true positives) to all samples predicted as positive
- `Recall`: the ratio of correctly predicted positives (true positives) to all samples that are actually positive
- `Classification report`: a report showing precision, recall, and F1 score for each class
- `ROC Curve`: a plot of the true positive rate (TPR) against the false positive rate (FPR) at different classification thresholds
- `Confusion matrix`: a table of actual versus predicted classes for assessing the quality of our classification model's predictions
* Deployment: `Streamlit`
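
The metric definitions listed above can be made concrete with a small plain-Python sketch. It is not the project's code: the helper name `per_class_metrics` and the label lists are made up for illustration, and in practice scikit-learn's metric functions would do this work. For a multi-class problem like Iris, each class is treated in turn as the "positive" class.

```python
from collections import Counter


def per_class_metrics(y_true, y_pred):
    """Return overall accuracy plus per-class precision and recall."""
    labels = sorted(set(y_true) | set(y_pred))
    # Confusion counts keyed by (actual, predicted) pairs.
    confusion = Counter(zip(y_true, y_pred))
    accuracy = sum(confusion[(c, c)] for c in labels) / len(y_true)

    report = {}
    for c in labels:
        true_positives = confusion[(c, c)]
        predicted_as_c = sum(confusion[(a, c)] for a in labels)
        actually_c = sum(confusion[(c, p)] for p in labels)
        report[c] = {
            "precision": true_positives / predicted_as_c if predicted_as_c else 0.0,
            "recall": true_positives / actually_c if actually_c else 0.0,
        }
    return accuracy, report


# Made-up labels for illustration only (not results from the project).
y_true = ["setosa", "setosa", "versicolor", "versicolor", "virginica", "virginica"]
y_pred = ["setosa", "setosa", "versicolor", "virginica", "virginica", "virginica"]

accuracy, report = per_class_metrics(y_true, y_pred)
print(accuracy)
print(report)
```

Here one versicolor sample is misclassified as virginica, which lowers recall for versicolor and precision for virginica while leaving setosa unaffected.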
[Deployment Video](https://drive.google.com/file/d/1USr4u_sPX2mlkSUDhbb4Ctf8C_pTy3kT/view?usp=drive_link) | [Streamlit App](https://iris-flower-specie-classifier.streamlit.app/)
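
As a rough illustration of how some of the tools listed above fit together, the sketch below scales the features with `StandardScaler`, tunes an `SVC` with `GridSearchCV`, and prints a classification report. It is a minimal sketch under stated assumptions, not the notebook's actual code: it loads the copy of Iris bundled with scikit-learn, and the train/test split, parameter grid, and choice of `SVC` are all hypothetical.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Assumption: use scikit-learn's bundled Iris data; the project itself
# may have read the data from a file instead.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Scaling sits inside the pipeline so it is refit on every CV fold.
pipeline = Pipeline([("scaler", StandardScaler()), ("clf", SVC())])

# Hypothetical search space, chosen only for illustration.
param_grid = {"clf__C": [0.1, 1, 10], "clf__kernel": ["linear", "rbf"]}

search = GridSearchCV(pipeline, param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

# Evaluate the best estimator found by the search on held-out data.
y_pred = search.predict(X_test)
test_accuracy = accuracy_score(y_test, y_pred)
print(search.best_params_)
print(classification_report(y_test, y_pred))
```

The bundled dataset already has integer targets, so no `LabelEncoder` step appears here. The fitted `search.best_estimator_` could then be saved with `joblib.dump` so that an app can load it for predictions.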