https://github.com/francescocoding/mushroom-supervised-machine-learning-classification
🍄 The project involves using a supervised machine learning algorithm to classify mushroom samples as edible or poisonous. The dataset used includes various features such as cap shape and odor, and the models implemented include Logistic Regression, Decision Trees, and Random Forest.
classification decision-tree linear-regression machine-learning random-forest
Last synced: 4 months ago
- Host: GitHub
- URL: https://github.com/francescocoding/mushroom-supervised-machine-learning-classification
- Owner: FrancescoCoding
- Created: 2023-01-26T13:05:03.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2023-01-26T13:14:44.000Z (over 2 years ago)
- Last Synced: 2025-01-04T04:13:29.968Z (6 months ago)
- Topics: classification, decision-tree, linear-regression, machine-learning, random-forest
- Language: Jupyter Notebook
- Homepage:
- Size: 857 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# 🍄 Mushroom Dataset Supervised Machine Learning Classification
The project uses supervised machine learning 🤖 algorithms to classify mushroom samples as edible or poisonous. The dataset includes features such as cap shape and odor, and the models implemented are Logistic Regression, Decision Trees, and Random Forest. The aim is to accurately predict the class label of a mushroom based on its characteristics.
This machine learning project consisted of four tasks: Research and Data Exploration, Data Pre-processing, Modelling/Classification, and Solution Improvement.
## 🔍 Data Exploration
For the Research and Data Exploration task, I explored the dataset I had chosen and performed an initial analysis of its features and class labels. I also provided a brief background of the problem that the dataset aims to solve and the source of the data.
## 🧮 Data Pre-processing
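Pre-processing decisions build on the exploration described in the previous section. As a minimal sketch of that first look at features and class labels (not the notebook's actual code), the snippet below summarises a tiny hand-made table. The column names, the values, the `?` placeholder for missing entries, and the helper `summarize` are all assumptions made for illustration.

```python
import pandas as pd

# Toy stand-in for the real data: names and values are illustrative only.
df = pd.DataFrame({
    "cap_shape": ["x", "b", "x", "f", "b", "x"],
    "odor": ["p", "a", "?", "n", "a", "p"],
    "label": ["p", "e", "p", "e", "e", "p"],  # e = edible, p = poisonous
})


def summarize(frame, target="label", placeholder="?"):
    """Return the class distribution and a per-feature overview table."""
    # Share of each class, useful for spotting imbalance early.
    class_share = frame[target].value_counts(normalize=True)
    features = frame.drop(columns=[target])
    overview = pd.DataFrame({
        "n_unique": features.nunique(),
        "most_common": features.mode().iloc[0],
        # Count cells holding the (assumed) missing-value placeholder.
        "n_placeholder": (features == placeholder).sum(),
    })
    return class_share, overview


class_share, overview = summarize(df)
print(class_share)
print(overview)
```

A summary like this is a cheap way to decide which columns need imputation and whether the classes are balanced enough to skip resampling.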
In the Data Pre-processing task, I applied various pre-processing techniques to the dataset. I checked for missing values and handled them accordingly. I also standardized/normalized the data to reduce noise, generated new features, dropped irrelevant features, and performed feature selection/engineering. In addition, I handled any imbalanced classes by using sampling methods.
## 📰 Data Classification
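Before any model is fitted, the pre-processing steps from the previous section have to be applied. The sketch below covers only three of them (handling missing values, dropping uninformative features, and encoding categorical features) on a hand-made toy table. It is an illustration under stated assumptions, not the notebook's code: the column names, the `?` missing-value marker, the mode imputation strategy, and the helper `preprocess` are all hypothetical.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the real data: names and values are illustrative only.
df = pd.DataFrame({
    "cap_shape": ["x", "b", "x", "f", "b", "x"],
    "odor": ["p", "a", "?", "n", "a", "p"],
    "veil_type": ["p", "p", "p", "p", "p", "p"],  # constant, carries no signal
    "label": ["p", "e", "p", "e", "e", "p"],      # e = edible, p = poisonous
})


def preprocess(frame, target="label", missing_marker="?"):
    """Return (X, y): one-hot encoded features and a 0/1 target."""
    # Turn the assumed placeholder into a real missing value.
    frame = frame.replace(missing_marker, np.nan)
    features = frame.drop(columns=[target])
    # Impute each gap with the most frequent category of its column.
    features = features.fillna(features.mode().iloc[0])
    # Drop features that take a single value across all rows.
    constant = [c for c in features.columns if features[c].nunique() <= 1]
    features = features.drop(columns=constant)
    # One-hot encode the remaining categorical features.
    X = pd.get_dummies(features, dtype=int)
    # Encode the target as 1 = poisonous, 0 = edible.
    y = (frame[target] == "p").astype(int)
    return X, y


X, y = preprocess(df)
print(X.columns.tolist())
```

Mode imputation is only one option; treating the missing marker as its own category is an equally reasonable choice for categorical data.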
For the Modelling/Classification task, I built a classification model to classify/predict the class labels in my dataset. I used a variety of models such as Logistic Regression, Decision Tree, and Random Forest. I justified my choice of method and wrote, explained, and commented on the code I produced. I divided the dataset into training and testing subsets, built a model using the training set, tested and evaluated my model, and reported and discussed the results.
## 🔦 Solution improvement (hyperparameter tuning)
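Any improvement needs a baseline to compare against. As a rough sketch of the modelling workflow from the previous section (split, fit, evaluate), the snippet below trains the three model families on synthetic data using scikit-learn. The synthetic data from `make_classification`, the 80/20 split, and all hyperparameter values are assumptions chosen for illustration; the scores it prints say nothing about the mushroom dataset.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the encoded mushroom features and labels.
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=5, random_state=42
)

# Hold out 20% for testing; stratify to keep the class ratio in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=42),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)           # learn from the training subset only
    predictions = model.predict(X_test)   # evaluate on unseen samples
    results[name] = {
        "accuracy": accuracy_score(y_test, predictions),
        "f1": f1_score(y_test, predictions),
    }
    print(f"{name}: {results[name]}")
```

Reporting F1 alongside accuracy matters for this kind of problem, since misclassifying a poisonous sample as edible is costlier than the reverse.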
In the Solution Improvement task, I aimed to improve the performance of my model by fine-tuning the hyperparameters, using different metrics for evaluation, trying different models, changing the partitioning of the dataset, and using cross-validation. I discussed, explained, and justified my approach and attempted to improve model performance.

Finally, I presented my work in the Jupyter Notebook provided, structuring it into sections and subsections to reflect the different parts of the project. All code was explained and methods were discussed and justified. The results presented in the notebook are reproducible: re-running the code should produce similar results.
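The tuning and cross-validation step can be sketched with scikit-learn's `GridSearchCV`. This is a minimal example under stated assumptions, not the notebook's code: the data is synthetic, and the parameter grid, the F1 scoring metric, and the 5-fold stratified split are hypothetical choices.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split

# Synthetic stand-in for the encoded mushroom features and labels.
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Hypothetical grid: 2 x 2 = 4 candidate settings.
param_grid = {"n_estimators": [25, 50], "max_depth": [3, None]}

# Stratified folds keep the class ratio stable across validation splits.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    scoring="f1",
    cv=cv,
)
search.fit(X_train, y_train)  # cross-validates each candidate, then refits the best

print("best params:", search.best_params_)
print("mean CV F1 of best candidate:", search.best_score_)

# The held-out test set is touched only once, after tuning is finished.
test_f1 = search.score(X_test, y_test)
print("test F1:", test_f1)
```

Fixing `random_state` everywhere is what makes a run like this repeatable, which is the kind of reproducibility the notebook aims for.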