https://github.com/arnabushna24/codealpha_iris_flower_classification
Classification of Iris Flowers
flower-classification machine-learning python3
Last synced: about 1 year ago
- Host: GitHub
- URL: https://github.com/arnabushna24/codealpha_iris_flower_classification
- Owner: ArnabUshna24
- Created: 2025-05-17T14:04:07.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-05-20T04:36:35.000Z (about 1 year ago)
- Last Synced: 2025-06-08T21:44:41.510Z (about 1 year ago)
- Topics: flower-classification, machine-learning, python3
- Language: Jupyter Notebook
- Homepage: https://www.kaggle.com/datasets/saurabh00007/iriscsv
- Size: 536 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Iris Flower Classifier
## Overview
The Iris dataset covers three species: Setosa, Versicolor, and Virginica. They differ in four measurements: sepal length, sepal width, petal length, and petal width. The goal of this project is to train an ML model on the given dataset so that it can predict the species of an Iris flower from these measurements.
## Data Extraction
The dataset is available on [Kaggle](https://www.kaggle.com/datasets/saurabh00007/iriscsv) and consists of 150 Iris flower measurements. It has 6 columns: `Id`, `SepalLengthCm`, `SepalWidthCm`, `PetalLengthCm`, `PetalWidthCm`, and `Species`. After importing the necessary libraries, Google Drive was mounted to access the dataset. As there were no missing values, no data cleaning was required.
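
The loading and missing-value check described above could look roughly like the sketch below. It assumes pandas was used, which the README does not state. To stay runnable without a download, it reads a few illustrative rows in the same six-column layout from an in-memory string. With the real file you would pass its path to `pd.read_csv` instead (the name `Iris.csv` in the comment is an assumption).

```python
import io

import pandas as pd

# A few illustrative rows in the dataset's column layout (not the full 150 rows).
SAMPLE_CSV = """Id,SepalLengthCm,SepalWidthCm,PetalLengthCm,PetalWidthCm,Species
1,5.1,3.5,1.4,0.2,Iris-setosa
2,4.9,3.0,1.4,0.2,Iris-setosa
51,7.0,3.2,4.7,1.4,Iris-versicolor
52,6.4,3.2,4.5,1.5,Iris-versicolor
101,6.3,3.3,6.0,2.5,Iris-virginica
102,5.8,2.7,5.1,1.9,Iris-virginica
"""

# With the real dataset: df = pd.read_csv("Iris.csv")  # hypothetical path
df = pd.read_csv(io.StringIO(SAMPLE_CSV))

# Count missing values per column; all zeros means no imputation is needed.
missing_per_column = df.isnull().sum()

# Count samples per species to check class balance.
species_counts = df["Species"].value_counts()

print(df.shape)
print(missing_per_column)
print(species_counts)
```
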
Fig. 1: Distribution of Iris species
Fig. 2: Distribution of Iris features
## Feature and Target Preparation
After importing `LabelEncoder` from the `sklearn.preprocessing` package, the `Species` column was encoded into numeric labels. All other columns except `Id` were selected as features. The data was then split into training (80%) and testing (20%) sets, and the features were scaled.
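
A minimal sketch of these preparation steps follows. Several details are assumptions not confirmed by the README: scikit-learn's bundled `load_iris` stands in for the Kaggle CSV, `StandardScaler` is used as the scaler, the split is stratified, and `random_state=42` is an arbitrary seed.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Stand-in for the Kaggle CSV: same 150 samples and four measurements.
iris = load_iris()
X = iris.data

# Rebuild string labels like those in the CSV's Species column.
species_names = ["Iris-" + iris.target_names[i] for i in iris.target]

# Encode species names as integers (classes are sorted alphabetically).
encoder = LabelEncoder()
y = encoder.fit_transform(species_names)

# 80/20 split; stratify keeps the three classes balanced (assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Fit the scaler on the training set only, then apply it to both sets.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

print(X_train_scaled.shape, X_test_scaled.shape)
print(list(encoder.classes_))
```

Fitting the scaler on the training set alone keeps test-set statistics out of training. The README does not say whether scaling was done before or after the split.
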
## Model Training and Evaluation
For this project, `RandomForestClassifier` was used.
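
Training and evaluating such a classifier might look like the sketch below. It is not the notebook's actual code: `load_iris` again stands in for the Kaggle CSV, and `n_estimators=100`, the stratified split, and `random_state=42` are assumed settings. Scaling is skipped here because tree-based models are insensitive to feature scale. The exact scores depend on the split and seed, so they will not necessarily match the tables in this README.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, stratify=iris.target, random_state=42
)

# Hyperparameters are assumptions; the README does not list them.
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Predict on the held-out 20% and summarise per-class metrics.
y_pred = model.predict(X_test)
accuracy = accuracy_score(y_test, y_pred)
print(f"accuracy: {accuracy:.2f}")
print(classification_report(y_test, y_pred, target_names=iris.target_names))

# Impurity-based importance of each feature; the values sum to 1.
importances = dict(zip(iris.feature_names, model.feature_importances_))
for name, value in sorted(importances.items(), key=lambda item: -item[1]):
    print(f"{name}: {value:.3f}")
```
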
Table 1: Class-wise Performance Metrics

| Species | Precision | Recall | F1-score | Support | Interpretation |
| --- | --- | --- | --- | --- | --- |
| Iris-setosa | 1.00 | 1.00 | 1.00 | 10 | All samples were predicted correctly. |
| Iris-versicolor | 0.82 | 0.90 | 0.86 | 10 | 82% of the samples predicted as versicolor were correct, and 90% of all versicolor samples were correctly identified. |
| Iris-virginica | 0.89 | 0.80 | 0.84 | 10 | 89% of the samples predicted as virginica were correct, and 80% of all virginica samples were correctly identified. |

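
The class-wise figures can be reproduced by hand from a confusion matrix. The matrix in the sketch below is inferred from the reported precision, recall, and support values, not read from the notebook, so treat it as an assumption. Precision is the share of predictions for a class that are correct, recall is the share of true samples of a class that are found, and F1 is their harmonic mean.

```python
# Confusion matrix inferred from the reported metrics (assumption).
# Rows are true classes, columns are predicted classes.
labels = ["Iris-setosa", "Iris-versicolor", "Iris-virginica"]
confusion = [
    [10, 0, 0],
    [0, 9, 1],
    [0, 2, 8],
]


def class_metrics(matrix, index):
    """Return (precision, recall, f1, support) for the class at `index`."""
    true_positive = matrix[index][index]
    predicted = sum(row[index] for row in matrix)  # column total
    actual = sum(matrix[index])                    # row total (support)
    precision = true_positive / predicted if predicted else 0.0
    recall = true_positive / actual if actual else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1, actual


report = {label: class_metrics(confusion, i) for i, label in enumerate(labels)}
for label, (precision, recall, f1, support) in report.items():
    print(f"{label:16s} precision={precision:.2f} recall={recall:.2f} "
          f"f1={f1:.2f} support={support}")
```
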
Table 2: Overall Model Performance Metrics

| Metric | Precision | Recall | F1-score | Support | Interpretation |
| --- | --- | --- | --- | --- | --- |
| Accuracy | - | - | 0.90 | 30 | Overall, 90% of all predictions were correct. |
| Macro Average | 0.90 | 0.90 | 0.90 | 30 | Simple (unweighted) average of metrics across all classes, which treats all classes equally regardless of their support. |
| Weighted Average | 0.90 | 0.90 | 0.90 | 30 | Average of metrics weighted by the number of true samples per class. |

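
The overall figures follow arithmetically from the class-wise ones. The sketch below uses the rounded per-class values reported above (helper names are illustrative). Because every class has the same support of 10, the macro and weighted averages coincide. They would differ on an imbalanced test set.

```python
# Per-class (precision, recall, f1, support), taken from the class-wise table.
per_class = {
    "Iris-setosa": (1.00, 1.00, 1.00, 10),
    "Iris-versicolor": (0.82, 0.90, 0.86, 10),
    "Iris-virginica": (0.89, 0.80, 0.84, 10),
}
rows = list(per_class.values())
total_support = sum(row[3] for row in rows)


def macro_average(rows, column):
    """Unweighted mean of one metric across classes."""
    return sum(row[column] for row in rows) / len(rows)


def weighted_average(rows, column):
    """Mean of one metric weighted by each class's support."""
    return sum(row[column] * row[3] for row in rows) / sum(row[3] for row in rows)


macro = [macro_average(rows, column) for column in range(3)]
weighted = [weighted_average(rows, column) for column in range(3)]

# Accuracy = correctly classified samples / all samples,
# where correct samples per class = recall * support.
accuracy = sum(row[1] * row[3] for row in rows) / total_support

print("macro:   ", [round(value, 2) for value in macro])
print("weighted:", [round(value, 2) for value in weighted])
print("accuracy:", round(accuracy, 2))
```
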
Fig. 3: Confusion matrix
Fig. 4: Feature importance from Random Forest (RF)
Fig. 5: Class cluster for petal features
If you have any queries, contact me: arnabnushna24@gmail.com