https://github.com/arnabushna24/codealpha_iris_flower_classification

Classification of Iris Flowers
flower-classification machine-learning python3


# Iris Flower Classifier

## Overview
The dataset used here covers three Iris species: Setosa, Versicolor, and Virginica. The species differ in four measurements: sepal length, sepal width, petal length, and petal width. The goal of this project is to train an ML model on the dataset so that it can predict the species of an Iris flower from these measurements.

## Data Extraction
The dataset is available on [Kaggle](https://www.kaggle.com/datasets/saurabh00007/iriscsv) and contains measurements for 150 Iris flowers. It has 6 columns: `Id`, `SepalLengthCm`, `SepalWidthCm`, `PetalLengthCm`, `PetalWidthCm`, and `Species`. After importing the necessary libraries, Google Drive was mounted to access the dataset. The data contained no missing values, so no missing-value handling was required.
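The missing-value check described above can be sketched as follows. This is a minimal illustration, not the project's notebook code: it assumes pandas, the helper name `load_iris_csv` is hypothetical, and three sample rows in the dataset's column layout stand in for the real file so the snippet runs anywhere. With the downloaded file, a path such as `"Iris.csv"` (an assumed file name) would be passed instead.

```python
import io

import pandas as pd

# Three sample rows in the dataset's column layout (stand-in for the real CSV file).
SAMPLE_CSV = """Id,SepalLengthCm,SepalWidthCm,PetalLengthCm,PetalWidthCm,Species
1,5.1,3.5,1.4,0.2,Iris-setosa
51,7.0,3.2,4.7,1.4,Iris-versicolor
101,6.3,3.3,6.0,2.5,Iris-virginica
"""


def load_iris_csv(source):
    """Read the Iris CSV from a path or file-like object (hypothetical helper)."""
    return pd.read_csv(source)


# With the real file this would be e.g. load_iris_csv("Iris.csv") (assumed file name).
df = load_iris_csv(io.StringIO(SAMPLE_CSV))

missing_per_column = df.isnull().sum()          # missing values in each column
total_missing = int(missing_per_column.sum())   # missing values in the whole table

print(df.shape)
print(missing_per_column)
print("Missing-value handling needed:", total_missing > 0)
```

If `total_missing` is zero, as the README reports for the full dataset, the table can go straight to feature and target preparation.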


Fig. 1: Distribution of Iris species

Fig. 2: Distribution of Iris features

## Feature and Target Preparation
After importing `LabelEncoder` from the `sklearn.preprocessing` package, the `Species` column was encoded into numeric labels. The remaining columns (except `Id`) were selected as features. The data was then split into training (80%) and testing (20%) sets, and the features were scaled.
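The preparation steps above might look roughly like the sketch below. Several details are assumptions, since the README does not state them: `StandardScaler` as the scaler, a stratified split, and `random_state=42`. To keep the snippet self-contained, scikit-learn's bundled copy of the Iris data stands in for the Kaggle CSV and is reshaped to the same column names.

```python
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Stand-in for the Kaggle CSV: scikit-learn's bundled Iris data with the same column names.
iris = load_iris()
feature_cols = ["SepalLengthCm", "SepalWidthCm", "PetalLengthCm", "PetalWidthCm"]
df = pd.DataFrame(iris.data, columns=feature_cols)
df["Species"] = [f"Iris-{iris.target_names[i]}" for i in iris.target]
df.insert(0, "Id", list(range(1, len(df) + 1)))

# Encode the species names as integers (classes are sorted alphabetically).
encoder = LabelEncoder()
y = encoder.fit_transform(df["Species"])

# Use every column except Id and Species as a feature.
X = df[feature_cols]

# 80/20 split; stratify and random_state are assumptions, not taken from the README.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Assumed scaler: fit on the training set only, then reuse its statistics on the test set.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

print(encoder.classes_)
print(X_train_scaled.shape, X_test_scaled.shape)
```

Fitting the scaler on the training portion only keeps test-set statistics out of training. A stratified split is consistent with the equal per-class support of 10 reported in Table 1, but the README does not confirm it.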

## Model Training and Evaluation
For this project, a `RandomForestClassifier` was trained on the training set and evaluated on the 30 held-out test samples.
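A minimal sketch of training and evaluating such a classifier is shown here. The hyperparameters (`n_estimators=100`, `random_state=42`), the stratified split, and `StandardScaler` are assumptions, and scikit-learn's bundled Iris data again stands in for the Kaggle CSV. Exact scores depend on the split and seed, so they will not necessarily match the tables in this README.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42  # assumed split settings
)

# Scale with statistics learned from the training set only.
scaler = StandardScaler().fit(X_train)

# Assumed hyperparameters; the README does not list the ones actually used.
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(scaler.transform(X_train), y_train)

# Predict the held-out samples and print per-class precision, recall, and F1-score.
y_pred = model.predict(scaler.transform(X_test))
report = classification_report(
    y_test, y_pred,
    target_names=["Iris-setosa", "Iris-versicolor", "Iris-virginica"],
)
print(report)
```

`classification_report` prints a per-class block plus accuracy, macro average, and weighted average rows, the same layout that Tables 1 and 2 follow.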

Table 1: Class-wise Performance Metrics

| Species | Precision | Recall | F1-score | Support | Interpretation |
|---|---|---|---|---|---|
| Iris-setosa | 1.00 | 1.00 | 1.00 | 10 | All the samples are correctly predicted. |
| Iris-versicolor | 0.82 | 0.90 | 0.86 | 10 | Most predicted as versicolor are correct (82%). Of all versicolor samples, 90% were correctly identified. |
| Iris-virginica | 0.89 | 0.80 | 0.84 | 10 | Most predicted as virginica are correct (89%). Of all virginica samples, 80% were correctly identified. |
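The class-wise figures can be checked by hand. The sketch below uses a confusion matrix inferred from the precision, recall, and support values in Table 1 (it is not read from the project's confusion matrix figure, so treat it as an assumption) and recomputes each metric in plain Python. The function name `class_metrics` is illustrative.

```python
# Confusion matrix inferred from Table 1 (assumption): rows = true class, columns = predicted class.
labels = ["Iris-setosa", "Iris-versicolor", "Iris-virginica"]
confusion = [
    [10, 0, 0],  # all 10 setosa samples predicted as setosa
    [0, 9, 1],   # 1 versicolor sample predicted as virginica
    [0, 2, 8],   # 2 virginica samples predicted as versicolor
]


def class_metrics(matrix, k):
    """Return (precision, recall, f1, support) for class index k."""
    true_positives = matrix[k][k]
    predicted_as_k = sum(row[k] for row in matrix)  # column sum
    actually_k = sum(matrix[k])                     # row sum, i.e. the support
    precision = true_positives / predicted_as_k
    recall = true_positives / actually_k
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1, actually_k


metrics = {label: class_metrics(confusion, k) for k, label in enumerate(labels)}
for label, (p, r, f1, support) in metrics.items():
    print(f"{label:<16} precision={p:.2f} recall={r:.2f} f1={f1:.2f} support={support}")
```

For versicolor, 9 of the 11 samples predicted as versicolor are correct (9/11 ≈ 0.82) and 9 of the 10 true versicolor samples are found (0.90), which agrees with the table.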

Table 2: Overall Model Performance Metrics

| Metric | Precision | Recall | F1-score | Support | Interpretation |
|---|---|---|---|---|---|
| Accuracy | - | - | 0.90 | 30 | Overall, 90% of all predictions were correct. |
| Macro Average | 0.90 | 0.90 | 0.90 | 30 | Simple (unweighted) average of metrics across all classes, which treats all classes equally regardless of their support. |
| Weighted Average | 0.90 | 0.90 | 0.90 | 30 | Average of metrics weighted by the number of true samples per class. |
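To make the difference between the two averages concrete, the sketch below recomputes them from the rounded per-class values in Table 1. The helper names `macro` and `weighted` are illustrative. Because the inputs are already rounded to two decimals, the results are only expected to agree with Table 2 after rounding.

```python
# Per-class rows copied from Table 1: (precision, recall, f1, support).
per_class = {
    "Iris-setosa": (1.00, 1.00, 1.00, 10),
    "Iris-versicolor": (0.82, 0.90, 0.86, 10),
    "Iris-virginica": (0.89, 0.80, 0.84, 10),
}

total_support = sum(row[3] for row in per_class.values())


def macro(i):
    """Unweighted mean of metric i (0=precision, 1=recall, 2=f1) over the classes."""
    return sum(row[i] for row in per_class.values()) / len(per_class)


def weighted(i):
    """Mean of metric i weighted by each class's support."""
    return sum(row[i] * row[3] for row in per_class.values()) / total_support


# recall * support is the number of correctly classified samples in a class.
accuracy = sum(row[1] * row[3] for row in per_class.values()) / total_support

print("macro   ", [round(macro(i), 2) for i in range(3)])
print("weighted", [round(weighted(i), 2) for i in range(3)])
print("accuracy", round(accuracy, 2))
```

With equal support in every class, as here, the macro and weighted averages coincide. They diverge only when some classes have more test samples than others.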


Fig. 3: Confusion matrix

Fig. 4: Feature importance from Random Forest (RF)

Fig. 5: Class cluster for petal features
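The numbers behind a confusion matrix plot and a feature importance plot can be obtained along these lines. This is a sketch under assumptions: scikit-learn's bundled Iris data, a stratified 80/20 split, and `random_state=42` are stand-ins for the project's actual setup, so the output need not match the figures. Scaling is left out for brevity, since tree splits do not depend on a monotonic rescaling of a feature.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

# Column names follow the Kaggle CSV; the order matches scikit-learn's bundled data.
feature_names = ["SepalLengthCm", "SepalWidthCm", "PetalLengthCm", "PetalWidthCm"]

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42  # assumed split settings
)

model = RandomForestClassifier(n_estimators=100, random_state=42).fit(X_train, y_train)

# Rows are true classes, columns are predicted classes.
cm = confusion_matrix(y_test, model.predict(X_test))
print(cm)

# Impurity-based importances from the fitted forest; they sum to 1 across the features.
ranked = sorted(
    zip(feature_names, model.feature_importances_),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, score in ranked:
    print(f"{name:<14} {score:.3f}")
```

The `cm` array and the `ranked` list are the raw values that a heatmap and a bar chart, respectively, would visualise.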

If you have any queries, contact me: arnabnushna24@gmail.com