https://github.com/ascender1729/iris-flower-classification-2024

An exploratory data analysis and machine learning project using the Iris dataset to classify flower species with a K-Nearest Neighbors classifier. It includes data visualization, feature scaling, model training, and evaluation with 100% accuracy on the test set.

classification data-analysis iris-dataset k-nearest-neighbors machine-learning matplotlib pandas python scikit-learn seaborn


# Iris Flower Classification Analysis

Iris Flower Classification Analysis is a machine learning project that uses the original Iris dataset and Bezdek's variant of it to predict Iris species with the K-Nearest Neighbors (KNN) algorithm. It covers data cleaning, visualization, and model evaluation.

## Table of Contents

- [Project Overview](#project-overview)
- [Features](#features)
- [Data Description](#data-description)
- [Installation](#installation)
- [Usage](#usage)
- [Contributing](#contributing)
- [License](#license)
- [Contact](#contact)

## Project Overview

This project explores the Iris flower dataset in detail: it checks data integrity, scales the features, and reduces dimensionality with PCA before classification. Plots of the reduced data help show how the features and species relate.
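
The scaling and PCA steps described above can be sketched as follows. This is a minimal illustration, not the notebook's actual code: it assumes scikit-learn's bundled copy of the Iris data (`load_iris`) in place of the project's data files, and the choice of two components is an assumption made so the result can be plotted in 2D.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Assumption: scikit-learn's bundled Iris data stands in for the project's files.
X, y = load_iris(return_X_y=True)

# Standardize each feature to zero mean and unit variance so that
# no single measurement dominates the principal components.
X_scaled = StandardScaler().fit_transform(X)

# Project the four scaled features onto two principal components
# (two is an assumed value, chosen for 2D plotting).
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X_scaled)

print("Reduced shape:", X_2d.shape)
print("Explained variance ratio:", pca.explained_variance_ratio_)
```

The two columns of `X_2d` can then be passed to a Seaborn or Matplotlib scatter plot, colored by `y`, to see how well the species separate in the reduced space.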

## Features

- *Data Integration*: Runs in Google Colab, with Google Drive mounted for data access.
- *Dual Dataset Analysis*: Analyzes both the original and Bezdek's Iris datasets to check that results are robust.
- *Advanced Data Handling*: Detects and removes duplicate entries.
- *Feature Scaling and PCA*: Standardizes features with StandardScaler and reduces dimensionality with PCA.
- *Enhanced Visualization*: Uses Seaborn and Matplotlib to plot the data in reduced dimensions.
- *Precision Modeling*: Applies a KNN model with tuned parameters.
- *Model Evaluation*: Evaluates the model on the test set, where the project description reports 100% accuracy.
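
As a rough sketch of how the data handling, modeling, and evaluation features fit together, the example below removes duplicates, tunes `n_neighbors` with cross-validation, and scores the result on a held-out split. Several things here are assumptions rather than the notebook's actual choices: the data comes from scikit-learn's `load_iris`, the candidate `k` values, the 80/20 split, and `random_state=42` are illustrative, and PCA is left out to keep the sketch short.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: scikit-learn's bundled Iris data stands in for the project's files.
df = load_iris(as_frame=True).frame

# Data handling: drop exact duplicate rows before modeling.
df = df.drop_duplicates().reset_index(drop=True)
X = df.drop(columns="target")
y = df["target"]

# Hold out 20% of the samples, keeping class proportions equal (assumed split).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Scaling lives inside the pipeline so it is fit on training folds only.
pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("knn", KNeighborsClassifier()),
])

# Hypothetical search grid for the number of neighbors.
k_values = [1, 3, 5, 7, 9, 11]
search = GridSearchCV(pipeline, {"knn__n_neighbors": k_values}, cv=5)
search.fit(X_train, y_train)

# Evaluate the refit best model on the held-out test set.
y_pred = search.predict(X_test)
test_accuracy = accuracy_score(y_test, y_pred)
print("Best k:", search.best_params_["knn__n_neighbors"])
print(f"Test accuracy: {test_accuracy:.3f}")
print(classification_report(y_test, y_pred))
```

Keeping the scaler inside the `Pipeline` avoids leaking test-set statistics into training, which matters when reporting accuracy on a dataset this small.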

## Data Description

Two Iris datasets are used, each containing 150 samples of Iris flowers with four features and a species label:
- Sepal Length
- Sepal Width
- Petal Length
- Petal Width
- Species (Iris-setosa, Iris-versicolor, Iris-virginica)
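
To get a feel for this structure, the snippet below builds a table with the same five columns and checks its size and class balance. It is only a sketch: it uses scikit-learn's bundled Iris data rather than the project's own files, and the column names are hypothetical, so the headers in the project's data may differ.

```python
from sklearn.datasets import load_iris

# Assumption: scikit-learn's bundled Iris data stands in for the project's files.
iris = load_iris(as_frame=True)
df = iris.frame.copy()

# Hypothetical column names mirroring the feature list above.
df.columns = ["sepal_length", "sepal_width", "petal_length", "petal_width", "target"]

# Map the integer class codes to species labels in the "Iris-<name>" style.
species_names = {i: f"Iris-{name}" for i, name in enumerate(iris.target_names)}
df["species"] = df["target"].map(species_names)
df = df.drop(columns="target")

print(df.shape)                      # rows and columns
print(df["species"].value_counts())  # samples per species
print(df.describe())                 # summary statistics for the four features
```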

## Installation

Setup for Google Colab (run in a notebook cell to mount Google Drive):

```python
from google.colab import drive
drive.mount('/content/drive')
```

Clone the repository and navigate to the project directory:

```bash
git clone https://github.com/ascender1729/iris-flower-classification-2024.git
cd iris-flower-classification-2024
```

## Usage

Install the required libraries:

```bash
pip install pandas numpy seaborn matplotlib scikit-learn
```

Open the Jupyter notebook in Google Colab and run the cells in order for a full walkthrough.

## Contributing

Contributions that extend the analysis or improve the existing methods are welcome.

## License

This project is licensed under the MIT License - see the `LICENSE` file for details.

## Contact

Pavan Kumar - pavankumard.pg19.ma@nitp.ac.in

LinkedIn: [@ascender1729](https://www.linkedin.com/in/im-pavankumar)

Project Link: [iris-flower-classification-2024](https://github.com/ascender1729/iris-flower-classification-2024)