https://github.com/amirjahantab/iris_classification

This project analyzes the famous Iris dataset using various machine learning techniques. The goal is to classify the iris flowers into three species: Setosa, Versicolor, and Virginica based on the features provided in the dataset.

classification data-science machine-learning scikit-learn


Host: GitHub
URL: https://github.com/amirjahantab/iris_classification
Owner: amirjahantab
Created: 2024-07-08T07:17:45.000Z (11 months ago)
Default Branch: master
Last Pushed: 2024-07-08T11:06:57.000Z (11 months ago)
Last Synced: 2025-01-11T20:15:23.992Z (4 months ago)
Topics: classification, data-science, machine-learning, scikit-learn
Language: Jupyter Notebook
Homepage:
Size: 57.6 KB
Stars: 1
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md


# Iris Classification Notebook

This Jupyter notebook classifies samples from the Iris dataset with two machine learning models: a k-nearest neighbors classifier and a multi-layer perceptron. The Iris dataset is a classic in machine learning and statistics and is often used to test algorithms.

## Introduction

The Iris dataset contains 150 samples of iris flowers, each with four features: sepal length, sepal width, petal length, and petal width. The samples belong to one of three species: Iris-setosa, Iris-versicolor, and Iris-virginica. This notebook demonstrates how to load the dataset, preprocess it, train different classifiers, evaluate their performance, and visualize the results.

## Requirements

To run the notebook, you need the following Python libraries:

- `numpy`
- `matplotlib`
- `scikit-learn`
- `jupyter` (to open and run the notebook)

You can install these dependencies using `pip`:

```sh
pip install numpy matplotlib scikit-learn jupyter
```

## Usage

1. Clone the repository or download the notebook file.
2. Ensure you have all the required libraries installed.
3. Open the notebook using Jupyter:

```sh
jupyter notebook iris.ipynb
```

4. Run the cells in the notebook to see the data loading, model training, evaluation, and visualization steps.

## Notebook Structure

### Data Loading and Exploration

- **Loading the Iris dataset**: The dataset is loaded using `scikit-learn`'s built-in function.
- **Basic data exploration and visualization**: Initial exploration of the dataset, including summary statistics and pair plots to visualize the relationships between features.
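
The loading and summary-statistics steps can be sketched as follows. This assumes the built-in loader is `sklearn.datasets.load_iris`; the notebook's exact calls and variable names may differ, and the pair plots are not reproduced here.

```python
import numpy as np
from sklearn.datasets import load_iris

# Assumption: the notebook uses load_iris; X holds the four measurements,
# y holds the integer-encoded species labels.
iris = load_iris()
X, y = iris.data, iris.target

print(X.shape)  # (150, 4): 150 samples, 4 features
print(iris.feature_names)
print(iris.target_names.tolist())

# Per-feature summary statistics.
for name, column in zip(iris.feature_names, X.T):
    print(
        f"{name}: mean={column.mean():.2f}, std={column.std():.2f}, "
        f"min={column.min():.1f}, max={column.max():.1f}"
    )

# Number of samples per species.
counts = dict(zip(iris.target_names.tolist(), np.bincount(y).tolist()))
print(counts)
```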

### Data Preprocessing

- **Splitting the dataset**: The dataset is split into training and testing sets to evaluate the performance of the models.
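
A minimal sketch of the split, assuming `train_test_split` from `scikit-learn`. The `test_size`, `random_state`, and `stratify` values are illustrative choices, not settings taken from the notebook.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# Assumed settings: hold out 20% of the samples, keep class proportions
# equal in both sets (stratify), and fix the seed for reproducibility.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

print(X_train.shape, X_test.shape)
```

Stratifying is optional, but on a small dataset it keeps every species represented in the test set.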

### Model Training

- **K-Nearest Neighbors (KNN) Classifier**: Classifier implementing the k-nearest neighbors vote.
- **Multi-layer Perceptron (MLP) Classifier**: This model optimizes the log-loss function using LBFGS or stochastic gradient descent.

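
Training both classifiers might look like the sketch below. The hyperparameters (`n_neighbors`, `hidden_layer_sizes`, `solver`, `max_iter`) and the split settings are assumptions for illustration; the notebook may use different values.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# KNN: each prediction is a majority vote among the 5 nearest training samples.
knn = KNeighborsClassifier(n_neighbors=5)
knn.fit(X_train, y_train)

# MLP: one hidden layer of 10 units, trained with the LBFGS solver.
mlp = MLPClassifier(
    hidden_layer_sizes=(10,), solver="lbfgs", max_iter=1000, random_state=0
)
mlp.fit(X_train, y_train)

knn_pred = knn.predict(X_test)
mlp_pred = mlp.predict(X_test)
print(knn_pred[:5], mlp_pred[:5])
```
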
### Model Evaluation

- **Evaluating the models**: The models are evaluated by their accuracy score on the test set.
- **Identifying incorrect predictions**: Incorrect predictions are identified and analyzed.
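
The evaluation steps above can be sketched as follows, using a KNN model as the example. The split settings and `n_neighbors` are assumptions, so the printed accuracy will not necessarily match the scores reported in this README.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=0, stratify=iris.target
)

model = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
y_pred = model.predict(X_test)

# Accuracy is the fraction of test samples whose prediction matches the label.
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.2%}")

# Positions (within the test set) of the misclassified samples.
wrong = np.flatnonzero(y_pred != y_test)
for i in wrong:
    print(
        f"test sample {i}: features={X_test[i]}, "
        f"predicted={iris.target_names[y_pred[i]]}, "
        f"actual={iris.target_names[y_test[i]]}"
    )
```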

### Visualization

- **Plotting the classification results**: A scatter plot of the classification results is generated, highlighting the incorrectly classified samples.
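
One way to draw such a plot with `matplotlib` is sketched below. Which two features go on the axes, the colors, and the model settings are assumptions; the notebook's figure may look different.

```python
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=0.2, random_state=0, stratify=iris.target
)
y_pred = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train).predict(X_test)

# Boolean mask of the misclassified test samples.
wrong = y_pred != y_test

fig, ax = plt.subplots()
# Petal length (column 2) against petal width (column 3), colored by predicted class.
ax.scatter(X_test[:, 2], X_test[:, 3], c=y_pred, cmap="viridis")
# Red rings around the samples the model got wrong.
ax.scatter(
    X_test[wrong, 2], X_test[wrong, 3],
    s=150, facecolors="none", edgecolors="red", linewidths=2,
    label="misclassified",
)
ax.set_xlabel(iris.feature_names[2])
ax.set_ylabel(iris.feature_names[3])
ax.set_title("Predicted classes on the test set")
ax.legend()
# In a notebook the figure renders at the end of the cell;
# in a script, call plt.show() to display it.
```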

### Accuracy Scores

The accuracy scores of the different classifiers on the test set are as follows:

- **K-Nearest Neighbors (KNN)**: 94.66%
- **Multi-layer Perceptron classifier (MLP)**: 96.00%
