Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/amirjahantab/iris_classification
This project analyzes the famous Iris dataset using various machine learning techniques. The goal is to classify the iris flowers into three species: Setosa, Versicolor, and Virginica based on the features provided in the dataset.
classification data-science machine-learning scikit-learn
- Host: GitHub
- URL: https://github.com/amirjahantab/iris_classification
- Owner: amirjahantab
- Created: 2024-07-08T07:17:45.000Z (6 months ago)
- Default Branch: master
- Last Pushed: 2024-07-08T11:06:57.000Z (6 months ago)
- Last Synced: 2024-11-12T18:45:57.694Z (2 months ago)
- Topics: classification, data-science, machine-learning, scikit-learn
- Language: Jupyter Notebook
- Homepage:
- Size: 57.6 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Iris Classification Notebook
This Jupyter notebook demonstrates the process of classifying the Iris dataset using various machine learning techniques. The Iris dataset is a classic dataset in the field of machine learning and statistics, often used for testing algorithms.
## Introduction
The Iris dataset contains 150 samples of iris flowers, each with four features: sepal length, sepal width, petal length, and petal width. The samples belong to one of three species: Iris-setosa, Iris-versicolor, and Iris-virginica. This notebook demonstrates how to load the dataset, preprocess it, train different classifiers, evaluate their performance, and visualize the results.
## Requirements
To run the notebook, you need the following Python libraries:
- `numpy`
- `matplotlib`
- `scikit-learn`

You can install these dependencies using `pip`:
```sh
pip install numpy matplotlib scikit-learn jupyter
```
## Usage
1. Clone the repository or download the notebook file.
2. Ensure you have all the required libraries installed.
3. Open the notebook using Jupyter:
```sh
jupyter notebook iris.ipynb
```
4. Run the cells in the notebook to see the data loading, model training, evaluation, and visualization steps.
## Notebook Structure
### Data Loading and Exploration
- **Loading the Iris dataset**: The dataset is loaded using `scikit-learn`'s built-in function.
- **Basic data exploration and visualization**: Initial exploration of the dataset, including summary statistics and pair plots to visualize the relationships between features.
### Data Preprocessing
- **Splitting the dataset**: The dataset is split into training and testing sets to evaluate the performance of the models.
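
The loading, exploration, and splitting steps described so far can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the `test_size=0.5`, `random_state=0`, and `stratify=y` arguments are assumptions chosen for the example.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Load the built-in Iris dataset: 150 samples, 4 features, 3 classes.
iris = load_iris()
X, y = iris.data, iris.target

# Quick exploration: shapes, names, and per-feature means.
print(X.shape, y.shape)
print(iris.feature_names)
print(iris.target_names)
print(X.mean(axis=0))

# Hold out part of the data for testing.
# test_size, random_state, and stratify are assumed values,
# not taken from the notebook.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0, stratify=y
)
print(X_train.shape, X_test.shape)
```

Stratifying on `y` keeps the three species equally represented in both halves, which makes the test accuracy less sensitive to an unlucky split.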
### Model Training
- **K-Nearest Neighbors (KNN) classifier**: Classifies each sample by a majority vote among its k nearest neighbors in the training set.
- **Multi-layer Perceptron (MLP) classifier**: Optimizes the log-loss function using LBFGS or stochastic gradient descent.
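
As a rough sketch of how the two classifiers above might be trained with `scikit-learn`: the hyperparameters here (`n_neighbors=5`, one hidden layer of 10 units, the `lbfgs` solver, `max_iter=1000`) and the split settings are assumptions for illustration, not the values used in the notebook.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier

X, y = load_iris(return_X_y=True)
# Split settings are assumed, not taken from the notebook.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0, stratify=y
)

# KNN: predicts by majority vote among the k nearest training samples.
knn = KNeighborsClassifier(n_neighbors=5)  # k=5 is an assumed value
knn.fit(X_train, y_train)

# MLP: minimizes log-loss; the solver here is LBFGS (assumed choice).
mlp = MLPClassifier(
    hidden_layer_sizes=(10,), solver="lbfgs", max_iter=1000, random_state=0
)
mlp.fit(X_train, y_train)

# Predicted class indices (0, 1, 2) for the held-out samples.
knn_pred = knn.predict(X_test)
mlp_pred = mlp.predict(X_test)
```

Both estimators share the same `fit`/`predict` interface, so swapping in another `scikit-learn` classifier only changes the constructor line.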
### Model Evaluation
- **Evaluating the models**: The models are evaluated using accuracy score on the test set.
- **Identifying incorrect predictions**: Incorrect predictions are identified and analyzed.
### Visualization
- **Plotting the classification results**: A scatter plot of the classification results is generated, highlighting the incorrectly classified samples.
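
The evaluation and visualization steps could look something like the sketch below. It uses only the KNN model for brevity; the split settings, `n_neighbors=5`, and the choice of the two petal features as plot axes are assumptions, not details confirmed by the notebook.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()
X, y = iris.data, iris.target
# Split settings and k are assumed values for this example.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.5, random_state=0, stratify=y
)
knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
y_pred = knn.predict(X_test)

# Accuracy: fraction of test samples whose prediction matches the label.
acc = accuracy_score(y_test, y_pred)
print(f"KNN test accuracy: {acc:.2%}")

# Indices (within the test set) of the misclassified samples.
wrong = np.flatnonzero(y_pred != y_test)
for i in wrong:
    print(
        f"test sample {i}: predicted {iris.target_names[y_pred[i]]}, "
        f"actual {iris.target_names[y_test[i]]}"
    )

# Scatter plot of petal length vs. petal width, colored by predicted class,
# with misclassified samples circled in red.
fig, ax = plt.subplots()
ax.scatter(X_test[:, 2], X_test[:, 3], c=y_pred, cmap="viridis")
ax.scatter(
    X_test[wrong, 2], X_test[wrong, 3],
    facecolors="none", edgecolors="red", s=150, linewidths=2,
    label="misclassified",
)
ax.set_xlabel(iris.feature_names[2])
ax.set_ylabel(iris.feature_names[3])
ax.legend()
# Outside a notebook, call plt.show() to display the figure.
```

Circling the errors on top of the predicted-class colors makes it easy to see whether mistakes cluster near the boundary between two species.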
### Accuracy Scores
The accuracy scores of the different classifiers on the test set are as follows:
- **K-Nearest Neighbors (KNN)**: 94.66%
- **Multi-layer Perceptron classifier (MLP)**: 96.00%