Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/gurpreet0022/classify_that_fruit
- Host: GitHub
- URL: https://github.com/gurpreet0022/classify_that_fruit
- Owner: Gurpreet0022
- Created: 2024-11-15T18:06:56.000Z (2 months ago)
- Default Branch: main
- Last Pushed: 2024-11-15T18:11:22.000Z (2 months ago)
- Last Synced: 2024-11-15T19:19:54.394Z (2 months ago)
- Language: Jupyter Notebook
- Size: 117 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Classify_That_Fruit
## Fruit Classification Using k-NN
This project uses the **k-Nearest Neighbors (k-NN)** algorithm to classify fruits by their physical properties: mass, width, height, and color score. Of the machine learning models evaluated, k-NN performed best on the test set and was chosen as the classifier for this problem.
## Table of Contents
- [Project Overview](#project-overview)
- [Technologies Used](#technologies-used)
- [Features](#features)
- [Dataset](#dataset)
- [Model Selection](#model-selection)
- [Installation](#installation)
- [Usage](#usage)
- [Future Enhancements](#future-enhancements)
- [License](#license)

## Project Overview
The project uses the **k-Nearest Neighbors (k-NN)** algorithm to classify fruit samples into four categories: **apple**, **mandarin**, **orange**, and **lemon**. The dataset includes fruit properties like mass, width, height, and color score.

## Technologies Used
- Python
- Jupyter Notebook
- Libraries: `numpy`, `matplotlib`, `scikit-learn`, `pandas`

## Features
- Train/test split for evaluating the k-NN model.
- Visualization of decision boundaries.
- Ability to adjust k-NN parameters like the number of neighbors and weight function.

## Dataset
The dataset consists of fruit samples with the following features:
- **Mass**: Weight of the fruit.
- **Width**: Width of the fruit.
- **Height**: Height of the fruit.
- **Color Score**: Numerical representation of the fruit's color.
- **Label**: Category of fruit (apple, mandarin, orange, or lemon).

## Model Selection
Initially, several classifiers were tested:

- **Logistic Regression**: This model showed poor performance on the test set, with an accuracy of only 42%.
- **Decision Tree Classifier**: The decision tree achieved perfect accuracy on the training set (1.0) but overfitted the data, resulting in poor test set performance (42% accuracy).
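
A comparison of this kind can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the data is randomly generated as a stand-in for the real dataset, and the helper names `make_synthetic_fruit` and `compare_models`, the class centres, and the hyperparameters are all assumptions made for the example. Its scores will not match the accuracies reported in this README.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier


def make_synthetic_fruit(n_per_class=15, seed=0):
    """Return placeholder (X, y) data: 4 features, 4 classes.

    ASSUMPTION: these are random numbers, not real fruit measurements.
    Columns stand for (mass, width, height, color_score).
    """
    rng = np.random.default_rng(seed)
    # Arbitrary, made-up class centres used only to get separable clusters.
    centres = {
        "apple": (160.0, 7.5, 7.3, 0.80),
        "mandarin": (80.0, 6.0, 4.5, 0.80),
        "orange": (190.0, 7.6, 7.9, 0.75),
        "lemon": (150.0, 6.5, 8.8, 0.72),
    }
    X_parts, y_parts = [], []
    for label, centre in centres.items():
        noise = rng.normal(scale=(15.0, 0.5, 0.5, 0.05), size=(n_per_class, 4))
        X_parts.append(np.asarray(centre) + noise)
        y_parts.extend([label] * n_per_class)
    return np.vstack(X_parts), np.array(y_parts)


def compare_models(X, y, seed=0):
    """Fit three classifiers on one split; return {name: (train_acc, test_acc)}."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, random_state=seed, stratify=y
    )
    models = {
        "logistic_regression": LogisticRegression(max_iter=1000),
        "decision_tree": DecisionTreeClassifier(random_state=seed),
        "knn": KNeighborsClassifier(n_neighbors=5, weights="uniform"),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        # Comparing train and test accuracy is what exposes overfitting.
        scores[name] = (model.score(X_train, y_train), model.score(X_test, y_test))
    return scores


if __name__ == "__main__":
    X, y = make_synthetic_fruit()
    for name, (train_acc, test_acc) in compare_models(X, y).items():
        print(f"{name}: train={train_acc:.2f} test={test_acc:.2f}")
```

A large gap between a model's training and test accuracy, as described for the decision tree above, is the usual sign of overfitting. The features here are left unscaled for brevity; since k-NN is distance-based, scaling them (for example with `MinMaxScaler`) may change the results.
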
After evaluating these models, **k-NN** was selected because it generalized better, reaching 83% accuracy on the test set compared with 42% for each of the other two models.

## Installation
1. Clone this repository:
```bash
git clone
```
2. Install the required libraries:
```bash
pip install numpy matplotlib scikit-learn pandas
```
3. Open the Jupyter Notebook file:
```bash
jupyter notebook ClassifyThatFruit.ipynb
```

## Usage
1. Run the cells in the notebook to load the dataset, preprocess the data, and split it into training and testing sets.
2. Visualize the decision boundaries by calling:
```python
plot_fruit_knn(X_train, y_train, n_neighbors=5, weights='uniform')
```
3. Experiment with different `n_neighbors` and `weights` to observe their impact.

## Future Enhancements
- Add more features like fruit texture or ripeness.
- Experiment with other classification models for comparison.
- Deploy the model using a web framework (e.g., Flask, Django).

## License
This project is licensed under the MIT License. See the `LICENSE` file for more details.