https://github.com/faraazarsath/guvi-assignment_10
Dataset from the USA Forensic Science Service describing 6 types of glass, defined in terms of their oxide content (e.g. Na, Fe, K). The task is to use a K-Nearest Neighbor (KNN) classifier to classify the glasses.
euclidean-distances knn-classification manhattan-distance
- Host: GitHub
- URL: https://github.com/faraazarsath/guvi-assignment_10
- Owner: FaraazArsath
- Created: 2022-10-11T07:30:11.000Z (almost 3 years ago)
- Default Branch: main
- Last Pushed: 2022-10-11T07:38:18.000Z (almost 3 years ago)
- Last Synced: 2025-01-09T10:06:25.994Z (9 months ago)
- Topics: euclidean-distances, knn-classification, manhattan-distance
- Language: Jupyter Notebook
- Size: 138 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# GUVI-Assignment_10
This assignment uses a dataset from the USA Forensic Science Service that describes 6 types of glass, defined in terms of their oxide content (e.g. Na, Fe, K).
The original dataset is available at (https://archive.ics.uci.edu/ml/datasets/glass+identification).
For a detailed description of the dataset's attributes, please refer to the original dataset link in the UCI ML repository. The task is to use a K-Nearest Neighbor (KNN) classifier to classify the glasses.
Perform exploratory data analysis on the dataset using Python Pandas, including dropping fields that are irrelevant to the prediction and standardizing each attribute.
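
A minimal sketch of the cleaning step, assuming the row identifier (`Id`) is the irrelevant field and `Type` is the target. The column names follow the UCI attribute list, but only a subset is shown and the values in this tiny frame are made up for illustration; they are not rows from the real dataset.

```python
import pandas as pd

# Tiny made-up sample; column names are a subset of the UCI attributes,
# values are illustrative only (assumption, not real data).
df = pd.DataFrame({
    "Id": [1, 2, 3, 4],
    "RI": [1.521, 1.517, 1.516, 1.518],
    "Na": [13.6, 13.9, 13.5, 13.2],
    "Fe": [0.0, 0.0, 0.1, 0.2],
    "Type": [1, 1, 2, 2],
})

# Assumption: the row identifier carries no predictive information, so drop it.
df = df.drop(columns=["Id"])

# Separate the features from the target label.
X = df.drop(columns=["Type"])
y = df["Type"]

# Standardize each attribute to zero mean and unit variance
# (ddof=0 uses the population standard deviation).
X_std = (X - X.mean()) / X.std(ddof=0)

print(X_std.round(3))
```

Standardization matters for KNN because the distance is summed across attributes, so an attribute with a large numeric range would otherwise dominate the neighbor search.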
Following data cleaning, two Scikit-Learn KNN models should be created, one for each of two distance metrics: squared Euclidean distance and Manhattan distance.
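
To make the two metrics concrete, here is a small plain-Python sketch of how each distance is computed between two feature vectors. The function names are hypothetical helpers for illustration, not part of Scikit-Learn.

```python
def squared_euclidean(a, b):
    """Sum of squared coordinate differences (no square root taken)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def manhattan(a, b):
    """Sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


p = [0.0, 0.0]
q = [3.0, 4.0]

print(squared_euclidean(p, q))  # 25.0
print(manhattan(p, q))          # 7.0
```

Squared Euclidean distance is a monotone transform of ordinary Euclidean distance, so both select the same nearest neighbors; Manhattan distance can rank neighbors differently.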
The performance of the two models should be compared in terms of accuracy on the test data and the Scikit-Learn classification report.
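
One way the comparison might look is sketched below. Several things here are assumptions rather than the notebook's actual code: the data is a synthetic stand-in generated with `make_classification` (9 features, 6 classes) instead of the glass dataset, `n_neighbors=5` and the 70/30 split are arbitrary choices, and the `"euclidean"` metric is used in place of squared Euclidean because both rank neighbors identically and therefore give the same predictions under the default uniform weighting.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, classification_report
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the glass data (assumption: NOT the real dataset).
X, y = make_classification(
    n_samples=300, n_features=9, n_informative=5, n_redundant=0,
    n_classes=6, n_clusters_per_class=1, random_state=0,
)

# Hold out a stratified test set so every class appears in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Fit the scaler on the training split only, then apply it to both splits.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

# Train one KNN model per distance metric and record its test accuracy.
results = {}
for metric in ("euclidean", "manhattan"):
    model = KNeighborsClassifier(n_neighbors=5, metric=metric)
    model.fit(X_train_s, y_train)
    y_pred = model.predict(X_test_s)
    results[metric] = accuracy_score(y_test, y_pred)
    print(f"--- {metric} ---")
    print(f"accuracy: {results[metric]:.3f}")
    print(classification_report(y_test, y_pred, zero_division=0))
```

The classification report adds per-class precision, recall, and F1, which is useful when the classes are imbalanced and overall accuracy alone hides weak performance on rare classes.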