https://github.com/abdelrahman-amen/k-nearest-neighbors-knn-from-scratch-and-built_in
KNN is a basic machine learning algorithm used for classification and regression tasks. It predicts the class of a new data point based on the majority class of its nearest neighbors. KNN is simple, non-parametric, and instance-based: it stores the training data and makes predictions without an explicit training phase.
- Host: GitHub
- URL: https://github.com/abdelrahman-amen/k-nearest-neighbors-knn-from-scratch-and-built_in
- Owner: Abdelrahman-Amen
- Created: 2024-03-02T13:11:26.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2024-03-02T14:17:38.000Z (about 1 year ago)
- Last Synced: 2025-02-11T12:42:53.597Z (3 months ago)
- Topics: knearest-neighbor-classifier, pca, preprocessing, python, sklearn
- Language: Jupyter Notebook
- Homepage:
- Size: 168 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# k-nearest-neighbor 🚀
# Introduction to k-Nearest Neighbors (KNN)
k-Nearest Neighbors (KNN) is a simple yet powerful supervised machine learning algorithm used for both classification and regression tasks. It is a non-parametric and instance-based method, meaning it doesn't make assumptions about the underlying data distribution and learns directly from the training instances.

# Principle
KNN operates based on the principle that similar instances are likely to belong to the same class or have similar properties. It classifies a new data point by identifying the majority class among its k nearest neighbors in the feature space.
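
As a quick illustration of this principle, here is a tiny worked sketch (the data points and labels are made up purely for illustration): a new point is classified by the majority label among its k = 3 nearest neighbors.

```python
import numpy as np
from collections import Counter

# Toy training data (made up for illustration): two features, two classes.
X_train = np.array([[1.0, 2.0], [1.5, 1.8], [5.0, 8.0], [6.0, 9.0], [1.2, 0.5]])
y_train = np.array(["A", "A", "B", "B", "A"])

# A new point we want to classify.
x_new = np.array([1.4, 1.6])

# Euclidean distance from the new point to every training point.
distances = np.sqrt(((X_train - x_new) ** 2).sum(axis=1))

# Indices of the k = 3 closest training points.
k = 3
nearest = np.argsort(distances)[:k]

# Majority vote among the labels of those neighbors.
predicted = Counter(y_train[nearest]).most_common(1)[0][0]
print(predicted)  # -> "A": the new point sits among the class-A points
```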
# Steps to implement KNN
1. Choose the Value of K: Determine the value of K, which represents the number of nearest neighbors to consider. Typically, you can use techniques such as cross-validation to select the optimal K value (see the scikit-learn sketch after this list).
2. Calculate Distance: For each data point in the test set, calculate its distance to every data point in the training set. Common distance metrics include Euclidean distance, Manhattan distance, Minkowski distance, and cosine similarity (usually converted to a distance as 1 - similarity). The first two metrics are implemented in the from-scratch sketch after this list.

Euclidean distance:

$$d(x, y) = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2}$$

Manhattan distance:

$$d(x, y) = \sum_{i=1}^{n} |x_i - y_i|$$

Minkowski distance:

$$d(x, y) = \left( \sum_{i=1}^{n} |x_i - y_i|^p \right)^{1/p}$$

Cosine similarity:

$$\text{sim}(x, y) = \frac{x \cdot y}{\lVert x \rVert \, \lVert y \rVert}$$
3. Find K Nearest Neighbors: Select the K data points from the training set that are closest to the test data point based on the calculated distances.
4. Majority Vote (Classification) or Weighted Average (Regression): For classification problems, assign the class label that is most frequent among the K nearest neighbors. For regression problems, compute the weighted average of the target values of the K nearest neighbors, where the weights are inversely proportional to the distance.
5. Make Predictions: Use the majority class or the computed average to make predictions for the test data points.
6. Evaluate the Model: Assess the performance of the KNN model using evaluation metrics such as accuracy, precision, recall, F1-score (for classification), or Mean Squared Error, Mean Absolute Error, R-squared (for regression).
7. Tune Hyperparameters: Fine-tune hyperparameters such as the value of K or the choice of distance metric based on the model's performance on the validation set.
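
The repository's name points to both a from-scratch implementation and the scikit-learn built-in. The sketch below is a minimal from-scratch version of steps 2-5 (distance calculation, neighbor selection, majority vote, and a distance-weighted average for regression); the class name `KNNFromScratch` and its interface are illustrative and not taken from the repository's notebook.

```python
import numpy as np
from collections import Counter


class KNNFromScratch:
    """Minimal KNN: Euclidean or Manhattan distance, majority vote."""

    def __init__(self, k=3, metric="euclidean"):
        self.k = k
        self.metric = metric

    def fit(self, X, y):
        # KNN has no real training phase: it simply memorizes the data.
        self.X_train = np.asarray(X, dtype=float)
        self.y_train = np.asarray(y)
        return self

    def _distances(self, x):
        diff = self.X_train - x
        if self.metric == "euclidean":
            return np.sqrt((diff ** 2).sum(axis=1))
        if self.metric == "manhattan":
            return np.abs(diff).sum(axis=1)
        raise ValueError(f"unknown metric: {self.metric}")

    def _neighbors(self, x):
        # Steps 2-3: distances to every training point, then the k closest.
        dist = self._distances(np.asarray(x, dtype=float))
        nearest = np.argsort(dist)[: self.k]
        return nearest, dist[nearest]

    def predict(self, X):
        # Step 4 (classification): majority vote among the k nearest labels.
        preds = []
        for x in np.asarray(X, dtype=float):
            nearest, _ = self._neighbors(x)
            preds.append(Counter(self.y_train[nearest]).most_common(1)[0][0])
        return np.array(preds)

    def predict_value(self, X):
        # Step 4 (regression): inverse-distance weighted average of the
        # k nearest targets (only meaningful when y holds numeric values).
        preds = []
        for x in np.asarray(X, dtype=float):
            nearest, dist = self._neighbors(x)
            weights = 1.0 / (dist + 1e-8)  # avoid division by zero on exact matches
            preds.append(np.average(self.y_train[nearest].astype(float), weights=weights))
        return np.array(preds)
```

A call like `KNNFromScratch(k=5, metric="manhattan").fit(X_train, y_train).predict(X_test)` then reproduces the prediction step described above.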
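
For the built-in path, the same steps map onto scikit-learn: `KNeighborsClassifier` covers steps 2-5, cross-validation covers steps 1 and 7, and standard metrics cover step 6. The sketch below uses the Iris dataset, standard scaling, and PCA purely as an illustration; the repository's notebook may use different data, preprocessing, and parameters.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import classification_report

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Preprocessing (scaling, optional PCA) feeds directly into the classifier.
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("pca", PCA(n_components=2)),
    ("knn", KNeighborsClassifier()),
])

# Steps 1 and 7: pick K (and the distance metric) by cross-validation.
param_grid = {
    "knn__n_neighbors": [1, 3, 5, 7, 9, 11],
    "knn__metric": ["euclidean", "manhattan"],
}
search = GridSearchCV(pipe, param_grid, cv=5)
search.fit(X_train, y_train)

# Step 6: evaluate the tuned model on held-out data.
print(search.best_params_)
print(classification_report(y_test, search.predict(X_test)))
```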