{"id":30462719,"url":"https://github.com/harshavardhanbommalata/knn_tutorial","last_synced_at":"2026-06-20T12:32:12.832Z","repository":{"id":309754426,"uuid":"1037438456","full_name":"harshavardhanBOMMALATA/KNN_TUTORIAL","owner":"harshavardhanBOMMALATA","description":"K-Nearest Neighbors (KNN) is a simple yet powerful machine learning algorithm. Unlike models that learn parameters during training, KNN uses lazy learning—it stores the dataset and predicts by finding the closest neighbors, making decisions through majority voting.","archived":false,"fork":false,"pushed_at":"2025-08-20T17:59:34.000Z","size":14,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-08-20T19:39:59.630Z","etag":null,"topics":["machine-learning-algorithms","mathematical-modelling","matplotlib","numpy","pandas"],"latest_commit_sha":null,"homepage":"","language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/harshavardhanBOMMALATA.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-08-13T15:14:43.000Z","updated_at":"2025-08-20T17:59:37.000Z","dependencies_parsed_at":"2025-08-13T17:52:51.485Z","dependency_job_id":null,"html_url":"https://github.com/harshavardhanBOMMALATA/KNN_TUTORIAL","commit_stats":null,"previous_names":["harshavardhanbommalata/knn_tutorial"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/harshavardhanBOMMALATA/KNN_TUTORIAL","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/harshavardhanBOMMALATA%2
FKNN_TUTORIAL","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/harshavardhanBOMMALATA%2FKNN_TUTORIAL/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/harshavardhanBOMMALATA%2FKNN_TUTORIAL/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/harshavardhanBOMMALATA%2FKNN_TUTORIAL/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/harshavardhanBOMMALATA","download_url":"https://codeload.github.com/harshavardhanBOMMALATA/KNN_TUTORIAL/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/harshavardhanBOMMALATA%2FKNN_TUTORIAL/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":273032934,"owners_count":25034067,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-31T02:00:09.071Z","response_time":79,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["machine-learning-algorithms","mathematical-modelling","matplotlib","numpy","pandas"],"created_at":"2025-08-23T23:02:12.787Z","updated_at":"2026-06-20T12:32:12.827Z","avatar_url":"https://github.com/harshavardhanBOMMALATA.png","language":null,"funding_links":[],"categories":[],"sub_categories":[],"readme":"# KNN_MODEL_FOR_MEDICAL_DIAGNOSIS\n---\n\nIn this project, we explore K-Nearest Neighbors (KNN) from scratch to advanced concepts. 
It’s not aimed specifically at beginners or experts, but at learners who want to understand KNN thoroughly and refresh their knowledge. We will build KNN completely from scratch, progressing from an introduction to a real-world project. This tutorial covers how KNN works with real-world datasets, the mathematical intuition behind it, and how majority voting and averaging are used for predictions. We will implement KNN for both classification and regression tasks, and discuss its advantages and disadvantages. By the end, you’ll have a strong practical and theoretical grasp of KNN.\n\n---\n\n### 🛠️ Tools \u0026 Technologies Used\n\n* **📘 Jupyter Notebook** – for writing and executing the code step-by-step with explanations\n* **🐍 Python** – core programming language used for building the logic\n* **📂 CSV Files** – dataset used for training and predictions\n* **📊 pandas** – for reading and handling the dataset\n* **📈 matplotlib** – for visualizing the data and regression line\n* **📐 numpy** – for performing numerical and statistical operations \n\n---\n\n## 📘 Project Overview\n\n**Project Title**: KNN Algorithm  \n**Level**: Beginner to Advanced  \n**Tool**: Jupyter Notebook  \n**Libraries Used**: `pandas`, `numpy`, `matplotlib`\n\n---\n\n## 📁 File Structure \n\n1. **Introduction to KNN**\n2. **Working and Why It Works**\n3. **Mathematical Intuition**\n4. **Implementation Without Scikit-Learn**\n5. **Implementation With Scikit-Learn**\n6. **Applications**\n7. **Advantages and Disadvantages**\n\n---\n\n## 📘 Introduction to KNN\n\n## What is KNN?\n\nKNN stands for **K-Nearest Neighbors**. It is a **supervised machine learning** algorithm.  \n\n👉 If you want more clarity about machine learning and supervised learning, check my [Linear Regression repository](https://github.com/harshavardhanBOMMALATA/Linear-Regression.git).  \n\nKNN can be used for both **regression** and **classification**, but it is mostly applied in **classification problems**.  
\n\nThe algorithm works based on the concept of **neighbors**.  \n\n**Example:**\n- A person with **black hair, light brown skin tone, and thick eyebrows** might be classified as **Indian**.  \n- A person with a **dark black skin tone** might be classified as **African**.  \n\nThis is how KNN makes predictions: it looks at the closest neighbors and decides based on similarity.  \n\n---\n\n## How KNN Works and Why It Works\n\nKNN works on the basis of **neighbors**.  \n\n1. It stores all the labeled training data points.  \n2. For a given input data point, it calculates how close it is to each stored point using a **distance formula** (like Euclidean distance).  \n3. It then selects the **k nearest points**.  \n4. For **classification**, it performs a **majority vote** among those neighbors and assigns the class with the most votes.  \n5. For **regression**, instead of voting, it takes the **average** of the neighbors’ values.  \n\n👉 **Why does it work?**  \nData points that are close to each other usually share similar characteristics, so the labels of the closest neighbors are a reasonable guess for a new point. This makes the method both simple and effective.  \n\n---\n\n## KNN Example (Step by Step)\n\nWe will now walk through KNN with a simple dataset.  
\n\n### Dataset\n\n| Person | Height (cm) | Weight (kg) | Category |\n|--------|-------------|-------------|----------|\n| A      | 170         | 65          | Fit      |\n| B      | 160         | 60          | Fit      |\n| C      | 180         | 80          | Fit      |\n| D      | 155         | 72          | Unfit    |\n| E      | 165         | 85          | Unfit    |\n\nNew data point: Height = 167 cm, Weight = 70 kg\n\nOur task: **Predict whether this person is Fit or Unfit using KNN.**\n\n---\n\n### Step 1: Distance Formula\n\nWe use **Euclidean distance**:  \n\n\\[\nd = \\sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2}\n\\]\n\nWhere:  \n- \\(x\\) = Height  \n- \\(y\\) = Weight  \n- Subscripts 1 and 2 denote the two points being compared  \n\n---\n\n### Step 2: Calculate Distances\n\n- Distance(New, A) = √((167−170)² + (70−65)²) = √(9 + 25) = √34 ≈ **5.83**  \n- Distance(New, B) = √((167−160)² + (70−60)²) = √(49 + 100) = √149 ≈ **12.21**  \n- Distance(New, C) = √((167−180)² + (70−80)²) = √(169 + 100) = √269 ≈ **16.40**  \n- Distance(New, D) = √((167−155)² + (70−72)²) = √(144 + 4) = √148 ≈ **12.17**  \n- Distance(New, E) = √((167−165)² + (70−85)²) = √(4 + 225) = √229 ≈ **15.13**  \n\n---\n\n### Step 3: Select Nearest Neighbors\n\nLet’s take the **k = 3** nearest neighbors:  \n- A → Fit (5.83)  \n- D → Unfit (12.17)  \n- B → Fit (12.21)  \n\n---\n\n### Step 4: Voting\n\n- Fit = 2 votes  \n- Unfit = 1 vote  \n\n👉 Majority = **Fit**\n\n---\n\n### Final Result\n\nThe new person with **Height = 167 cm** and **Weight = 70 kg** is predicted as:  \n\n✅ **Fit**\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fharshavardhanbommalata%2Fknn_tutorial","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fharshavardhanbommalata%2Fknn_tutorial","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fharshavardhanbommalata%2Fknn_tutorial/lists"}