{"id":14977175,"url":"https://github.com/howardyclo/kmeans-dbscan-tutorial","last_synced_at":"2026-04-01T23:37:26.754Z","repository":{"id":104756475,"uuid":"80803433","full_name":"howardyclo/kmeans-dbscan-tutorial","owner":"howardyclo","description":"A clustering tutorial with scikit-learn for beginners.","archived":false,"fork":false,"pushed_at":"2017-02-07T14:11:31.000Z","size":14349,"stargazers_count":22,"open_issues_count":0,"forks_count":13,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-10-28T01:33:40.933Z","etag":null,"topics":["clustering-algorithm","dbscan","ipython-notebook","kmeans","scikit-learn","tutorial"],"latest_commit_sha":null,"homepage":"","language":"HTML","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/howardyclo.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-02-03T06:28:58.000Z","updated_at":"2025-06-21T17:47:17.000Z","dependencies_parsed_at":"2023-06-15T00:00:51.948Z","dependency_job_id":null,"html_url":"https://github.com/howardyclo/kmeans-dbscan-tutorial","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/howardyclo/kmeans-dbscan-tutorial","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/howardyclo%2Fkmeans-dbscan-tutorial","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/howardyclo%2Fkmeans-dbscan-tutorial/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/howardyclo%2Fkmeans-dbscan-tutorial/releases","manifests_url":"https://repos.ecosyste
.ms/api/v1/hosts/GitHub/repositories/howardyclo%2Fkmeans-dbscan-tutorial/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/howardyclo","download_url":"https://codeload.github.com/howardyclo/kmeans-dbscan-tutorial/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/howardyclo%2Fkmeans-dbscan-tutorial/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31292980,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-01T21:15:39.731Z","status":"ssl_error","status_checked_at":"2026-04-01T21:15:34.046Z","response_time":53,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["clustering-algorithm","dbscan","ipython-notebook","kmeans","scikit-learn","tutorial"],"created_at":"2024-09-24T13:55:14.734Z","updated_at":"2026-04-01T23:37:26.732Z","avatar_url":"https://github.com/howardyclo.png","language":"HTML","readme":"# kmeans-dbscan-tutorial\nA clustering tutorial with **scikit-learn** for beginners.\n\n## Contents\n1. Introduction to **k-means**, **k-means++** and **DBSCAN (Density-Based Spatial Clustering of Applications with Noise)**.\n\n2. Explore common drawbacks of k-means, such as:\n  - Need to choose the right number of clusters.\n  - Cannot handle noisy data and outliers.\n  - Cannot handle non-spherical data.\n\n  Solutions to each of these drawbacks are also presented.\n\n3. 
Introduction to supervised and unsupervised methods for measuring cluster quality, such as homogeneity, completeness and the Silhouette Coefficient (part of section 2).\n\n4. Two simple exercises (k-means \u0026 DBSCAN) along with the tutorial.\n\n## Get Started\n- Please refer to the slides in `slides/` or view them on Google Drive; there is a [Chinese version](https://docs.google.com/presentation/d/1sgo4Bx0mF9fZXGZoD6F8wEUBPRWhR90ucoKwz8aLmCM/edit?usp=sharing) and an [English version](https://docs.google.com/presentation/d/1o_rTjzkK7_q672rociNBu11R5dEDlACtrWrfR34FQ3s/edit?usp=sharing).\n- The code is in `tutorial_and_labs/`; each `.ipynb` has a corresponding `.html`.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhowardyclo%2Fkmeans-dbscan-tutorial","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhowardyclo%2Fkmeans-dbscan-tutorial","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhowardyclo%2Fkmeans-dbscan-tutorial/lists"}