https://github.com/howardyclo/kmeans-dbscan-tutorial

clustering-algorithm dbscan ipython-notebook kmeans scikit-learn tutorial

# kmeans-dbscan-tutorial
A clustering tutorial with **scikit-learn** for beginners.

## Contents
1. Introduction to **k-means**, **k-means++** and **DBSCAN (Density-Based Spatial Clustering of Applications with Noise)**.

2. Explore common drawbacks of k-means and present solutions for each:
   - It requires choosing the right number of clusters.
   - It cannot handle noisy data and outliers.
   - It cannot handle non-spherical data.

3. Introduction to supervised (homogeneity, completeness) and unsupervised (Silhouette Coefficient) methods for measuring cluster quality (part of section 2).

4. Two simple exercises (k-means and DBSCAN) to accompany the tutorial.
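
To give a feel for the topics listed above, here is a minimal sketch that is not taken from the tutorial notebooks. It assumes a synthetic two-moons dataset from `make_moons`, and the `eps` and `min_samples` values are illustrative guesses rather than tuned settings. It runs k-means (with k-means++ seeding) and DBSCAN on the same non-spherical data, then scores both with the supervised and unsupervised measures mentioned in the contents. The `score` helper is a hypothetical name introduced only for this example.

```python
from sklearn.cluster import DBSCAN, KMeans
from sklearn.datasets import make_moons
from sklearn.metrics import (
    completeness_score,
    homogeneity_score,
    silhouette_score,
)

# Assumption: two interleaving half-moons stand in for "non-spherical data".
X, y_true = make_moons(n_samples=400, noise=0.05, random_state=0)

# k-means with k-means++ seeding; the number of clusters must be given up front.
kmeans = KMeans(n_clusters=2, init="k-means++", n_init=10, random_state=0)
kmeans_labels = kmeans.fit_predict(X)

# DBSCAN groups points by density and needs no cluster count.
# eps and min_samples are illustrative values, not tuned ones.
dbscan = DBSCAN(eps=0.2, min_samples=5)
dbscan_labels = dbscan.fit_predict(X)  # a label of -1 marks noise points


def score(labels):
    """Hypothetical helper: collect the three quality measures for one labeling."""
    return {
        # Supervised measures: compare against the known ground-truth labels.
        "homogeneity": homogeneity_score(y_true, labels),
        "completeness": completeness_score(y_true, labels),
        # Unsupervised measure: uses only the data and the predicted labels.
        "silhouette": silhouette_score(X, labels),
    }


kmeans_scores = score(kmeans_labels)
dbscan_scores = score(dbscan_labels)

for name, scores in (("k-means", kmeans_scores), ("DBSCAN", dbscan_scores)):
    formatted = ", ".join(f"{key}={value:.3f}" for key, value in scores.items())
    print(f"{name}: {formatted}")
```

On data like this, the supervised scores should favor DBSCAN, because k-means splits the moons with a straight boundary. The Silhouette Coefficient rewards compact, convex clusters, so it is worth comparing it against the supervised scores rather than reading it in isolation.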

## Get Started
- Please refer to the slides in `slides/` or view them on Google Drive; they are available in a [Chinese version](https://docs.google.com/presentation/d/1sgo4Bx0mF9fZXGZoD6F8wEUBPRWhR90ucoKwz8aLmCM/edit?usp=sharing) and an [English version](https://docs.google.com/presentation/d/1o_rTjzkK7_q672rociNBu11R5dEDlACtrWrfR34FQ3s/edit?usp=sharing).
- The code is in `tutorial_and_labs/`; each `.ipynb` has a corresponding `.html`.