https://github.com/finite-sample/lookahead-kmeans

Look Ahead Initialization of K-Means

Host: GitHub
URL: https://github.com/finite-sample/lookahead-kmeans
Owner: finite-sample
Created: 2025-06-18T23:24:14.000Z (12 months ago)
Default Branch: main
Last Pushed: 2025-06-18T23:35:49.000Z (12 months ago)
Last Synced: 2025-06-19T19:04:13.830Z (12 months ago)
Language: Jupyter Notebook
Size: 8.79 KB
Stars: 1
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

README

# 🧠 Lookahead K-Means: Smarter Cluster Initialization

This repo implements a **lookahead-based initialization** strategy for K-Means and compares it against standard `k-means++`. The lookahead approach generates multiple candidate initializations and, for each one, runs only a few K-Means steps rather than the full algorithm. It then selects the initialization with the best intermediate silhouette score after this limited rollout.

## 🔍 What’s Inside

* 📆 Evaluates both `k-means++` and **lookahead init**

* 📈 Tracks **silhouette scores** over iterations

* ⏱ Measures **runtime** and **peak memory**

* 🧪 Tested on real (Iris, Wine) and synthetic datasets (Overlapping, Noisy)
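The measurements listed above can be collected with a small harness. The following is a minimal sketch, not the notebook's actual code: it assumes scikit-learn for `KMeans` and `silhouette_score`, and uses the standard-library `tracemalloc` for peak memory, which only counts allocations visible to Python's allocator. The helper name `profile_fit` is hypothetical.

```python
import time
import tracemalloc

from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.metrics import silhouette_score


def profile_fit(X, **kmeans_kwargs):
    """Fit KMeans once; return (silhouette, seconds, peak MB traced by tracemalloc)."""
    tracemalloc.start()
    start = time.perf_counter()
    labels = KMeans(**kmeans_kwargs).fit_predict(X)
    elapsed = time.perf_counter() - start
    # get_traced_memory() returns (current, peak) in bytes.
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return silhouette_score(X, labels), elapsed, peak / 1e6


X = load_iris().data
sil, secs, peak_mb = profile_fit(
    X, n_clusters=3, init="k-means++", n_init=1, random_state=0
)
print(f"silhouette={sil:.2f}  time={secs:.3f}s  peak={peak_mb:.2f} MB")
```

Timings and memory figures from a harness like this vary by machine and library version, so they are best compared only within a single run.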

## Notebook

[Notebook](lookahead-kmeans.ipynb)

## 🧠 Lookahead Strategy

* Randomly generate multiple candidate sets of initial centroids

* For each candidate, simulate several K-Means steps (`rollout_depth`)

* Pick the candidate with the highest silhouette score after the rollout
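The three steps above can be sketched as follows. This is an illustrative version under stated assumptions, not the notebook's implementation: it assumes scikit-learn, draws each candidate as `n_clusters` random data points, and caps the rollout with `max_iter`. The function name `lookahead_init` and the parameter `n_candidates` are hypothetical; `rollout_depth` is the name used in this README. Whether the repo returns the starting centroids or the rolled-out ones is not stated, so this sketch returns the starting centroids.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score


def lookahead_init(X, n_clusters, n_candidates=5, rollout_depth=3, seed=0):
    """Return the candidate starting centroids whose short rollout scores best."""
    rng = np.random.default_rng(seed)
    best_score, best_init = -np.inf, None
    for _ in range(n_candidates):
        # Candidate: n_clusters distinct data points chosen uniformly at random.
        idx = rng.choice(len(X), size=n_clusters, replace=False)
        init = X[idx]
        # Rollout: cap K-Means at rollout_depth iterations from this start.
        km = KMeans(n_clusters=n_clusters, init=init, n_init=1,
                    max_iter=rollout_depth).fit(X)
        # silhouette_score needs at least two distinct labels.
        if len(np.unique(km.labels_)) < 2:
            continue
        score = silhouette_score(X, km.labels_)
        if score > best_score:
            best_score, best_init = score, init
    return best_init, best_score


# Synthetic data with some overlap between clusters (an assumed stand-in).
X, _ = make_blobs(n_samples=300, centers=4, cluster_std=2.0, random_state=42)
init, rollout_score = lookahead_init(X, n_clusters=4)

# Run full K-Means from the selected starting centroids.
final = KMeans(n_clusters=4, init=init, n_init=1).fit(X)
print(f"rollout silhouette: {rollout_score:.3f}")
print(f"final silhouette:   {silhouette_score(X, final.labels_):.3f}")
```

Each candidate costs one short K-Means run plus one silhouette computation, and the silhouette score is quadratic in the number of samples, which is one plausible reason the lookahead variant is slower in the results table.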

## 📈 Results

| Dataset | Std Sil. | LA Sil. | Std Time | LA Time | Std Mem | LA Mem  |
| ------- | -------- | ------- | -------- | ------- | ------- | ------- |
| Iris    | 0.55     | 0.55    | 0.05 s   | 0.12 s  | 0.36 MB | 0.36 MB |
| Noisy   | 0.18     | 0.23    | 0.13 s   | 0.31 s  | 2.01 MB | 2.00 MB |

## 📪 When to Use

* Useful for **noisy** or **high-dimensional** data

* Helps when **initialization quality matters**

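* Can improve clustering quality at the cost of runtime: on the Noisy dataset, the silhouette score rose from 0.18 to 0.23 while runtime grew from 0.13 s to 0.31 s, and on Iris the score was unchanged at 0.55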
