{"id":29830844,"url":"https://github.com/finite-sample/lookahead-kmeans","last_synced_at":"2025-10-18T18:49:09.121Z","repository":{"id":299914126,"uuid":"1004621101","full_name":"finite-sample/lookahead-kmeans","owner":"finite-sample","description":"Look Ahead Initialization of K-Means","archived":false,"fork":false,"pushed_at":"2025-06-18T23:35:49.000Z","size":9,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-06-19T19:04:13.830Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/finite-sample.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-06-18T23:24:14.000Z","updated_at":"2025-06-19T01:20:49.000Z","dependencies_parsed_at":"2025-06-19T00:36:36.106Z","dependency_job_id":null,"html_url":"https://github.com/finite-sample/lookahead-kmeans","commit_stats":null,"previous_names":["soodoku/lookahead-kmeans","finite-sample/lookahead-kmeans"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/finite-sample/lookahead-kmeans","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/finite-sample%2Flookahead-kmeans","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/finite-sample%2Flookahead-kmeans/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/finite-sample%2Flookahead-kmeans/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/
finite-sample%2Flookahead-kmeans/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/finite-sample","download_url":"https://codeload.github.com/finite-sample/lookahead-kmeans/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/finite-sample%2Flookahead-kmeans/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":267668844,"owners_count":24124973,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-07-29T02:00:12.549Z","response_time":2574,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-07-29T10:11:50.812Z","updated_at":"2025-10-18T18:49:09.016Z","avatar_url":"https://github.com/finite-sample.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# 🧠 Lookahead K-Means: Smarter Cluster Initialization\n\nThis repo implements and compares a **lookahead-based initialization** strategy for KMeans against standard `k-means++`. The lookahead approach generates multiple candidate initializations and runs a few K-Means steps (not the full algorithm) for each. 
It then selects the initialization that produces the best intermediate silhouette score after this limited rollout.\n\n## 🔍 What’s Inside\n\n* 📆 Evaluates both `k-means++` and **lookahead init**\n* 📈 Tracks **silhouette scores** over iterations\n* ⏱ Measures **runtime** and **peak memory**\n* 🧪 Tested on real (Iris, Wine) and synthetic (Overlapping, Noisy) datasets\n\n## Notebook\n\n[Notebook](lookahead-kmeans.ipynb)\n\n## 🧠 Lookahead Strategy\n\n* Randomly initialize multiple candidate sets of centroids\n* For each candidate, simulate several K-Means steps (`rollout_depth`)\n* Pick the candidate with the best silhouette score after the rollout\n\n## 📈 Results\n\n| Dataset | Std Sil. | LA Sil. | Std Time | LA Time | Std Mem | LA Mem  |\n| ------- | -------- | ------- | -------- | ------- | ------- | ------- |\n| Iris    | 0.55     | 0.55    | 0.05 s   | 0.12 s  | 0.36 MB | 0.36 MB |\n| Noisy   | 0.18     | 0.23    | 0.13 s   | 0.31 s  | 2.01 MB | 2.00 MB |\n\n## 📪 When to Use\n\n* Useful for **noisy** or **high-dimensional** data\n* Helps when **initialization quality matters**\n* Trades extra runtime (roughly 2.4x in the results above) for a higher silhouette score on harder data (0.23 vs. 0.18 on Noisy, no change on Iris)\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffinite-sample%2Flookahead-kmeans","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffinite-sample%2Flookahead-kmeans","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffinite-sample%2Flookahead-kmeans/lists"}