{"id":18612373,"url":"https://github.com/winvector/kcomp","last_synced_at":"2025-06-25T01:41:36.886Z","repository":{"id":142726360,"uuid":"51389084","full_name":"WinVector/kcomp","owner":"WinVector","description":"Demonstration of parametric bootstrap to find k for kmeans","archived":false,"fork":false,"pushed_at":"2020-06-22T21:13:12.000Z","size":1991,"stargazers_count":10,"open_issues_count":0,"forks_count":5,"subscribers_count":6,"default_branch":"main","last_synced_at":"2025-05-13T14:34:59.900Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"HTML","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/WinVector.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2016-02-09T18:34:08.000Z","updated_at":"2023-07-25T14:00:08.000Z","dependencies_parsed_at":"2023-04-09T13:49:41.006Z","dependency_job_id":null,"html_url":"https://github.com/WinVector/kcomp","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/WinVector/kcomp","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/WinVector%2Fkcomp","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/WinVector%2Fkcomp/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/WinVector%2Fkcomp/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/WinVector%2Fkcomp/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/WinVector","download_url":"https://codeload.github.com/WinVector/kcom
p/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/WinVector%2Fkcomp/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":261786793,"owners_count":23209534,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-07T03:16:51.011Z","updated_at":"2025-06-25T01:41:36.859Z","avatar_url":"https://github.com/WinVector.png","language":"HTML","readme":"# Find the K in K-means by Parametric Bootstrap\n\nNina Zumel, [Win-Vector LLC](http://www.win-vector.com)\n\nCode to explore the use of parametric bootstrap simulations to determine the appropriate *k* for K-means clustering of a data set. The approach was inspired by the `boot.comp` function in [`mixtools`](https://cran.r-project.org/web/packages/mixtools/index.html), an R package for fitting and analyzing finite mixture models (see [this article](http://exploringdatablog.blogspot.com/2011/08/fitting-mixture-distributions-with-r.html) by Ron Pearson for an example of using `mixtools` to fit mixtures of Gaussians).\n\nThe primary code is in `kcomp_functions.R`. The R markdown file `kcomp.Rmd` shows an example of running this code. 
The R markdown file `stepthrough.Rmd` steps through one iteration of the bootstrap simulation and produces the graphs used in our blog post.\n\n## Links and References\n\n* Our blog post on this approach is [here](http://www.win-vector.com/blog/2016/02/finding-the-k-in-k-means-by-parametric-bootstrap/).\n\n* A Shiny app that demonstrates this code interactively is [here](https://win-vector.shinyapps.io/kcompshiny/).\n\n* We cover other approaches to estimating the number of clusters in Chapter 8 of our book [*Practical Data Science with R*](https://www.manning.com/books/practical-data-science-with-r) (Manning, 2014). This chapter is available as a free sample [here](https://manning-content.s3.amazonaws.com/download/e/dc31390-3cb7-49dd-ab02-937c1af1c2e1/PDSwR_CH08.pdf).\n\n\n\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwinvector%2Fkcomp","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fwinvector%2Fkcomp","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwinvector%2Fkcomp/lists"}