{"id":17594912,"url":"https://github.com/dstein64/kmeans1d","last_synced_at":"2025-03-20T15:32:52.569Z","repository":{"id":62574372,"uuid":"194734871","full_name":"dstein64/kmeans1d","owner":"dstein64","description":"A Python package for optimal 1D k-means clustering.","archived":false,"fork":false,"pushed_at":"2023-10-06T18:33:02.000Z","size":79,"stargazers_count":43,"open_issues_count":1,"forks_count":5,"subscribers_count":7,"default_branch":"master","last_synced_at":"2023-11-21T14:51:29.564Z","etag":null,"topics":["dynamic-programming","kmeans","optimization"],"latest_commit_sha":null,"homepage":"https://pypi.org/project/kmeans1d/","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/dstein64.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null}},"created_at":"2019-07-01T19:54:14.000Z","updated_at":"2024-03-17T04:55:05.222Z","dependencies_parsed_at":"2024-03-17T04:55:03.207Z","dependency_job_id":"55afbf09-5f66-4880-85f7-cbaa63398663","html_url":"https://github.com/dstein64/kmeans1d","commit_stats":{"total_commits":104,"total_committers":1,"mean_commits":104.0,"dds":0.0,"last_synced_commit":"aadad1e0a48073010344b1f5d00e3ed86667d86f"},"previous_names":[],"tags_count":3,"template":null,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dstein64%2Fkmeans1d","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dstein64%2Fkmeans1d/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dstein64%2Fkmeans1d/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/reposi
tories/dstein64%2Fkmeans1d/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/dstein64","download_url":"https://codeload.github.com/dstein64/kmeans1d/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":221776158,"owners_count":16878490,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["dynamic-programming","kmeans","optimization"],"created_at":"2024-10-22T07:21:48.229Z","updated_at":"2024-10-28T04:03:49.004Z","avatar_url":"https://github.com/dstein64.png","language":"C++","funding_links":[],"categories":[],"sub_categories":[],"readme":"[![Build Status](https://github.com/dstein64/kmeans1d/workflows/build/badge.svg)](https://github.com/dstein64/kmeans1d/actions)\n\nkmeans1d\n========\n\nA Python library with an implementation of *k*-means clustering on 1D data, based on the algorithm\nfrom Wu (1991), as presented by Gronlund et al. (2017, Section 2.2).\n\nGlobally optimal *k*-means clustering is NP-hard for multi-dimensional data. Lloyd's algorithm is a\npopular approach for finding a locally optimal solution. For 1-dimensional data, there are polynomial\ntime algorithms. 
The algorithm implemented here is an *O(kn + n log n)* dynamic programming algorithm\nfor finding the globally optimal *k* clusters for *n* 1D data points.\n\nThe code is written in C++, and wrapped with Python.\n\nRequirements\n------------\n\n*kmeans1d* supports Python 3.x.\n\nInstallation\n------------\n\n[kmeans1d](https://pypi.python.org/pypi/kmeans1d) is available on PyPI, the Python Package Index.\n\n```sh\n$ pip3 install kmeans1d\n```\n\nExample Usage\n-------------\n\n```python\nimport kmeans1d\n\nx = [4.0, 4.1, 4.2, -50, 200.2, 200.4, 200.9, 80, 100, 102]\nk = 4\n\nclusters, centroids = kmeans1d.cluster(x, k)\n\nprint(clusters)   # [1, 1, 1, 0, 3, 3, 3, 2, 2, 2]\nprint(centroids)  # [-50.0, 4.1, 94.0, 200.5]\n```\n\nTests\n-----\n\nTests are in [tests/](https://github.com/dstein64/kmeans1d/blob/master/tests).\n\n```sh\n# Run tests\n$ python3 -m unittest discover tests -v\n```\n\nDevelopment\n-----------\n\nThe underlying C++ code can be built in-place, outside the context of `pip`. This requires Python\ndevelopment tools for building Python modules (e.g., the `python3-dev` package on Ubuntu). `gcc`,\n`clang`, and `MSVC` have been tested.\n\n```\n$ python3 setup.py build_ext --inplace\n```\n\nThe [packages](https://github.com/dstein64/kmeans1d/blob/master/.github/workflows/packages.yml)\nGitHub action can be manually triggered (`Actions` \u003e `packages` \u003e `Run workflow`) to build wheels\nand a source distribution.\n\nLicense\n-------\n\nThe code in this repository has an [MIT License](https://en.wikipedia.org/wiki/MIT_License).\n\nSee [LICENSE](https://github.com/dstein64/kmeans1d/blob/master/LICENSE).\n\nReferences\n----------\n\n[1] Wu, Xiaolin. \"Optimal Quantization by Matrix Searching.\" Journal of Algorithms 12, no. 4\n(December 1, 1991): 663\n\n[2] Gronlund, Allan, Kasper Green Larsen, Alexander Mathiasen, Jesper Sindahl Nielsen, Stefan Schneider,\nand Mingzhou Song. 
\"Fast Exact K-Means, k-Medians and Bregman Divergence Clustering in 1D.\"\nArXiv:1701.07204 [Cs], January 25, 2017. http://arxiv.org/abs/1701.07204.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdstein64%2Fkmeans1d","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdstein64%2Fkmeans1d","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdstein64%2Fkmeans1d/lists"}