https://github.com/arjunsk/kmeans
Go library implementing Kmeans++ and Elkan's Kmeans algorithm
- Host: GitHub
- URL: https://github.com/arjunsk/kmeans
- Owner: arjunsk
- License: apache-2.0
- Created: 2023-10-01T10:00:12.000Z (over 1 year ago)
- Default Branch: master
- Last Pushed: 2024-02-15T21:59:56.000Z (11 months ago)
- Last Synced: 2024-10-16T08:12:07.205Z (3 months ago)
- Topics: centroid, clustering, elkan, kmeans, kmeans-plus-plus
- Language: Go
- Homepage:
- Size: 130 KB
- Stars: 1
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Go Kmeans
[![Go Reference](https://pkg.go.dev/badge/github.com/arjunsk/kmeans/kmeans.svg)](https://pkg.go.dev/github.com/arjunsk/kmeans)
[![Go Report Card](https://goreportcard.com/badge/github.com/arjunsk/kmeans)](https://goreportcard.com/report/github.com/arjunsk/kmeans)
[![Codecov](https://codecov.io/gh/arjunsk/kmeans/branch/master/graph/badge.svg)](https://codecov.io/gh/arjunsk/kmeans)

This is a simple implementation of [Elkan's Kmeans](https://cdn.aaai.org/ICML/2003/ICML03-022.pdf)
algorithm in Go.

### Installing
```sh
$ go get github.com/arjunsk/kmeans
```

### Usage
```go
package main

import (
	"fmt"

	"github.com/arjunsk/kmeans"
	"github.com/arjunsk/kmeans/elkans"
)

func main() {
	// Twelve 4-dimensional vectors forming two obvious groups:
	// first component 1 versus first component 10.
	vectorList := [][]float64{
		{1, 2, 3, 4},
		{1, 2, 4, 5},
		{1, 2, 4, 5},
		{1, 2, 3, 4},
		{1, 2, 4, 5},
		{1, 2, 4, 5},
		{10, 2, 4, 5},
		{10, 3, 4, 5},
		{10, 5, 4, 5},
		{10, 2, 4, 5},
		{10, 3, 4, 5},
		{10, 5, 4, 5},
	}

	// Build an Elkan's Kmeans clusterer for 2 clusters, using L2 distance
	// and Kmeans++ initialization.
	clusterer, err := elkans.NewKMeans(vectorList, 2,
		500, 0.5,
		kmeans.L2Distance, kmeans.KmeansPlusPlus, false)
	if err != nil {
		panic(err)
	}

	// Run the clustering and collect the resulting centroids.
	centroids, err := clusterer.Cluster()
	if err != nil {
		panic(err)
	}

	for _, centroid := range centroids {
		fmt.Println(centroid)
	}
	/*
		[1 2 3.6666666666666665 4.666666666666666]
		[10 3.333333333333333 4 5]
	*/
}
```

### FAQ
#### What is the ideal centroid count?
Based on the recommendations for the [PGVector](https://github.com/pgvector/pgvector/tree/master#ivfflat) IVFFlat index,
the ideal K can be chosen as follows:

> Choose an appropriate number of K - a good place to start is rows / 1000 for up to 1M rows and
> sqrt(rows) for over 1M rows
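
The rule of thumb above can be sketched as a small helper. Note that `suggestK` is a hypothetical function written for illustration and is not part of this library. Returning at least 1 for small datasets is also an assumption of this sketch, since `rows / 1000` would otherwise round down to 0 for fewer than 1000 rows.

```go
package main

import (
	"fmt"
	"math"
)

// suggestK is a hypothetical helper (not part of arjunsk/kmeans) that applies
// the quoted starting point: rows/1000 for up to 1M rows, sqrt(rows) above that.
// Assumption: the result is clamped to at least 1 for any non-empty dataset.
func suggestK(rows int) int {
	if rows <= 0 {
		return 0 // nothing to cluster
	}
	var k int
	if rows <= 1_000_000 {
		k = rows / 1000
	} else {
		k = int(math.Sqrt(float64(rows)))
	}
	if k < 1 {
		k = 1
	}
	return k
}

func main() {
	for _, rows := range []int{12, 50_000, 1_000_000, 4_000_000} {
		fmt.Printf("rows=%d -> K=%d\n", rows, suggestK(rows))
	}
}
```

The value returned is only a starting point. The 12-vector example in the Usage section deliberately uses K = 2 because the data visibly forms two groups.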