https://github.com/philmod/node-kmeans
Node.js asynchronous implementation of the clustering algorithm k-means.
- Host: GitHub
- URL: https://github.com/philmod/node-kmeans
- Owner: Philmod
- Created: 2012-10-21T15:43:12.000Z (about 12 years ago)
- Default Branch: master
- Last Pushed: 2022-12-30T17:54:49.000Z (about 2 years ago)
- Last Synced: 2024-12-31T09:13:44.029Z (11 days ago)
- Language: JavaScript
- Size: 310 KB
- Stars: 102
- Watchers: 6
- Forks: 26
- Open Issues: 14
Metadata Files:
- Readme: README.md
# node-kmeans
Node.js asynchronous implementation of the clustering algorithm k-means
![k-means](http://www.aishack.in/static/img/tut/kmeans-example.jpg)
## Installation
    $ npm install node-kmeans
## Example
```js
// Data source: LinkedIn
const data = [
  {'company': 'Microsoft', 'size': 91259, 'revenue': 60420},
  {'company': 'IBM', 'size': 400000, 'revenue': 98787},
  {'company': 'Skype', 'size': 700, 'revenue': 716},
  {'company': 'SAP', 'size': 48000, 'revenue': 11567},
  {'company': 'Yahoo!', 'size': 14000, 'revenue': 6426},
  {'company': 'eBay', 'size': 15000, 'revenue': 8700},
];

// Create the data 2D-array (vectors) describing the data
let vectors = new Array();
for (let i = 0; i < data.length; i++) {
  vectors[i] = [data[i]['size'], data[i]['revenue']];
}

const kmeans = require('node-kmeans');
kmeans.clusterize(vectors, {k: 4}, (err, res) => {
  if (err) console.error(err);
  else console.log('%o', res);
});
```
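The optional `distance` and `seed` options, and the shape of the result, can be illustrated with a small sketch. The `manhattan` function and the `summarize` helper below are hypothetical names introduced only for illustration and are not part of node-kmeans. The seed value `42` is arbitrary, and the commented-out call assumes the option names listed under Inputs.

```javascript
// Manhattan (L1) distance between two points of equal dimension.
// It follows the (a, b) => number shape given for the `distance` option.
function manhattan(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    sum += Math.abs(a[i] - b[i]);
  }
  return sum;
}

// Hypothetical helper: build one summary per cluster, using clusterInd
// to look up the original records that each cluster contains.
function summarize(res, data) {
  return res.map((c) => ({
    centroid: c.centroid,
    companies: c.clusterInd.map((i) => data[i].company),
  }));
}

// Intended use with the `data` and `vectors` arrays from the example above
// (requires `npm install node-kmeans`; left commented out so this block
// runs without the dependency):
//
// const kmeans = require('node-kmeans');
// kmeans.clusterize(vectors, {k: 2, distance: manhattan, seed: 42}, (err, res) => {
//   if (err) return console.error(err);
//   console.log('%o', summarize(res, data));
// });
```

Looking records up through `clusterInd` keeps the vectors purely numeric while still letting each cluster be reported by company name.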
## Inputs
- **vectors** is an n × m array (n rows: number of points, m columns: number of dimensions)
- **options** object:
  - **k**: number of clusters
  - **distance** (optional): custom distance function returning the distance between two points, `(a, b) => number`; *default*: Euclidean distance
  - **seed** (optional): value that can be provided to get repeatable cluster generation
- **callback**: Node-style callback taking an error argument and a result argument
## Outputs
An array of objects (one for each cluster) with the following properties:
- centroid : array of X elements (X = number of dimensions)
- cluster : array of the input vectors assigned to this cluster
- clusterInd : array of integers giving the indexes of those vectors in the input data
## To do
- Technique to avoid local optima (mutation, ...)
## Author
Philmod <[email protected]>