Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/t-ski/clustering-algorithms
Pure TypeScript implementations of common clustering algorithms.
https://github.com/t-ski/clustering-algorithms
clustering machine-learning typescript unsupervised-learning
Last synced: about 10 hours ago
JSON representation
Pure TypeScript implementations of common clustering algorithms.
- Host: GitHub
- URL: https://github.com/t-ski/clustering-algorithms
- Owner: t-ski
- License: mit
- Created: 2024-05-30T22:16:34.000Z (6 months ago)
- Default Branch: main
- Last Pushed: 2024-06-16T11:57:49.000Z (5 months ago)
- Last Synced: 2024-06-16T19:56:54.087Z (5 months ago)
- Topics: clustering, machine-learning, typescript, unsupervised-learning
- Language: TypeScript
- Homepage:
- Size: 2.32 MB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# Clustering Algorithms
Pure TypeScript implementations of common clustering algorithms.
[1. Interface](#interface)
|
[2. Algorithms](#algorithms)
|
[3. Utilities](#utilities)## Interface
``` ts
import * as clA from "t-ski/clustering-algorithms"
`````` ts
type TVector = number[]
type TMatrix[] = number[][]
type TCluster = TVector[]interface AClustering {
clusters: O[]
}
```## Algorithms
![Vector-based Plots](./examples.vector.png)
``` ts
clA.(data: I[], ...args): AClustering
```### Centroid-based
``` ts
clA.(data: number[]|TVector[], k: number = 2): AClustering
`````` ts
new clA.KMeans(DATA, 3).clusters;
```#### k-Means
`O(n²) | Θ(n)`
``` ts
clA.KMeans(data, k?)
```#### k-Means++
`O(n²) | Θ(n)`
``` ts
clA.KMeansPP(data, k?)
```#### k-Medoids
`O(n²) | Θ(n)`
``` ts
clA.KMedoids(data, k?)
```---
### Density-based
``` ts
clA.(data: number[]|TVector[], epsilon?: number): AClustering
```> If not defined, the epsilon parameter is estimated from the given data.
``` ts
new clA.MeanShift(DATA, 2).clusters;
```#### MeanShift
`O = Θ(n²)`
``` ts
clA.MeanShift(data, epsilon?): AClustering & { noise: TCluster }
```#### DBSCAN
`O(n²) | Θ(n log n)`
``` ts
clA.DBSCAN(data, epsilon?, minPoints: number = 4): AClustering & { noise: TCluster }
```#### OPTICS
`O(n²) | Θ(n log n)`
``` ts
clA.OPTICS(data, epsilon?, minPoints: number = 4): AClustering & { noise: TCluster }
```---
### Graph-based
![Graph-based Plots](./examples.graph.png)
``` ts
clA.(adjacencyMatrix: TMatrix, k: number = 2): AClustering
```> Input to graph-based clustering is a target graph's adjacency matrix. Output corresponds to clusters of related graph node indexes.
``` ts
new clA.ConnectedComponents(DATA).clusters;
```#### Connected Components
`O = Θ(n³)`
``` ts
clA.ConnectedComponents(adjacencyMatrix)
```#### Divisive Clustering
`O = Θ(n³)`
``` ts
clA.Divisive(adjacencyMatrix, k: number = 2)
```#### Markov Clustering (MCL)
`O = Θ(n³)`
``` ts
clA.Markov(adjacencyMatrix, e: number = 2, r: number = 2)
```---
### Hierarchy-based
``` ts
clA.(data: number[]|TVector[], k: number = 2): AClustering
`````` ts
new clA.AverageLinkage(DATA, 4).clusters;
```#### Average Linkage
`O = Θ(n³)`
``` ts
clA.AverageLinkage(data, k?)
```#### Centroid Linkage
`O = Θ(n³)`
``` ts
clA.CentroidLinkage(data, k?)
```#### Complete Linkage
`O = Θ(n³)`
``` ts
clA.CompleteLinkage(data, k?)
```#### Hausdorff Linkage
`O = Θ(n³)`
``` ts
clA.HausdorffLinkage(data, k?)
```#### Median Linkage
`O = Θ(n³)`
``` ts
clA.MedianLinkage(data, k?)
```#### Single Linkage
`O = Θ(n³)`
``` ts
clA.SingleLinkage(data, k?)
```#### Ward
`O = Θ(n³)`
``` ts
clA.Ward(data, k?)
```## Utilities
### Distance Metrics
`f: (ℝ×…×ℝ)² ↦ ℕ`
``` ts
clA.util.distance.(p1: TVector, p2: TVector = []): number
`````` ts
clA.AClustering.setDistanceMetric(util.distance.manhattan);const clusters = new clA.KMeans(DATA, 3).clusters;
```#### Euclidean (2-Norm) `default`
``` ts
clA.util.distance.euclidean(p1: TVector, p2: TVector = []): number
```#### Manhattan
``` ts
clA.util.distance.manhattan(p1: TVector, p2: TVector = []): number
```#### Chebyshev
``` ts
clA.util.distance.chebyshev(p1: TVector, p2: TVector = []): number
```#### Cosine (normal)
``` ts
clA.util.distance.cosine(p1: TVector, p2: TVector = []): number
```---
### Quality Metrics
``` ts
clA.util.quality.(clusters: TCluster[]): number
`````` ts
const clusters = new clA.KMeans(DATA, 3).clusters;const clusteringQuality = clA.util.quality.silhouetteCoefficient(clusters);
```#### Silhouette Coefficient
`f: ℕ×(ℝ×…×ℝ) ↦ [-1,1] → 1`
``` ts
clA.util.quality.silhouetteCoefficient(clusters: TCluster[]): number
```#### Dunn Index
`f: ℕ×(ℝ×…×ℝ) ↦ [0,∞) → ∞`
``` ts
clA.util.quality.dunnIndex(clusters: TCluster[]): number
```#### Davies-Bouldin Index
`f: ℕ×(ℝ×…×ℝ) ↦ [0,∞) → 0`
``` ts
clA.util.quality.daviesBouldinIndex(clusters: TCluster[]): number
```---
### Vector Data Map
The `VectorDataMap` utility class helps with associating arbitrary entites with data points ([view example](./example/example.js)).
``` ts
type TDataPoint = [ TVector, T ]new clA.util.VectorDataMap(data: TDataPoint[]): {
vectors: TVector[];
getEntity: (vector: TVector) => T|T[];
getVector: (entity: T) => TVector;
getCluster: (clusters: TCluster[], vector: TVector) => number|number[];
getCluster: (clusters: TCluster[], entity: T) => number|number[];
}
```##
© Thassilo Martin Schiepanski