Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/tyler-hilbert/cuda-kmeans
K-Means written from scratch in CUDA
- Host: GitHub
- URL: https://github.com/tyler-hilbert/cuda-kmeans
- Owner: Tyler-Hilbert
- Created: 2024-08-07T21:28:30.000Z (6 months ago)
- Default Branch: main
- Last Pushed: 2025-01-29T07:11:04.000Z (17 days ago)
- Last Synced: 2025-01-29T08:23:00.627Z (17 days ago)
- Topics: cuda, kmeans-clustering, machine-learning, nsight
- Language: C++
- Homepage:
- Size: 328 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
# K-Means implemented from scratch using CUDA
## Usage (C++)
```C++
KMeans_CUDA model (data, N, D, K); // Where data is AoS, N is number of data points, D is number of dimensions and K is number of clusters
model.one_epoch(); // Trains one epoch
model.print_predictions(); // Prints the classifications. Can be commented out.
// printf ("Error: %f\n", model.compute_error()); // Uncomment to print error
```

## Kernels
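As a rough host-side picture of what the kernels in this section compute during one epoch, the following C++ sketch performs the same steps on the CPU. It is not the repository's kernel code: the `_host` function names, the centroid layout `centroids[c * d + j]`, the use of squared Euclidean distance, and the choice to leave empty clusters unchanged are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Returns the index of the centroid closest to one point.
// Assumes squared Euclidean distance and centroid c stored at centroids[c * d + j].
int nearest_centroid(const float *point, const float *centroids, int d, int k) {
    int best = 0;
    float best_dist = std::numeric_limits<float>::max();
    for (int c = 0; c < k; ++c) {
        float dist = 0.0f;
        for (int j = 0; j < d; ++j) {
            const float diff = point[j] - centroids[c * d + j];
            dist += diff * diff;
        }
        if (dist < best_dist) {
            best_dist = dist;
            best = c;
        }
    }
    return best;
}

// CPU analogue (hypothetical) of sum_and_count: accumulates, per cluster,
// the coordinate sums and the number of points assigned to it.
// data is AoS: point i occupies data[i * d] .. data[i * d + d - 1].
void sum_and_count_host(const std::vector<float> &data,
                        const std::vector<float> &centroids,
                        std::vector<float> &sum,
                        std::vector<int> &count,
                        int n, int d, int k) {
    assert(data.size() == static_cast<std::size_t>(n) * d);
    assert(centroids.size() == static_cast<std::size_t>(k) * d);
    sum.assign(static_cast<std::size_t>(k) * d, 0.0f);
    count.assign(k, 0);
    for (int i = 0; i < n; ++i) {
        const int c = nearest_centroid(&data[i * d], centroids.data(), d, k);
        ++count[c];
        for (int j = 0; j < d; ++j) {
            sum[c * d + j] += data[i * d + j];
        }
    }
}

// CPU analogue (hypothetical) of update_centroids: each centroid becomes the
// mean of its assigned points. Empty clusters are left unchanged (an assumption).
void update_centroids_host(std::vector<float> &centroids,
                           const std::vector<float> &sum,
                           const std::vector<int> &count,
                           int d, int k) {
    for (int c = 0; c < k; ++c) {
        if (count[c] == 0) {
            continue;
        }
        for (int j = 0; j < d; ++j) {
            centroids[c * d + j] = sum[c * d + j] / static_cast<float>(count[c]);
        }
    }
}
```

Calling `sum_and_count_host` followed by `update_centroids_host` corresponds to one epoch. A reference like this can be handy for checking GPU results on small inputs.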
```C++
// Updates each centroid using d_sum and d_count
// where the index is d * centroid number (out of k).
// d: number of dimensions
// k: number of clusters
__global__ void update_centroids(
float *d_centroids,
const float *d_sum,
const int *d_count,
int d,
int k
);
```

```C++
// Computes the sum (d_sum) and count (d_count)
// for each of the k clusters labeled in d_centroids.
// n: number of data points
// d: number of dimensions
// k: number of clusters
// Uses shared memory of 3*k*d
__global__ void sum_and_count(
const float *d_data,
const float *d_centroids,
float *d_sum,
int *d_count,
int n,
int d,
int k
);
```

```C++
// Computes error and updates d_error
// n: number of data points
// d: number of dimensions
// k: number of clusters
__global__ void calculate_error(
const float *d_data,
const float *d_centroids,
float *d_error,
int n,
int d,
int k
);
```

## Usage (Python Bindings)
```Python
# pip install pybind11
# (Look in the comments I left in the example if you also need to compile the library)
import KMeans_CUDA
import numpy as np
# Data
N=6
D=1
K=2
epochs=3
data = np.array([0.1, 0.2, 0.3, 1.5, 1.6, 1.7], dtype=np.float32) # AoS
# Constructor
KMeans_CUDA.KMeans_CUDA_constructor(data, N, D, K)
# Train
KMeans_CUDA.fit(epochs)
# Predict
print(KMeans_CUDA.predictions())
# DON'T FORGET TO DELETE MEMORY!
KMeans_CUDA.release_memory()
```

## Performance
For the most recent performance tests, see https://github.com/Tyler-Hilbert/CUDA-KMeans/tree/52db75728794449dc152989c648e03b632d24c08

## Performance compared to scikit-learn and ArrayFire
The benchmark below shows this implementation of K-Means outperforming scikit-learn and ArrayFire on an NVIDIA T4 GPU.
![CUDA KMeans Performance vs scikit-learn and ArrayFire](https://raw.githubusercontent.com/Tyler-Hilbert/CUDA-KMeans/main/Benchmark_Performance/Comparison.png)

## Usage Continued
To compile the test program:
$ git clone https://github.com/Tyler-Hilbert/CUDA-KMeans.git
$ cd CUDA-KMeans
$ nvcc main.cpp KMeans_CUDA.cu -o kmeans
$ ./kmeans
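
After building and running the test program, the result can be sanity-checked on the CPU. The sketch below computes one common K-Means error measure, the sum of squared distances from each point to its nearest centroid. Whether `compute_error()` uses exactly this definition is not stated in this README, so treat the function name `kmeans_error_host` (hypothetical), the centroid layout `centroids[c * d + j]`, and the definition itself as assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Sum of squared Euclidean distances from each point to its nearest centroid.
// data is AoS: point i occupies data[i * d] .. data[i * d + d - 1].
// This is an assumed definition of "error", not taken from the repository.
float kmeans_error_host(const std::vector<float> &data,
                        const std::vector<float> &centroids,
                        int n, int d, int k) {
    assert(data.size() == static_cast<std::size_t>(n) * d);
    assert(centroids.size() == static_cast<std::size_t>(k) * d);
    float total = 0.0f;
    for (int i = 0; i < n; ++i) {
        float best = std::numeric_limits<float>::max();
        for (int c = 0; c < k; ++c) {
            float dist = 0.0f;
            for (int j = 0; j < d; ++j) {
                const float diff = data[i * d + j] - centroids[c * d + j];
                dist += diff * diff;
            }
            if (dist < best) {
                best = dist;
            }
        }
        total += best;
    }
    return total;
}
```

If the GPU value differs from this reference by a constant factor, the library may be reporting a mean or a non-squared distance instead.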