https://github.com/joeylemon/python-kmeans

an implementation of the K-means clustering algorithm in Python

# python-kmeans

An implementation of the [K-means clustering](https://en.wikipedia.org/wiki/K-means_clustering) unsupervised machine learning algorithm used to reduce the number of colors required to represent an image.



## Motivation

As students in COSC425: Introduction to Machine Learning at the [University of Tennessee](https://utk.edu/), we were tasked with implementing the K-means algorithm from scratch in Python. We then used the algorithm to determine the best set of RGB colors to represent a given image. Finally, we analyzed the algorithm's performance by observing how quickly the cluster centroids converged and how the pixels were distributed among the clusters. We ran the algorithm on each image with K values of 4, 16, and 32. We capped each run at 24 iterations and used a maximum RGB value delta of 1 as the convergence threshold.
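The procedure described above can be sketched with NumPy as follows. This is a minimal illustration, not the code in `kmeans.py`: the function name `kmeans_quantize`, its parameters, the random initialization, and the reading of the convergence rule (stop once no centroid channel moves by more than `tol`) are all assumptions made for the example.

```py
import numpy as np


def kmeans_quantize(pixels, k, max_iters=24, tol=1.0, seed=0):
    """Cluster an (N, 3) array of RGB pixels into k colors.

    Hypothetical helper: returns (centroids, labels, iterations_run).
    """
    rng = np.random.default_rng(seed)
    pixels = np.asarray(pixels, dtype=np.float64)
    # Assumption: initialize with the colors at k distinct random pixel positions.
    centroids = pixels[rng.choice(len(pixels), size=k, replace=False)]

    for iteration in range(1, max_iters + 1):
        # Squared distance from every pixel to every centroid, shape (N, k).
        dists = ((pixels[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)

        # Move each centroid to the mean of its pixels; leave it in place if empty.
        new_centroids = centroids.copy()
        for j in range(k):
            members = pixels[labels == j]
            if len(members) > 0:
                new_centroids[j] = members.mean(axis=0)

        # Largest change in any R, G, or B value of any centroid this iteration.
        delta = np.abs(new_centroids - centroids).max()
        centroids = new_centroids
        if delta <= tol:
            break

    return centroids, labels, iteration


# Usage on a small random "image"; a real image would be loaded from disk instead.
image = np.random.default_rng(1).integers(0, 256, size=(20, 20, 3))
centroids, labels, n_iter = kmeans_quantize(image.reshape(-1, 3), k=4)
reduced = centroids[labels].reshape(image.shape).round().astype(np.uint8)
```

The distance step builds an `(N, k, 3)` array, which is fine for a sketch but may need chunking for large images.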

## Analysis

For the image above of [Smokey](https://en.wikipedia.org/wiki/Smokey_(mascot)), the mascot of the University of Tennessee, we can plot how quickly each K value reaches cluster convergence. We can also observe how the pixels are distributed among the clusters:
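The pixel counts behind such a distribution can be computed from the per-pixel cluster labels. The sketch below assumes a 1-D integer array `labels` with one cluster index per pixel; the helper name `cluster_distribution` is hypothetical and not taken from the repository.

```py
import numpy as np


def cluster_distribution(labels, k):
    """Return the pixel count and pixel fraction for clusters 0..k-1."""
    # minlength keeps empty clusters in the result with a count of 0.
    counts = np.bincount(labels, minlength=k)
    fractions = counts / counts.sum()
    return counts, fractions


labels = np.array([0, 2, 2, 1, 2, 0, 2, 2])
counts, fractions = cluster_distribution(labels, k=4)
print(counts.tolist())     # [2, 1, 5, 0]
print(fractions.tolist())  # [0.25, 0.125, 0.625, 0.0]
```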



As a further step toward understanding how our K-means algorithm reaches a conclusion, we can plot the image's RGB values in 3D and observe how the cluster centroids converge as the algorithm iterates. The figure below shows a separate run of the algorithm with K=4 on the image of Smokey:
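A plot of this kind could be produced with matplotlib roughly as follows. This is a sketch under assumptions, not the repository's plotting code: `plot_centroid_paths` is a hypothetical helper, and `history` is assumed to be an array of shape `(T, K, 3)` holding the K centroids after each of T iterations.

```py
import matplotlib
matplotlib.use("Agg")  # headless-safe backend; remove this line to view the plot interactively
import matplotlib.pyplot as plt
import numpy as np


def plot_centroid_paths(pixels, history):
    """Plot pixels in RGB space together with the path each centroid took.

    pixels:  (N, 3) array of RGB values in 0..255.
    history: (T, K, 3) array of centroid positions per iteration (assumed layout).
    """
    pixels = np.asarray(pixels, dtype=np.float64)
    history = np.asarray(history, dtype=np.float64)

    fig = plt.figure()
    ax = fig.add_subplot(projection="3d")

    # Draw each pixel at its (R, G, B) position, tinted with its own color.
    ax.scatter(pixels[:, 0], pixels[:, 1], pixels[:, 2],
               c=pixels / 255.0, s=4, alpha=0.3)

    # One line per centroid, tracing its position across iterations.
    for j in range(history.shape[1]):
        path = history[:, j, :]
        ax.plot(path[:, 0], path[:, 1], path[:, 2], color="black", marker="o")

    ax.set_xlabel("R")
    ax.set_ylabel("G")
    ax.set_zlabel("B")
    return fig


# Placeholder data for illustration only, not real K-means output.
rng = np.random.default_rng(0)
pixels = rng.integers(0, 256, size=(500, 3))
history = rng.uniform(0, 255, size=(5, 4, 3))
fig = plot_centroid_paths(pixels, history)
```

Calling `fig.savefig(...)` with a file name of your choice writes the figure to disk.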



## How to Run

To run the program on other images and K values, edit the main function of `kmeans.py` with the appropriate values:

```py
if __name__ == "__main__":
    k_values = [
        {"K": 4, "color": "blue", "xticks": [1, 2, 3, 4]},
        {"K": 16, "color": "red", "xticks": [1, 8, 16]},
        {"K": 32, "color": "green", "xticks": [1, 8, 16, 32]}
    ]

    perform_comparison("images/baboon.jpeg", k_values)
    perform_comparison("images/rocket.jpeg", k_values)
    perform_comparison("images/smokey.jpeg", k_values)
    perform_comparison("images/truck.jpeg", k_values)
```

Then, execute the script:
```sh
python kmeans.py
```

The program is accompanied by a unit test to ensure the image reduction is working correctly. You can run the test with:
```sh
python test.py
```
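The contents of `test.py` are not shown here, so the following is only a sketch of the kind of properties a color-reduction test might check: the reduced image keeps the original shape and contains at most K distinct colors. The helpers `apply_palette` and `check_reduction` and the fixed palette are hypothetical stand-ins for illustration.

```py
import numpy as np


def apply_palette(image, palette):
    """Replace every pixel with its nearest palette color (hypothetical helper)."""
    pixels = image.reshape(-1, 3).astype(np.float64)
    # Squared distance from every pixel to every palette color, shape (N, K).
    dists = ((pixels[:, None, :] - palette[None, :, :]) ** 2).sum(axis=2)
    return palette[dists.argmin(axis=1)].reshape(image.shape)


def check_reduction(original, reduced, k):
    """Assert that a reduced image has the same shape and at most k colors."""
    assert reduced.shape == original.shape
    unique_colors = np.unique(reduced.reshape(-1, 3), axis=0)
    assert len(unique_colors) <= k


# A fixed 4-color palette stands in for centroids found by K-means.
palette = np.array([[0, 0, 0], [255, 0, 0], [0, 255, 0], [255, 255, 255]],
                   dtype=np.float64)
image = np.random.default_rng(0).integers(0, 256, size=(8, 8, 3))
reduced = apply_palette(image, palette)
check_reduction(image, reduced, k=len(palette))
```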