https://github.com/mdh266/kmeans
Creating A Scikit-Learn Compatible Clustering Algorithm
- Host: GitHub
- URL: https://github.com/mdh266/kmeans
- Owner: mdh266
- Created: 2022-05-05T02:44:14.000Z (about 4 years ago)
- Default Branch: main
- Last Pushed: 2022-05-26T00:19:39.000Z (about 4 years ago)
- Last Synced: 2025-03-26T13:22:24.237Z (about 1 year ago)
- Topics: algorithms, clustering, data-science, machine-learning, machine-learning-algorithms, scikit-learn, unsupervised-learning
- Language: Jupyter Notebook
- Homepage:
- Size: 305 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Writing A Scikit-Learn Compatible Clustering Algorithm
-----------------------
## About
---------
In this post, I will go over how to write a K-means clustering algorithm from scratch using [NumPy](https://numpy.org/). The algorithm will be explained in the next section and, while seemingly simple, it can be tricky to implement efficiently! As an added bonus, I will go over how to implement a [Scikit-Learn](https://scikit-learn.org/stable/) compatible clustering algorithm so that we can use Scikit-Learn's framework, including [Pipelines](https://scikit-learn.org/stable/modules/generated/sklearn.pipeline.Pipeline.html) and [GridSearchCV](https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.GridSearchCV.html).
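
To give a rough idea of what "Scikit-Learn compatible" means in practice, here is a minimal sketch of a K-means estimator (Lloyd's algorithm) written with NumPy. It is not the class from the notebook: the name `SimpleKMeans`, its parameters, and the random initialization are assumptions made for illustration. The compatibility comes from subclassing `BaseEstimator` and `ClusterMixin`, storing only hyperparameters in `__init__`, and exposing learned attributes with a trailing underscore.

```python
import numpy as np
from sklearn.base import BaseEstimator, ClusterMixin
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


class SimpleKMeans(BaseEstimator, ClusterMixin):
    """Hypothetical minimal K-means; an illustration, not the notebook's class."""

    def __init__(self, n_clusters=3, max_iter=100, tol=1e-6, random_state=None):
        # Scikit-Learn convention: __init__ only stores hyperparameters.
        self.n_clusters = n_clusters
        self.max_iter = max_iter
        self.tol = tol
        self.random_state = random_state

    def _distances(self, X, centers):
        # Squared Euclidean distances via broadcasting,
        # result has shape (n_samples, n_clusters).
        return ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)

    def fit(self, X, y=None):
        X = np.asarray(X, dtype=float)
        rng = np.random.default_rng(self.random_state)
        # Assumed initialization: pick distinct data points at random.
        idx = rng.choice(X.shape[0], size=self.n_clusters, replace=False)
        centers = X[idx].copy()
        for _ in range(self.max_iter):
            labels = self._distances(X, centers).argmin(axis=1)
            new_centers = centers.copy()
            for k in range(self.n_clusters):
                members = X[labels == k]
                if len(members) > 0:  # keep the old center if a cluster is empty
                    new_centers[k] = members.mean(axis=0)
            shift = np.linalg.norm(new_centers - centers)
            centers = new_centers
            if shift < self.tol:
                break
        # Learned attributes end with an underscore by convention.
        self.cluster_centers_ = centers
        self.labels_ = self._distances(X, centers).argmin(axis=1)
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        return self._distances(X, self.cluster_centers_).argmin(axis=1)


# Usage: the estimator can be dropped into a Pipeline like any other step.
X_demo = np.array([[0.0], [1.0], [2.0], [10.0], [11.0], [12.0]])
pipe = make_pipeline(StandardScaler(), SimpleKMeans(n_clusters=2, random_state=0))
pipe_labels = pipe.fit(X_demo).predict(X_demo)
```

`ClusterMixin` supplies `fit_predict`, and `BaseEstimator` supplies `get_params` and `set_params`, which is what lets tools such as `Pipeline` clone and configure the estimator. Using it with `GridSearchCV` would additionally require a `score` method or an explicit `scoring` argument, which this sketch leaves out.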
## Using The Notebook
----------
You can install the dependencies and access the notebook using Docker. First, build the Docker image:

    docker build -t kmeans .

Then run the container:

    docker run -ip 8888:8888 -v `pwd`:/home/jovyan -t kmeans

See here for more info.
Without Docker, make sure to use Python 3.9 and install the libraries listed in `requirements.txt`. These can be installed with the command:

    pip install -r requirements.txt