Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/jakevdp/mst_clustering
Scikit-learn style estimator for Minimum Spanning Tree Clustering in Python
https://github.com/jakevdp/mst_clustering
Last synced: 8 days ago
JSON representation
Scikit-learn style estimator for Minimum Spanning Tree Clustering in Python
- Host: GitHub
- URL: https://github.com/jakevdp/mst_clustering
- Owner: jakevdp
- License: bsd-2-clause
- Created: 2015-10-28T23:54:54.000Z (about 9 years ago)
- Default Branch: master
- Last Pushed: 2016-05-16T19:34:32.000Z (over 8 years ago)
- Last Synced: 2024-10-17T10:05:44.762Z (23 days ago)
- Language: Jupyter Notebook
- Homepage:
- Size: 439 KB
- Stars: 85
- Watchers: 8
- Forks: 19
- Open Issues: 1
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# Minimum Spanning Tree Clustering
[![build status](http://img.shields.io/travis/jakevdp/mst_clustering/master.svg?style=flat)](https://travis-ci.org/jakevdp/mst_clustering)
[![version status](http://img.shields.io/pypi/v/mst_clustering.svg?style=flat)](https://pypi.python.org/pypi/mst_clustering)
[![license](http://img.shields.io/badge/license-BSD-blue.svg?style=flat)](https://github.com/jakevdp/mst_clustering/blob/master/LICENSE)
[![DOI](https://zenodo.org/badge/doi/10.5281/zenodo.50995.svg)](http://dx.doi.org/10.5281/zenodo.50995)
[![JOSS](http://joss.theoj.org/papers/10.21105/joss.00012/status.svg)](http://joss.theoj.org/papers/10.21105/joss.00012)This package implements a simple scikit-learn style estimator for clustering
with a minimum spanning tree.## Motivation
Automated clustering can be an important means of identifying structure in data,
but many of the more popular clustering algorithms do not perform well in the
presence of background noise. The clustering algorithm implemented here, based
on a trimmed Euclidean Minimum Spanning Tree, can be useful in this case.## Example
The API of the ``mst_clustering`` code is designed for compatibility with
the [scikit-learn](http://scikit-learn.org) project.```python
from mst_clustering import MSTClustering
from sklearn.datasets import make_blobs
import matplotlib.pyplot as plt# create some data with four clusters
X, y = make_blobs(200, centers=4, random_state=42)# predict the labels with the MST algorithm
model = MSTClustering(cutoff_scale=2)
labels = model.fit_predict(X)# plot the results
plt.scatter(X[:, 0], X[:, 1], c=labels, cmap='rainbow');
```![Simple Clustering Plot](https://raw.githubusercontent.com/jakevdp/mst_clustering/master/images/SimpleClustering.png)
For a detailed explanation of the algorithm and a more interesting example of it in action, see the [MST Clustering Notebook](http://nbviewer.jupyter.org/github/jakevdp/mst_clustering/blob/master/MSTClustering.ipynb).
## Installation & Requirements
The ``mst_clustering`` package itself is fairly lightweight. It is tested on
Python 2.7 and 3.4-3.5, and depends on the following packages:- [numpy](http://numpy.org)
- [scipy](http://scipy.org)
- [scikit-learn](http://scikit-learn.org)Using the cross-platform [conda](http://conda.pydata.org/miniconda.html)
package manager, these requirements can be installed as follows:```
$ conda install numpy scipy scikit-learn
```Finally, the current release of ``mst_clustering`` can be installed using ``pip``:
```
$ conda install pip # if using conda
$ pip install mst_clustering
```To install ``mst_clustering`` from source, first download the source repository and then run
```
$ python setup.py install
```## Contributing & Reporting Issues
Bug reports, questions, suggestions, and contributions are welcome.
For these, please make use the
[Issues](https://github.com/jakevdp/mst_clustering/issues)
or [Pull Requests](https://github.com/jakevdp/mst_clustering/pulls)
associated with this repository.## Citing
If you use this code in an academic publication, please consider
citing this [JOSS Paper](http://joss.theoj.org/papers/10.21105/joss.00012).