https://github.com/akapich/clustermatic
Python AutoML library for clustering tasks
automl clustering machine-learning scikit-learn
- Host: GitHub
- URL: https://github.com/akapich/clustermatic
- Owner: AKapich
- Created: 2024-12-21T22:42:47.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-12-27T22:17:58.000Z (over 1 year ago)
- Last Synced: 2026-01-05T16:23:32.265Z (6 months ago)
- Topics: automl, clustering, machine-learning, scikit-learn
- Language: Python
- Homepage: https://pypi.org/project/clustermatic/
- Size: 826 KB
- Stars: 0
- Watchers: 1
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md

---


`clustermatic` is a Python library designed to accelerate clustering tasks using `scikit-learn`. It serves as a quick tool for selecting the optimal clustering algorithm and its hyperparameters, providing visualizations and metrics for comparison.
## Features
- **Clustering Algorithms**: Analyzes six clustering algorithms from `scikit-learn`:
  - `KMeans`
  - `DBSCAN`
  - `MiniBatchKMeans`
  - `AgglomerativeClustering`
  - `OPTICS`
  - `SpectralClustering`
- **Optimization Methods**: Includes Bayesian optimization and random search for hyperparameter tuning.
- **Flexible Preprocessing**: Lets users customize how the data is preprocessed, including scaling, normalization, and dimensionality reduction.
- **Evaluation Metrics**: Supports evaluation with `silhouette`, `calinski_harabasz`, and `davies_bouldin` scores.
- **Report Generation**: Generates reports in HTML format after optimization.
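
To make the optimization, preprocessing, and metric features above more concrete, here is a minimal sketch that uses plain `scikit-learn` only. It is not `clustermatic`'s internal implementation or API. The dataset (`make_blobs`), the choice of `StandardScaler`, the search range for `n_clusters`, and the use of the silhouette score to pick a winner are all assumptions made for illustration.

```python
import random

from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import (
    calinski_harabasz_score,
    davies_bouldin_score,
    silhouette_score,
)
from sklearn.preprocessing import StandardScaler

# Synthetic data with three groups (illustrative only).
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.5, random_state=0)

# Preprocessing: scale each feature to zero mean and unit variance.
X_scaled = StandardScaler().fit_transform(X)

# Random search: draw 5 distinct candidate values of n_clusters from 2..8.
rng = random.Random(0)
candidates = rng.sample(range(2, 9), k=5)

results = []
for k in candidates:
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X_scaled)
    results.append(
        {
            "n_clusters": k,
            # Higher is better for silhouette and Calinski-Harabasz.
            "silhouette": silhouette_score(X_scaled, labels),
            "calinski_harabasz": calinski_harabasz_score(X_scaled, labels),
            # Lower is better for Davies-Bouldin.
            "davies_bouldin": davies_bouldin_score(X_scaled, labels),
        }
    )

# Pick the candidate with the highest silhouette score (an assumed criterion).
best = max(results, key=lambda r: r["silhouette"])
print(best)
```

The three metrics do not always agree on a winner, which is one reason a side-by-side comparison across algorithms and hyperparameters is useful.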
## Installation
To install `clustermatic`, use pip:
```bash
pip install clustermatic
```
## Usage
For a quick start, use the following code snippet:
```python
from sklearn.datasets import make_moons

from clustermatic import AutoClusterizer

# Load data
X, _ = make_moons(n_samples=200, noise=0.1, random_state=42)

# Initialize AutoClusterizer
ac = AutoClusterizer()

# Fit the data
ac.fit(X)

# Generate report
ac.evaluate()
```
For a more detailed walkthrough, check out [this example Jupyter Notebook](https://github.com/AKapich/clustermatic/blob/main/examples/example.ipynb).