https://github.com/chen0040/java-clustering

Package provides java implementation of various clustering algorithms
clustering-algorithm dbscan dbscan-clustering hierarchical-clustering k-means

Host: GitHub
URL: https://github.com/chen0040/java-clustering
Owner: chen0040
License: mit
Created: 2017-05-28T20:02:10.000Z (over 8 years ago)
Default Branch: master
Last Pushed: 2017-06-05T03:21:44.000Z (over 8 years ago)
Last Synced: 2025-07-24T03:53:59.240Z (5 months ago)
Topics: clustering-algorithm, dbscan, dbscan-clustering, hierarchical-clustering, k-means
Language: Java
Size: 130 KB
Stars: 11
Watchers: 4
Forks: 6
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

# java-clustering

This package provides Java implementations of various clustering algorithms.

[![Build Status](https://travis-ci.org/chen0040/java-clustering.svg?branch=master)](https://travis-ci.org/chen0040/java-clustering) [![Coverage Status](https://coveralls.io/repos/github/chen0040/java-clustering/badge.svg?branch=master)](https://coveralls.io/github/chen0040/java-clustering?branch=master)

 

# Features

* Hierarchical Clustering

* KMeans Clustering

* DBSCAN

* Single Linkage Clustering

 

# Install

Add the following dependency to your POM file:

```xml
<dependency>
  <groupId>com.github.chen0040</groupId>
  <artifactId>java-clustering</artifactId>
  <version>1.0.3</version>
</dependency>
```

### Spatial Segmentation using Hierarchical Clustering

The following sample code shows how to use hierarchical clustering to separate two clusters:

```java
// randn() and rand(a, b) are random-number helpers that are not defined in this snippet.
DataQuery.DataFrameQueryBuilder schema = DataQuery.blank()
      .newInput("c1")
      .newInput("c2")
      .newOutput("designed")
      .end();

// Samples labelled 0.0: noise centred on 2 (even index) or 4 (odd index)
Sampler.DataSampleBuilder negativeSampler = new Sampler()
      .forColumn("c1").generate((name, index) -> randn() * 0.3 + (index % 2 == 0 ? 2 : 4))
      .forColumn("c2").generate((name, index) -> randn() * 0.3 + (index % 2 == 0 ? 2 : 4))
      .forColumn("designed").generate((name, index) -> 0.0)
      .end();

// Samples labelled 1.0
Sampler.DataSampleBuilder positiveSampler = new Sampler()
      .forColumn("c1").generate((name, index) -> rand(-4, -2))
      .forColumn("c2").generate((name, index) -> rand(-2, -4))
      .forColumn("designed").generate((name, index) -> 1.0)
      .end();

DataFrame data = schema.build();
data = negativeSampler.sample(data, 50);
data = positiveSampler.sample(data, 50);

System.out.println(data.head(10));

HierarchicalClustering algorithm = new HierarchicalClustering();
// linkageCriterion is not defined in this snippet; supply the linkage criterion to use
algorithm.setLinkage(linkageCriterion);
algorithm.setClusterCount(2);

DataFrame learnedData = algorithm.fitAndTransform(data);

// Print the learned cluster id next to the known label for every row
for(int i = 0; i < learnedData.rowCount(); ++i){
   DataRow tuple = learnedData.row(i);
   String clusterId = tuple.getCategoricalTargetCell("cluster");
   System.out.println("learned: " + clusterId + "\tknown: " + tuple.target());
}
```
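The loop above prints each learned cluster id next to the known label. One common way to condense that comparison into a single number is cluster purity: for each cluster, count the rows that carry its most frequent known label, then divide the total by the number of rows. The sketch below is plain Java and does not depend on this package. The class name `ClusterPurity` and its `purity` method are hypothetical, and the ids and labels are passed as string arrays rather than read from a `DataFrame`.

```java
import java.util.HashMap;
import java.util.Map;

/** Minimal sketch (not part of java-clustering): scores agreement between cluster ids and known labels. */
final class ClusterPurity {

    /** Returns the fraction of rows whose known label is the majority label of their cluster. */
    static double purity(String[] learned, String[] known) {
        if (learned.length != known.length || learned.length == 0) {
            throw new IllegalArgumentException("arrays must be non-empty and of equal length");
        }
        // counts.get(cluster).get(label) = number of rows with that (cluster, label) pair
        Map<String, Map<String, Integer>> counts = new HashMap<>();
        for (int i = 0; i < learned.length; i++) {
            counts.computeIfAbsent(learned[i], k -> new HashMap<>())
                  .merge(known[i], 1, Integer::sum);
        }
        // Sum the size of the largest label group inside each cluster
        int majorityTotal = 0;
        for (Map<String, Integer> labelCounts : counts.values()) {
            int best = 0;
            for (int c : labelCounts.values()) {
                best = Math.max(best, c);
            }
            majorityTotal += best;
        }
        return (double) majorityTotal / learned.length;
    }

    public static void main(String[] args) {
        String[] learned = {"0", "0", "1", "1"};
        String[] known   = {"a", "a", "b", "a"};
        // Cluster "0" contributes 2 rows, cluster "1" contributes 1 row: 3 / 4
        System.out.println(purity(learned, known)); // prints 0.75
    }
}
```

A purity of 1.0 means every cluster contains a single known label. Note that purity rises trivially as the number of clusters grows, so it is only meaningful when the cluster count is fixed, as it is in the example above.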

### Spatial Segmentation using EM Clustering

The following sample code shows how to use EM clustering to separate two clusters:

```java
// randn() and rand(a, b) are random-number helpers that are not defined in this snippet.
DataQuery.DataFrameQueryBuilder schema = DataQuery.blank()
      .newInput("c1")
      .newInput("c2")
      .newOutput("designed")
      .end();

// Samples labelled 0.0: noise centred on 2 (even index) or 4 (odd index)
Sampler.DataSampleBuilder negativeSampler = new Sampler()
      .forColumn("c1").generate((name, index) -> randn() * 0.3 + (index % 2 == 0 ? 2 : 4))
      .forColumn("c2").generate((name, index) -> randn() * 0.3 + (index % 2 == 0 ? 2 : 4))
      .forColumn("designed").generate((name, index) -> 0.0)
      .end();

// Samples labelled 1.0
Sampler.DataSampleBuilder positiveSampler = new Sampler()
      .forColumn("c1").generate((name, index) -> rand(-4, -2))
      .forColumn("c2").generate((name, index) -> rand(-2, -4))
      .forColumn("designed").generate((name, index) -> 1.0)
      .end();

DataFrame data = schema.build();
data = negativeSampler.sample(data, 50);
data = positiveSampler.sample(data, 50);

System.out.println(data.head(10));

EMClustering algorithm = new EMClustering();
algorithm.setSigma0(1.5);
algorithm.setClusterCount(2);

DataFrame learnedData = algorithm.fitAndTransform(data);

// Print the learned cluster id next to the known label for every row
for(int i = 0; i < learnedData.rowCount(); ++i){
   DataRow tuple = learnedData.row(i);
   String clusterId = tuple.getCategoricalTargetCell("cluster");
   System.out.println("learned: " + clusterId + "\tknown: " + tuple.target());
}
```

### Spatial Segmentation using Single Linkage Clustering

The following sample code shows how to use single linkage clustering to separate two clusters:

```java
// randn() and rand(a, b) are random-number helpers that are not defined in this snippet.
DataQuery.DataFrameQueryBuilder schema = DataQuery.blank()
      .newInput("c1")
      .newInput("c2")
      .newOutput("designed")
      .end();

// Samples labelled 0.0: noise centred on 2 (even index) or 4 (odd index)
Sampler.DataSampleBuilder negativeSampler = new Sampler()
      .forColumn("c1").generate((name, index) -> randn() * 0.3 + (index % 2 == 0 ? 2 : 4))
      .forColumn("c2").generate((name, index) -> randn() * 0.3 + (index % 2 == 0 ? 2 : 4))
      .forColumn("designed").generate((name, index) -> 0.0)
      .end();

// Samples labelled 1.0
Sampler.DataSampleBuilder positiveSampler = new Sampler()
      .forColumn("c1").generate((name, index) -> rand(-4, -2))
      .forColumn("c2").generate((name, index) -> rand(-2, -4))
      .forColumn("designed").generate((name, index) -> 1.0)
      .end();

DataFrame data = schema.build();
data = negativeSampler.sample(data, 50);
data = positiveSampler.sample(data, 50);

System.out.println(data.head(10));

SingleLinkageClustering algorithm = new SingleLinkageClustering();
algorithm.setClusterCount(2);

DataFrame learnedData = algorithm.fitAndTransform(data);

// Print the learned cluster id next to the known label for every row
for(int i = 0; i < learnedData.rowCount(); ++i){
   DataRow tuple = learnedData.row(i);
   String clusterId = tuple.getCategoricalTargetCell("cluster");
   System.out.println("learned: " + clusterId + "\tknown: " + tuple.target());
}
```
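For readers unfamiliar with the term, single linkage measures the distance between two clusters as the smallest distance between any point in one and any point in the other. The following is a minimal, self-contained sketch of that definition using Euclidean distance. It is not taken from this package's source: the class `SingleLinkageDistance` and its methods are hypothetical names chosen for illustration, and the package may compute the criterion differently.

```java
/** Minimal sketch (not part of java-clustering): single-linkage distance between two point sets. */
final class SingleLinkageDistance {

    /** Smallest Euclidean distance between any point of a and any point of b. */
    static double distance(double[][] a, double[][] b) {
        double best = Double.POSITIVE_INFINITY;
        for (double[] p : a) {
            for (double[] q : b) {
                best = Math.min(best, euclidean(p, q));
            }
        }
        return best;
    }

    /** Euclidean distance between two points of equal dimension. */
    static double euclidean(double[] p, double[] q) {
        double sum = 0.0;
        for (int i = 0; i < p.length; i++) {
            double d = p[i] - q[i];
            sum += d * d;
        }
        return Math.sqrt(sum);
    }

    public static void main(String[] args) {
        double[][] left  = {{0, 0}, {1, 0}};
        double[][] right = {{4, 0}, {6, 0}};
        // Closest pair is (1,0) and (4,0)
        System.out.println(distance(left, right)); // prints 3.0
    }
}
```

An agglomerative procedure based on this criterion repeatedly merges the two clusters with the smallest such distance until the requested number of clusters remains.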

### Spatial Segmentation using DBSCAN

The following sample code shows how to use DBSCAN to perform clustering:

```java
// randn() and rand(a, b) are random-number helpers that are not defined in this snippet.
DataQuery.DataFrameQueryBuilder schema = DataQuery.blank()
      .newInput("c1")
      .newInput("c2")
      .newOutput("designed")
      .end();

// Samples labelled 0.0: noise centred on 2 (even index) or 4 (odd index)
Sampler.DataSampleBuilder negativeSampler = new Sampler()
      .forColumn("c1").generate((name, index) -> randn() * 0.3 + (index % 2 == 0 ? 2 : 4))
      .forColumn("c2").generate((name, index) -> randn() * 0.3 + (index % 2 == 0 ? 2 : 4))
      .forColumn("designed").generate((name, index) -> 0.0)
      .end();

// Samples labelled 1.0
Sampler.DataSampleBuilder positiveSampler = new Sampler()
      .forColumn("c1").generate((name, index) -> rand(-4, -2))
      .forColumn("c2").generate((name, index) -> rand(-2, -4))
      .forColumn("designed").generate((name, index) -> 1.0)
      .end();

DataFrame data = schema.build();
data = negativeSampler.sample(data, 200);
data = positiveSampler.sample(data, 200);

System.out.println(data.head(10));

DBSCAN algorithm = new DBSCAN();
algorithm.setEpsilon(0.5);

DataFrame learnedData = algorithm.fitAndTransform(data);

// Print the learned cluster id next to the known label for every row
for(int i = 0; i < learnedData.rowCount(); ++i){
   DataRow tuple = learnedData.row(i);
   String clusterId = tuple.getCategoricalTargetCell("cluster");
   System.out.println("learned: " + clusterId + "\tknown: " + tuple.target());
}
```
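The only parameter set in the example above is epsilon. In DBSCAN generally, epsilon is the radius of the neighbourhood searched around each point, and a point counts as a core point when that neighbourhood holds at least a minimum number of points. The sketch below illustrates the idea for 2-D points in plain Java. It is an assumption-laden illustration, not this package's implementation: the class `EpsilonNeighborhood`, its methods, and the `minPts` parameter are hypothetical names, and the example above does not show how the package configures the minimum count.

```java
import java.util.ArrayList;
import java.util.List;

/** Minimal sketch (not part of java-clustering): epsilon-neighbourhood query for 2-D points. */
final class EpsilonNeighborhood {

    /** Indices of all points within Euclidean distance epsilon of points[index], including itself. */
    static List<Integer> neighbors(double[][] points, int index, double epsilon) {
        List<Integer> result = new ArrayList<>();
        for (int j = 0; j < points.length; j++) {
            double dx = points[index][0] - points[j][0];
            double dy = points[index][1] - points[j][1];
            if (Math.sqrt(dx * dx + dy * dy) <= epsilon) {
                result.add(j);
            }
        }
        return result;
    }

    /** A point is a core point when its neighbourhood holds at least minPts points (hypothetical parameter). */
    static boolean isCorePoint(double[][] points, int index, double epsilon, int minPts) {
        return neighbors(points, index, epsilon).size() >= minPts;
    }

    public static void main(String[] args) {
        double[][] points = {{0, 0}, {0.3, 0}, {0, 0.4}, {3, 3}};
        System.out.println(neighbors(points, 0, 0.5));      // prints [0, 1, 2]
        System.out.println(isCorePoint(points, 3, 0.5, 3)); // prints false (isolated point)
    }
}
```

A smaller epsilon shrinks each neighbourhood, which tends to split dense regions into more clusters and mark more points as noise; a larger epsilon tends to merge neighbouring clusters.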

### Image Segmentation (Clustering) using KMeans

The following sample code shows how to use KMeans to perform image segmentation:

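```java
import java.awt.image.BufferedImage;
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import javax.imageio.ImageIO;

BufferedImage img = ImageIO.read(FileUtils.getResource("1.jpg"));

// Build a data frame from the image and cluster it
DataFrame dataFrame = ImageDataFrameFactory.dataFrame(img);
KMeans cluster = new KMeans();
DataFrame learnedData = cluster.fitAndTransform(dataFrame);

// for(int i=0; i ... (the rest of this loop is truncated in the source)

// `rand` is not declared in the original snippet; a java.util.Random is assumed here
Random rand = new Random();

// 25 random opaque colours, used to paint the clusters
List<Integer> classColors = new ArrayList<>();
for(int i=0; i < 5; ++i){
   for(int j=0; j < 5; ++j){
      classColors.add(ImageDataFrameFactory.get_rgb(255, rand.nextInt(255), rand.nextInt(255), rand.nextInt(255)));
   }
}

// Recolour every pixel with the colour of the cluster it is assigned to
BufferedImage segmented_image = new BufferedImage(img.getWidth(), img.getHeight(), img.getType());
for(int x=0; x < img.getWidth(); x++)
{
   for(int y=0; y < img.getHeight(); y++)
   {
      int rgb = img.getRGB(x, y);
      DataRow tuple = ImageDataFrameFactory.getPixelTuple(x, y, rgb);
      int clusterIndex = cluster.transform(tuple);
      rgb = classColors.get(clusterIndex % classColors.size());
      segmented_image.setRGB(x, y, rgb);
   }
}
```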
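In the example above, `cluster.transform(tuple)` maps a pixel to a cluster index. In k-means generally, that assignment step picks the centroid closest to the point. The sketch below shows the step in plain Java for illustration only. The class `NearestCentroid`, its `assign` method, and the representation of pixels as RGB triples are assumptions and are not taken from this package, whose feature encoding and distance measure may differ.

```java
/** Minimal sketch (not part of java-clustering): the k-means assignment step. */
final class NearestCentroid {

    /** Returns the index of the centroid with the smallest squared Euclidean distance to point. */
    static int assign(double[] point, double[][] centroids) {
        int best = -1;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int k = 0; k < centroids.length; k++) {
            double dist = 0.0;
            for (int i = 0; i < point.length; i++) {
                double d = point[i] - centroids[k][i];
                dist += d * d;
            }
            // Strict comparison keeps the lowest index when two centroids tie
            if (dist < bestDist) {
                bestDist = dist;
                best = k;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Two hypothetical centroids in RGB space: black and white
        double[][] centroids = {{0, 0, 0}, {255, 255, 255}};
        System.out.println(assign(new double[]{10, 20, 30}, centroids));    // prints 0
        System.out.println(assign(new double[]{200, 220, 240}, centroids)); // prints 1
    }
}
```

Squared distance is used because it preserves the ordering of distances while avoiding a square root per comparison.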
