https://github.com/chen0040/java-clustering
Package provides java implementation of various clustering algorithms
- Host: GitHub
- URL: https://github.com/chen0040/java-clustering
- Owner: chen0040
- License: mit
- Created: 2017-05-28T20:02:10.000Z (over 8 years ago)
- Default Branch: master
- Last Pushed: 2017-06-05T03:21:44.000Z (over 8 years ago)
- Last Synced: 2025-07-24T03:53:59.240Z (4 months ago)
- Topics: clustering-algorithm, dbscan, dbscan-clustering, hierarchical-clustering, k-means
- Language: Java
- Size: 130 KB
- Stars: 11
- Watchers: 4
- Forks: 6
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# java-clustering
This package provides Java implementations of various clustering algorithms.
[Build Status](https://travis-ci.org/chen0040/java-clustering) [Coverage Status](https://coveralls.io/github/chen0040/java-clustering?branch=master)
# Features
* Hierarchical Clustering
* KMeans Clustering
* DBSCAN
* Single Linkage Clustering
* EM Clustering
# Install
Add the following dependency to your POM file:
```xml
<dependency>
  <groupId>com.github.chen0040</groupId>
  <artifactId>java-clustering</artifactId>
  <version>1.0.3</version>
</dependency>
```
### Spatial Segmentation using Hierarchical Clustering
The following sample code shows how to use hierarchical clustering to separate two clusters:
```java
// Two input columns plus one output column holding the known group
DataQuery.DataFrameQueryBuilder schema = DataQuery.blank()
    .newInput("c1")
    .newInput("c2")
    .newOutput("designed")
    .end();

// First group, labelled 0.0: normally distributed noise around 2 (even index) or 4 (odd index)
Sampler.DataSampleBuilder negativeSampler = new Sampler()
    .forColumn("c1").generate((name, index) -> randn() * 0.3 + (index % 2 == 0 ? 2 : 4))
    .forColumn("c2").generate((name, index) -> randn() * 0.3 + (index % 2 == 0 ? 2 : 4))
    .forColumn("designed").generate((name, index) -> 0.0)
    .end();

// Second group, labelled 1.0: values drawn with rand(...) between -4 and -2
Sampler.DataSampleBuilder positiveSampler = new Sampler()
    .forColumn("c1").generate((name, index) -> rand(-4, -2))
    .forColumn("c2").generate((name, index) -> rand(-2, -4))
    .forColumn("designed").generate((name, index) -> 1.0)
    .end();

// Draw 50 rows from each sampler
DataFrame data = schema.build();
data = negativeSampler.sample(data, 50);
data = positiveSampler.sample(data, 50);

System.out.println(data.head(10));

HierarchicalClustering algorithm = new HierarchicalClustering();
// linkageCriterion is not defined in this snippet; supply the linkage option you want to use
algorithm.setLinkage(linkageCriterion);
algorithm.setClusterCount(2);

DataFrame learnedData = algorithm.fitAndTransform(data);

// Compare the learned cluster id with the known group of each row
for (int i = 0; i < learnedData.rowCount(); ++i) {
    DataRow tuple = learnedData.row(i);
    String clusterId = tuple.getCategoricalTargetCell("cluster");
    System.out.println("learned: " + clusterId + "\tknown: " + tuple.target());
}
```
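The `linkageCriterion` passed to `setLinkage` is not defined in this README. As background, agglomerative clustering repeatedly merges the two closest clusters, and the linkage criterion decides what "closest" means. The following plain-Java sketch is independent of this library (the class `LinkageSketch` and its methods are hypothetical names chosen for illustration) and computes three common criteria for two small clusters:

```java
final class LinkageSketch {
    // Euclidean distance between two points of the same dimension.
    static double distance(double[] p, double[] q) {
        double sum = 0.0;
        for (int i = 0; i < p.length; i++) {
            double diff = p[i] - q[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    // Single linkage: distance of the closest cross-cluster pair.
    static double single(double[][] a, double[][] b) {
        double best = Double.POSITIVE_INFINITY;
        for (double[] p : a) {
            for (double[] q : b) {
                best = Math.min(best, distance(p, q));
            }
        }
        return best;
    }

    // Complete linkage: distance of the farthest cross-cluster pair.
    static double complete(double[][] a, double[][] b) {
        double worst = 0.0;
        for (double[] p : a) {
            for (double[] q : b) {
                worst = Math.max(worst, distance(p, q));
            }
        }
        return worst;
    }

    // Average linkage: mean distance over all cross-cluster pairs.
    static double average(double[][] a, double[][] b) {
        double total = 0.0;
        for (double[] p : a) {
            for (double[] q : b) {
                total += distance(p, q);
            }
        }
        return total / (a.length * b.length);
    }

    public static void main(String[] args) {
        double[][] a = {{0, 0}, {1, 0}};
        double[][] b = {{3, 0}, {5, 0}};
        System.out.println("single   = " + single(a, b));   // 2.0
        System.out.println("complete = " + complete(a, b)); // 5.0
        System.out.println("average  = " + average(a, b));  // 3.5
    }
}
```

Which of these (or other) criteria the library's `HierarchicalClustering` accepts is not stated here, so check the library source before relying on a particular option.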
### Spatial Segmentation using EM Clustering
The following sample code shows how to use EM clustering to separate two clusters:
```java
// Two input columns plus one output column holding the known group
DataQuery.DataFrameQueryBuilder schema = DataQuery.blank()
    .newInput("c1")
    .newInput("c2")
    .newOutput("designed")
    .end();

// First group, labelled 0.0: normally distributed noise around 2 (even index) or 4 (odd index)
Sampler.DataSampleBuilder negativeSampler = new Sampler()
    .forColumn("c1").generate((name, index) -> randn() * 0.3 + (index % 2 == 0 ? 2 : 4))
    .forColumn("c2").generate((name, index) -> randn() * 0.3 + (index % 2 == 0 ? 2 : 4))
    .forColumn("designed").generate((name, index) -> 0.0)
    .end();

// Second group, labelled 1.0: values drawn with rand(...) between -4 and -2
Sampler.DataSampleBuilder positiveSampler = new Sampler()
    .forColumn("c1").generate((name, index) -> rand(-4, -2))
    .forColumn("c2").generate((name, index) -> rand(-2, -4))
    .forColumn("designed").generate((name, index) -> 1.0)
    .end();

// Draw 50 rows from each sampler
DataFrame data = schema.build();
data = negativeSampler.sample(data, 50);
data = positiveSampler.sample(data, 50);

System.out.println(data.head(10));

EMClustering algorithm = new EMClustering();
algorithm.setSigma0(1.5);
algorithm.setClusterCount(2);

DataFrame learnedData = algorithm.fitAndTransform(data);

// Compare the learned cluster id with the known group of each row
for (int i = 0; i < learnedData.rowCount(); ++i) {
    DataRow tuple = learnedData.row(i);
    String clusterId = tuple.getCategoricalTargetCell("cluster");
    System.out.println("learned: " + clusterId + "\tknown: " + tuple.target());
}
```
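For readers unfamiliar with EM clustering, the idea can be sketched in plain Java without this library. The example below is a deliberately simplified illustration, not the library's implementation: it fits a two-component, one-dimensional Gaussian mixture with equal weights and a fixed, shared standard deviation. The class `EmSketch` and its methods are hypothetical, and how the library actually uses the value given to `setSigma0` is not documented in this README.

```java
final class EmSketch {
    // Unnormalised Gaussian kernel; the normalising constant cancels
    // because both components share the same sigma and weight.
    static double kernel(double x, double mean, double sigma) {
        double z = (x - mean) / sigma;
        return Math.exp(-0.5 * z * z);
    }

    // Runs EM for a fixed number of iterations and returns the two estimated means.
    // Assumes every point has non-negligible density under at least one component.
    static double[] fitMeans(double[] xs, double mean0, double mean1, double sigma, int iterations) {
        double[] means = {mean0, mean1};
        for (int it = 0; it < iterations; it++) {
            double sum0 = 0.0, sum1 = 0.0, weight0 = 0.0, weight1 = 0.0;
            for (double x : xs) {
                // E-step: soft assignment (responsibility) of x to each component.
                double k0 = kernel(x, means[0], sigma);
                double k1 = kernel(x, means[1], sigma);
                double r0 = k0 / (k0 + k1);
                double r1 = 1.0 - r0;
                sum0 += r0 * x;
                weight0 += r0;
                sum1 += r1 * x;
                weight1 += r1;
            }
            // M-step: each mean becomes the responsibility-weighted average.
            means[0] = sum0 / weight0;
            means[1] = sum1 / weight1;
        }
        return means;
    }

    public static void main(String[] args) {
        double[] xs = {-3.1, -3.0, -2.9, 2.9, 3.0, 3.1};
        double[] means = fitMeans(xs, -1.0, 1.0, 1.5, 20);
        // The two means should end up close to -3 and 3.
        System.out.println("mean0 = " + means[0] + ", mean1 = " + means[1]);
    }
}
```

Unlike k-means, each point contributes to every cluster in proportion to its responsibility, which is why the estimated means are pulled very slightly toward each other.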
### Spatial Segmentation using Single Linkage Clustering
The following sample code shows how to use single linkage clustering to separate two clusters:
```java
// Two input columns plus one output column holding the known group
DataQuery.DataFrameQueryBuilder schema = DataQuery.blank()
    .newInput("c1")
    .newInput("c2")
    .newOutput("designed")
    .end();

// First group, labelled 0.0: normally distributed noise around 2 (even index) or 4 (odd index)
Sampler.DataSampleBuilder negativeSampler = new Sampler()
    .forColumn("c1").generate((name, index) -> randn() * 0.3 + (index % 2 == 0 ? 2 : 4))
    .forColumn("c2").generate((name, index) -> randn() * 0.3 + (index % 2 == 0 ? 2 : 4))
    .forColumn("designed").generate((name, index) -> 0.0)
    .end();

// Second group, labelled 1.0: values drawn with rand(...) between -4 and -2
Sampler.DataSampleBuilder positiveSampler = new Sampler()
    .forColumn("c1").generate((name, index) -> rand(-4, -2))
    .forColumn("c2").generate((name, index) -> rand(-2, -4))
    .forColumn("designed").generate((name, index) -> 1.0)
    .end();

// Draw 50 rows from each sampler
DataFrame data = schema.build();
data = negativeSampler.sample(data, 50);
data = positiveSampler.sample(data, 50);

System.out.println(data.head(10));

SingleLinkageClustering algorithm = new SingleLinkageClustering();
algorithm.setClusterCount(2);

DataFrame learnedData = algorithm.fitAndTransform(data);

// Compare the learned cluster id with the known group of each row
for (int i = 0; i < learnedData.rowCount(); ++i) {
    DataRow tuple = learnedData.row(i);
    String clusterId = tuple.getCategoricalTargetCell("cluster");
    System.out.println("learned: " + clusterId + "\tknown: " + tuple.target());
}
```
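To make the merging behaviour concrete, here is a naive single linkage clustering written in plain Java. It is a sketch for illustration only and does not reflect how `SingleLinkageClustering` is implemented in this library; the class `SingleLinkageSketch` is a hypothetical name, and the cubic-time search is chosen for readability rather than speed.

```java
final class SingleLinkageSketch {
    // Euclidean distance between two points of the same dimension.
    static double distance(double[] p, double[] q) {
        double sum = 0.0;
        for (int i = 0; i < p.length; i++) {
            double diff = p[i] - q[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    // Returns one label per point. Labels are arbitrary integers, not 0..k-1.
    static int[] cluster(double[][] points, int k) {
        if (k < 1) {
            throw new IllegalArgumentException("k must be at least 1");
        }
        int n = points.length;
        int[] label = new int[n];
        for (int i = 0; i < n; i++) {
            label[i] = i; // every point starts in its own cluster
        }
        int clusters = n;
        while (clusters > k) {
            // Find the closest pair of points that belong to different clusters.
            int bestI = -1, bestJ = -1;
            double best = Double.POSITIVE_INFINITY;
            for (int i = 0; i < n; i++) {
                for (int j = i + 1; j < n; j++) {
                    if (label[i] == label[j]) continue;
                    double d = distance(points[i], points[j]);
                    if (d < best) {
                        best = d;
                        bestI = i;
                        bestJ = j;
                    }
                }
            }
            // Merge the cluster of bestJ into the cluster of bestI.
            int from = label[bestJ];
            int to = label[bestI];
            for (int i = 0; i < n; i++) {
                if (label[i] == from) label[i] = to;
            }
            clusters--;
        }
        return label;
    }

    public static void main(String[] args) {
        double[][] points = {{2, 2}, {2.1, 2}, {4, 4}, {-3, -3}, {-3.2, -3}};
        int[] label = cluster(points, 2);
        for (int i = 0; i < label.length; i++) {
            System.out.println("point " + i + " -> cluster " + label[i]);
        }
    }
}
```

In this run the three points with positive coordinates end up in one cluster and the two with negative coordinates in the other, because the gap between the groups is larger than any gap inside them.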
### Spatial Segmentation using DBSCAN
The following sample code shows how to use DBSCAN to perform clustering:
```java
// Two input columns plus one output column holding the known group
DataQuery.DataFrameQueryBuilder schema = DataQuery.blank()
    .newInput("c1")
    .newInput("c2")
    .newOutput("designed")
    .end();

// First group, labelled 0.0: normally distributed noise around 2 (even index) or 4 (odd index)
Sampler.DataSampleBuilder negativeSampler = new Sampler()
    .forColumn("c1").generate((name, index) -> randn() * 0.3 + (index % 2 == 0 ? 2 : 4))
    .forColumn("c2").generate((name, index) -> randn() * 0.3 + (index % 2 == 0 ? 2 : 4))
    .forColumn("designed").generate((name, index) -> 0.0)
    .end();

// Second group, labelled 1.0: values drawn with rand(...) between -4 and -2
Sampler.DataSampleBuilder positiveSampler = new Sampler()
    .forColumn("c1").generate((name, index) -> rand(-4, -2))
    .forColumn("c2").generate((name, index) -> rand(-2, -4))
    .forColumn("designed").generate((name, index) -> 1.0)
    .end();

// Draw 200 rows from each sampler
DataFrame data = schema.build();
data = negativeSampler.sample(data, 200);
data = positiveSampler.sample(data, 200);

System.out.println(data.head(10));

DBSCAN algorithm = new DBSCAN();
algorithm.setEpsilon(0.5);

DataFrame learnedData = algorithm.fitAndTransform(data);

// Compare the learned cluster id with the known group of each row
for (int i = 0; i < learnedData.rowCount(); ++i) {
    DataRow tuple = learnedData.row(i);
    String clusterId = tuple.getCategoricalTargetCell("cluster");
    System.out.println("learned: " + clusterId + "\tknown: " + tuple.target());
}
```
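Unlike the other algorithms above, DBSCAN takes a neighbourhood radius rather than a cluster count. The plain-Java sketch below shows the textbook procedure so that the role of epsilon is easier to see. It is not the library's code: the class `DbscanSketch` is hypothetical, and the `minPoints` parameter is an assumption of this sketch, since the README only shows `setEpsilon`.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

final class DbscanSketch {
    static final int NOISE = -1;
    static final int UNVISITED = -2;

    static double distance(double[] p, double[] q) {
        double sum = 0.0;
        for (int i = 0; i < p.length; i++) {
            double diff = p[i] - q[i];
            sum += diff * diff;
        }
        return Math.sqrt(sum);
    }

    // Indices of all points within epsilon of point i (including i itself).
    static List<Integer> regionQuery(double[][] points, int i, double epsilon) {
        List<Integer> result = new ArrayList<>();
        for (int j = 0; j < points.length; j++) {
            if (distance(points[i], points[j]) <= epsilon) result.add(j);
        }
        return result;
    }

    // Returns a cluster id (0, 1, ...) per point, or NOISE.
    static int[] cluster(double[][] points, double epsilon, int minPoints) {
        int[] label = new int[points.length];
        Arrays.fill(label, UNVISITED);
        int clusterId = 0;
        for (int i = 0; i < points.length; i++) {
            if (label[i] != UNVISITED) continue;
            List<Integer> neighbours = regionQuery(points, i, epsilon);
            if (neighbours.size() < minPoints) {
                label[i] = NOISE; // may later be claimed as a border point
                continue;
            }
            // Point i is a core point: grow a new cluster from it.
            label[i] = clusterId;
            Deque<Integer> queue = new ArrayDeque<>(neighbours);
            while (!queue.isEmpty()) {
                int j = queue.poll();
                if (label[j] == NOISE) label[j] = clusterId; // border point
                if (label[j] != UNVISITED) continue;
                label[j] = clusterId;
                List<Integer> reachable = regionQuery(points, j, epsilon);
                if (reachable.size() >= minPoints) queue.addAll(reachable);
            }
            clusterId++;
        }
        return label;
    }

    public static void main(String[] args) {
        double[][] points = {
            {0, 0}, {0.3, 0}, {0, 0.3},
            {5, 5}, {5.3, 5}, {5, 5.3},
            {10, 0}
        };
        // Prints [0, 0, 0, 1, 1, 1, -1]: two clusters and one noise point.
        System.out.println(Arrays.toString(cluster(points, 0.5, 3)));
    }
}
```

A smaller epsilon splits loosely packed groups into noise, while a larger one merges neighbouring groups, so the value `0.5` in the library example is tied to the spread of the sampled data.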
### Image Segmentation (Clustering) using KMeans
The following sample code shows how to use KMeans to perform image segmentation:
```java
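// JDK imports; FileUtils, ImageDataFrameFactory, KMeans, DataFrame and DataRow
// come from this library and its dependencies.
import java.awt.image.BufferedImage;
import java.util.ArrayList;
import java.util.List;
import java.util.Random;
import javax.imageio.ImageIO;

BufferedImage img = ImageIO.read(FileUtils.getResource("1.jpg"));

// Convert the image into a data frame and cluster it
DataFrame dataFrame = ImageDataFrameFactory.dataFrame(img);
KMeans cluster = new KMeans();
DataFrame learnedData = cluster.fitAndTransform(dataFrame);

// A loop over learnedData (`for(int i=0; i ...`) is truncated in the source at this point.

// Build a palette of 25 random colours; cluster indices are mapped onto it modulo its size
Random rand = new Random(); // not declared in the original snippet
List<Integer> classColors = new ArrayList<>();
for (int i = 0; i < 5; ++i) {
    for (int j = 0; j < 5; ++j) {
        classColors.add(ImageDataFrameFactory.get_rgb(255, rand.nextInt(255), rand.nextInt(255), rand.nextInt(255)));
    }
}

// Repaint every pixel with the colour assigned to its cluster
BufferedImage segmented_image = new BufferedImage(img.getWidth(), img.getHeight(), img.getType());
for (int x = 0; x < img.getWidth(); x++) {
    for (int y = 0; y < img.getHeight(); y++) {
        int rgb = img.getRGB(x, y);
        DataRow tuple = ImageDataFrameFactory.getPixelTuple(x, y, rgb);
        int clusterIndex = cluster.transform(tuple);
        rgb = classColors.get(clusterIndex % classColors.size());
        segmented_image.setRGB(x, y, rgb);
    }
}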
```
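The call `cluster.transform(tuple)` maps a pixel to the index of its cluster. As a rough illustration of what k-means does to produce that index, the plain-Java sketch below runs Lloyd's algorithm on a handful of RGB triples. It is independent of this library: `KMeansSketch`, its methods, and the hand-picked initial centroids are assumptions made for the example, and the library's `KMeans` class may initialise and iterate differently.

```java
import java.util.Arrays;

final class KMeansSketch {
    // Index of the centroid closest to p (squared Euclidean distance).
    static int nearest(double[] p, double[][] centroids) {
        int best = 0;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int c = 0; c < centroids.length; c++) {
            double dist = 0.0;
            for (int d = 0; d < p.length; d++) {
                double diff = p[d] - centroids[c][d];
                dist += diff * diff;
            }
            if (dist < bestDist) {
                bestDist = dist;
                best = c;
            }
        }
        return best;
    }

    // Lloyd's algorithm with caller-supplied initial centroids.
    static double[][] fit(double[][] points, double[][] initial, int iterations) {
        int k = initial.length;
        int dim = initial[0].length;
        double[][] centroids = new double[k][];
        for (int c = 0; c < k; c++) {
            centroids[c] = initial[c].clone();
        }
        for (int it = 0; it < iterations; it++) {
            double[][] sums = new double[k][dim];
            int[] counts = new int[k];
            // Assignment step: attach each point to its nearest centroid.
            for (double[] p : points) {
                int c = nearest(p, centroids);
                counts[c]++;
                for (int d = 0; d < dim; d++) {
                    sums[c][d] += p[d];
                }
            }
            // Update step: move each centroid to the mean of its points.
            for (int c = 0; c < k; c++) {
                if (counts[c] == 0) continue; // leave an empty cluster where it is
                for (int d = 0; d < dim; d++) {
                    centroids[c][d] = sums[c][d] / counts[c];
                }
            }
        }
        return centroids;
    }

    public static void main(String[] args) {
        // Two reddish and two bluish pixels as (r, g, b) triples.
        double[][] pixels = {{250, 10, 10}, {240, 20, 20}, {10, 10, 250}, {20, 20, 240}};
        double[][] initial = {{255, 0, 0}, {0, 0, 255}};
        double[][] centroids = fit(pixels, initial, 10);
        System.out.println(Arrays.deepToString(centroids));
        // A new reddish pixel is assigned to cluster 0.
        System.out.println(nearest(new double[] {200, 30, 30}, centroids));
    }
}
```

Replacing each pixel by a colour chosen per cluster index, as the library example does with `classColors`, then yields the segmented image.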