https://github.com/hadjiprocopis/histocurse
A Java implementation of a multidimensional histogram backed by either a dense (conventional) array or a sparse array. It is particularly efficient when the number of dimensions is large and the backing store is a sparse array. This module depends on other projects from my GitHub repositories; see the README below for what you need to download.
- Host: GitHub
- URL: https://github.com/hadjiprocopis/histocurse
- Owner: hadjiprocopis
- Created: 2017-11-02T15:42:35.000Z (over 8 years ago)
- Default Branch: master
- Last Pushed: 2017-11-02T20:01:00.000Z (over 8 years ago)
- Last Synced: 2023-07-19T00:34:12.405Z (almost 3 years ago)
- Topics: data-analysis, data-structures, histogram, multidimensional
- Language: Java
- Homepage:
- Size: 1.15 MB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# histocurse
author: andreas hadjiprocopis (andreashad2@gmail.com)
This module implements a histogram in N dimensions: the bins are
not laid out along a single axis, as they usually are, but occupy
an N-dimensional grid.
N can be large. When N is large, it is better to use the
sparse-array backing store (`SparseArray`) rather than the dense one (`DenseArray`).
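To see why a sparse backing store matters for large N, here is a minimal sketch that is not taken from histocurse. The class `SparseCountsSketch` and its methods are hypothetical names invented for this illustration, and the real `SparseArray` container may work quite differently. The sketch stores a count only for bins that have actually been hit, so memory grows with the number of occupied bins rather than with the product of the bin counts over all dimensions.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical illustration only; this is NOT the histocurse SparseArray.
final class SparseCountsSketch {
    // Maps the integer coordinates of an occupied bin to its count.
    private final Map<List<Integer>, Long> counts = new HashMap<>();

    // Copy the coordinates into an immutable-by-convention list key.
    private static List<Integer> key(int[] binCoords) {
        return Arrays.stream(binCoords).boxed().collect(Collectors.toList());
    }

    // Increment the bin at the given coordinates and return its new count.
    long increment(int[] binCoords) {
        return counts.merge(key(binCoords), 1L, Long::sum);
    }

    // Count of a bin; bins that were never touched report zero.
    long count(int[] binCoords) {
        return counts.getOrDefault(key(binCoords), 0L);
    }

    // Number of bins that actually hold data.
    int occupiedBins() {
        return counts.size();
    }

    // Number of cells a dense array would need: binsPerDim ^ dims.
    static double denseCells(int binsPerDim, int dims) {
        return Math.pow(binsPerDim, dims);
    }

    public static void main(String[] args) {
        SparseCountsSketch h = new SparseCountsSketch();
        int[] bin = new int[20];   // 20 dimensions, all coordinates zero
        h.increment(bin);
        h.increment(bin);
        bin[19] = 9;               // a second, different bin
        h.increment(bin);
        System.out.println("occupied bins: " + h.occupiedBins());
        System.out.println("cells a dense array would need: " + denseCells(10, 20));
    }
}
```

With 20 dimensions and 10 bins per dimension, a dense array would need on the order of 10^20 cells, while the map above holds just two entries.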
This module depends on the following modules, each in its own
repository:
- containers (https://github.com/hadjiprocopis/containers)
- statistics (https://github.com/hadjiprocopis/statistics)
- cartesian (https://github.com/hadjiprocopis/cartesian)
- hadjiprocopis_utils (https://github.com/hadjiprocopis/hadjiprocopis_utils)
Here is an example of how to use this module:
```java
import java.util.Random;

import ahp.org.Histograms.*;
import ahp.org.Containers.*;

public class TestHistograms_SparseArray {
    public static void main(String[] args) throws Exception {
        // Here we create labels for each bin in the histogram.
        // We are going to create a 2-dimensional histogram with
        // 2 bins in each dimension.
        // Bin labels are optional.
        String[/*dims*/][/*num bins/labels per dim*/] labels =
            new String[][]{
                new String[]{"a", "b"},
                new String[]{"a", "b"},
            };

        // Here we create the 2-dimensional histogram.
        // The generic type arguments are omitted here (raw types, which is
        // an assumption and compiles with unchecked warnings).
        Histogram ahist = new Histogram(
            // histogram name
            "a histogram1",
            // labels (a string array) if any, otherwise defaults are created
            labels,
            // number of bins in each dimension; the array length gives
            // the number of dimensions
            new int[]{2, 2},
            // bin widths for each dimension
            new double[]{1, 1},
            // bins start from 0 in both dimensions
            new double[]{0, 0},
            // Some magic: specify which backing store you want
            // (can also be DenseArray.class). Keeping the generic type of a
            // nested generic in a class token needs a double cast, see:
            // http://stackoverflow.com/questions/37231043/how-to-keep-generic-type-of-nested-generics-with-class-tokens
            SparseArray.class
        );

        // Here we add data to the histogram.
        // This increments the count of the bin with this LABEL:
        ahist.increment_bin_count(new String[]{"a", "a"});
        // whereas this finds the bin from the coordinates of the data,
        // e.g. the point (0.1, 0.3) falls into bin (0, 0), given that the
        // bins start from 0 and have a width of 1 (see above).
        ahist.increment_bin_count(new double[]{0.1, 0.3});
        ahist.increment_bin_count(new double[]{0.1, 0.3});
        ahist.increment_bin_count(new double[]{0.1, 0.3});
        ahist.increment_bin_count(new double[]{0.1, 0.3});
        ahist.increment_bin_count(new double[]{0.1, 0.3});
        ahist.increment_bin_count(new double[]{0.2, 0.5});
        ahist.increment_bin_count(new double[]{0.3, 1.2});
        ahist.increment_bin_count(new double[]{0.4, 1.5});
        ahist.increment_bin_count(new double[]{1.1, 0.1});
        ahist.increment_bin_count(new double[]{1.2, 1.4});
        System.out.println("*******************");
        System.out.println("Histogram with preset-labels: " + ahist);

        // Here we decrement bin counts.
        System.out.println("******************* DECREMENTING ");
        ahist.decrement_bin_count(new double[]{1.2, 1.4});
        ahist.decrement_bin_count(new double[]{0.2, 0.2});
        ahist.decrement_bin_count(new double[]{0.2, 0.2});
        ahist.decrement_bin_count(new double[]{0.2, 0.2});
        ahist.decrement_bin_count(new double[]{0.2, 0.2});
        System.out.println("after decrement Histogram with preset-labels: " + ahist);

        // Here we look up bins in the histogram: we ask for the content
        // of a bin and get back a Histobin object.
        Histobin abin;
        // select by a coordinate inside a bin:
        abin = ahist.get_bin(new double[]{0.2, 0.2});
        System.out.println("Selected bin: " + abin);
        // select by bin label:
        abin = ahist.get_bin(new String[]{"a", "b"});
        System.out.println("Selected bin: " + abin);
        // select by bin coordinate (integer):
        abin = ahist.get_bin(new int[]{0, 1});
        System.out.println("Selected bin: " + abin);

        // This is where the magic happens: we sample from bins selected
        // by a wildcard (courtesy of the CARTESIAN module).
        Random arng = new Random(1234);
        HistogramSampler asampler = ahist.sampler(arng);
        // select all bins whose first label is 'a' and second is anything (*)
        asampler.select_bins_using_wildcard_labels(new String[]{"a", "*"});
        // and print one bin drawn at random from that selection:
        System.out.println("random bin with 'a-*': " + asampler.random_bin_from_selection());
    }
}
```
The above example can be run with:
```
ant clean && ant && ant TestHistograms_SparseArray
```
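The example's comments say that the point (0.1, 0.3) falls into bin (0, 0) when bins start at 0 and have width 1. A minimal sketch of that mapping follows. It assumes the usual rule floor((x - start) / width) per dimension; `BinIndexSketch` and `binOf` are hypothetical names, and returning `null` for out-of-range points is a choice made for this sketch, not necessarily what histocurse does.

```java
import java.util.Arrays;

// Hypothetical illustration only; not the histocurse implementation.
final class BinIndexSketch {
    // Returns the integer bin coordinates of a data point, or null if the
    // point lies outside the grid (assumed behaviour for this sketch).
    static int[] binOf(double[] point, double[] start, double[] width, int[] numBins) {
        int[] bin = new int[point.length];
        for (int d = 0; d < point.length; d++) {
            int i = (int) Math.floor((point[d] - start[d]) / width[d]);
            if (i < 0 || i >= numBins[d]) {
                return null;
            }
            bin[d] = i;
        }
        return bin;
    }

    public static void main(String[] args) {
        // Same grid as in the example: 2 x 2 bins, width 1, starting at 0.
        double[] start = {0, 0};
        double[] width = {1, 1};
        int[] numBins = {2, 2};
        System.out.println(Arrays.toString(binOf(new double[]{0.1, 0.3}, start, width, numBins))); // [0, 0]
        System.out.println(Arrays.toString(binOf(new double[]{0.3, 1.2}, start, width, numBins))); // [0, 1]
        System.out.println(Arrays.toString(binOf(new double[]{1.2, 1.4}, start, width, numBins))); // [1, 1]
    }
}
```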
There is more magic to this module which will be documented
in due time.
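Until then, the wildcard selection used at the end of the example (`{"a", "*"}`) can be pictured as a Cartesian product: each `*` is replaced by every label of its dimension, and every fixed label stays as it is. The sketch below is an assumption about the idea only; `WildcardSketch` and `expand` are hypothetical names and do not reflect the API of the cartesian module.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only; not the cartesian module's API.
final class WildcardSketch {
    // Expands a label pattern into every matching label tuple.
    // labels[d] holds the labels of dimension d; "*" matches all of them.
    static List<String[]> expand(String[][] labels, String[] pattern) {
        List<String[]> result = new ArrayList<>();
        result.add(new String[0]); // start from one empty tuple
        for (int d = 0; d < pattern.length; d++) {
            String[] choices = pattern[d].equals("*")
                ? labels[d]
                : new String[]{pattern[d]};
            List<String[]> next = new ArrayList<>();
            for (String[] prefix : result) {
                for (String choice : choices) {
                    // extend each partial tuple by one more dimension
                    String[] tuple = Arrays.copyOf(prefix, prefix.length + 1);
                    tuple[prefix.length] = choice;
                    next.add(tuple);
                }
            }
            result = next;
        }
        return result;
    }

    public static void main(String[] args) {
        String[][] labels = {{"a", "b"}, {"a", "b"}};
        for (String[] tuple : expand(labels, new String[]{"a", "*"})) {
            System.out.println(Arrays.toString(tuple)); // [a, a] then [a, b]
        }
    }
}
```

A sampler could then draw one of the expanded tuples at random, which is roughly what `random_bin_from_selection()` appears to do in the example.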