https://github.com/surpsg/kombi
https://github.com/surpsg/kombi
cartesian-product combination java kotlin performance subsets
Last synced: 11 months ago
JSON representation
- Host: GitHub
- URL: https://github.com/surpsg/kombi
- Owner: SurpSG
- License: apache-2.0
- Created: 2018-03-25T17:30:26.000Z (about 8 years ago)
- Default Branch: master
- Last Pushed: 2021-08-13T21:08:27.000Z (almost 5 years ago)
- Last Synced: 2023-10-20T23:17:52.494Z (over 2 years ago)
- Topics: cartesian-product, combination, java, kotlin, performance, subsets
- Language: Kotlin
- Size: 301 KB
- Stars: 7
- Watchers: 4
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# Kombi
[](https://jitpack.io/#surpsg/kombi)
[](https://github.com/SurpSG/Kombi/actions/workflows/ci.yml)
[](https://codecov.io/gh/SurpSG/Kombi)
[](https://github.com/surpsg/kombi/issues)
[](https://github.com/surpsg/kombi/stargazers)

**Kombi** is able to generate
* all possible [combinations](https://en.wikipedia.org/wiki/Combination) from given set of items
* [cartesian product](https://en.wikipedia.org/wiki/Cartesian_product) from given sets of items
**Kombi** advantages:
* All items computations are lazy that provides a small memory footprint
* tiny library(~30Kb)
* no external dependencies
To get even better throughput(see [Benchmarking](#benchmarking) section), computations can be easily parallelized by splitting to equal independent subsets(see [Parallel computation](#parallel-computation) section).
## Installation
### Maven
```xml
jitpack.io
https://jitpack.io
com.github.surpsg
kombi
3.1.0
```
### Gradle
```groovy
repositories {
maven { url 'https://jitpack.io' }
}
dependencies {
compile 'com.github.surpsg:kombi:3.1.0'
}
```
## Combinations
`Combination` is an iterable object, so you can use it in 'for-each' loop.
To get items lazily you can use the iterator by calling `iterator()` function or you can get them as the stream by calling `stream()` function.
### Usage for lists
```java
List inputData = Arrays.asList("A", "B", "C");
Combination> combinations = CombinationsBuilder.combinationsOf(inputData);
combinations.stream().forEach(System.out::println);
```
The output:
```
[A]
[B]
[A, B]
[C]
[A, C]
[B, C]
[A, B, C]
```
### Usage for maps
```java
Map data = new HashMap<>();
data.put(1, "1");
data.put(2, "2");
data.put(3, "3");
Combination> cartesianProduct = CombinationsBuilder.combinationsOf(data);
cartesianProduct.forEach(System.out::println);
```
The output:
```
{1=1}
{2=2}
{1=1, 2=2}
{3=3}
{1=1, 3=3}
{2=2, 3=3}
{1=1, 2=2, 3=3}
```
## Cartesian product
`CartesianProduct` is an iterable object, so you can use it in 'for-each' loop.
To get items lazily you can use the iterator by calling `iterator()` function or you can get them as the stream by calling `stream()` function.
### Usage for lists
```java
List> data = Arrays.asList(
Arrays.asList(1, 2, 3),
Arrays.asList(4, 5),
Arrays.asList(6)
);
CartesianProduct> cartesianProduct = CartesianBuilder.cartesianProductOf(data, false);
cartesianProduct.forEach(System.out::println);
```
The output:
```
[1, 4, 6]
[1, 5, 6]
[2, 4, 6]
[2, 5, 6]
[3, 4, 6]
[3, 5, 6]
```
### Usage for maps
```java
Map> data = new HashMap<>();
data.put(1, Arrays.asList(1, 2, 3));
data.put(2, Arrays.asList(4, 5));
data.put(3, Arrays.asList(6));
CartesianProduct> cartesianProduct = CartesianBuilder.cartesianProductOf(data, true);
cartesianProduct.forEach(System.out::println);
```
The output:
```
{1=1, 2=4, 3=6}
{1=1, 2=5, 3=6}
{1=2, 2=4, 3=6}
{1=2, 2=5, 3=6}
{1=3, 2=4, 3=6}
{1=3, 2=5, 3=6}
```
### Performance tip
There is an overloaded builder method `CartesianBuilder.cartesianProductOf(..., boolean keepOrder)` that accepts boolean parameter `keepOrder`. By default the parameter is `false` that provides a little bit better performance. See (see [Benchmarking](#benchmarking) section) to compare.
## Parallel computation
```java
// generate data for combinations
List data = IntStream.range(0, 10).boxed().collect(Collectors.toList());
Combination> combinations = CombinationsBuilder.combinationsOf(data);
int threads = 4;
ExecutorService threadPool = Executors.newFixedThreadPool(threads);
AtomicInteger count = new AtomicInteger();
// split to equal parts to compute parallelly
List>> subSets = combinations.split( threads );
for (Combination> subSet : subSets) {
threadPool.submit(() -> {
for (List combination : subSet) {
// uncomment line below to print computed combination
// System.out.println(combination);
count.incrementAndGet();
}
});
}
threadPool.shutdown();
threadPool.awaitTermination(20, TimeUnit.SECONDS);
System.out.println("Combinations: " + count + ", expectedCombinations: "+combinations.getCombinationsNumber());
```
## Benchmarking
Measured time of generation of combination/cartesian product items (microseconds to generate all items)
Feel free to run benchmarks by yourself:
```
./gradlew clean kombi-jmh:jmh
```
Benchmark results(less is better):
```
Ubuntu 18.04.4 LTS
Intel® Core™ i7-6500U CPU @ 2.50GHz × 4
# JMH version: 1.22
# VM version: JDK 1.8.0_242, OpenJDK 64-Bit Server VM, 25.242-b08
# VM invoker: /usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java
# VM options: -Xms512m -Xmx1g
# Warmup: 5 iterations, 10 s each
# Measurement: 5 iterations, 10 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Average time, time/op
Benchmark (itemsQuantity) Mode Cnt Score Error Units
c.s.b.cartesian.CartesianListBenchmark.Guava_cartesianProduct_Lists 3 avgt 10 0.396 ± 0.002 us/op
c.s.b.cartesian.CartesianListBenchmark.Guava_cartesianProduct_Lists 5 avgt 10 8.592 ± 0.190 us/op
c.s.b.cartesian.CartesianListBenchmark.Guava_cartesianProduct_Lists 7 avgt 10 507.613 ± 3.286 us/op
c.s.b.cartesian.CartesianListBenchmark.Guava_cartesianProduct_Lists 11 avgt 10 6047357.993 ± 11218.642 us/op
c.s.b.cartesian.CartesianListBenchmark.Kombi_cartesianProduct_Lists 3 avgt 10 0.363 ± 0.005 us/op
c.s.b.cartesian.CartesianListBenchmark.Kombi_cartesianProduct_Lists 5 avgt 10 6.838 ± 0.176 us/op
c.s.b.cartesian.CartesianListBenchmark.Kombi_cartesianProduct_Lists 7 avgt 10 374.914 ± 69.084 us/op
c.s.b.cartesian.CartesianListBenchmark.Kombi_cartesianProduct_Lists 11 avgt 10 3360446.209 ± 45311.037 us/op
c.s.b.cartesian.CartesianListBenchmark.Guava_cartesianProduct_Sets 3 avgt 10 0.815 ± 0.066 us/op
c.s.b.cartesian.CartesianListBenchmark.Guava_cartesianProduct_Sets 5 avgt 10 15.611 ± 0.578 us/op
c.s.b.cartesian.CartesianListBenchmark.Guava_cartesianProduct_Sets 7 avgt 10 645.309 ± 43.153 us/op
c.s.b.cartesian.CartesianListBenchmark.Guava_cartesianProduct_Sets 11 avgt 10 7492806.803 ± 137744.113 us/op
c.s.b.cartesian.CartesianListBenchmark.Kombi_cartesianProduct_Lists_keepingOrder 3 avgt 10 0.449 ± 0.059 us/op
c.s.b.cartesian.CartesianListBenchmark.Kombi_cartesianProduct_Lists_keepingOrder 5 avgt 10 8.432 ± 0.260 us/op
c.s.b.cartesian.CartesianListBenchmark.Kombi_cartesianProduct_Lists_keepingOrder 7 avgt 10 407.532 ± 1.454 us/op
c.s.b.cartesian.CartesianListBenchmark.Kombi_cartesianProduct_Lists_keepingOrder 11 avgt 10 4061078.368 ± 42057.603 us/op
c.s.b.cartesian.CartesianMapBenchmark.Kombi_cartesianProduct_Maps 5 avgt 10 18.971 ± 1.902 us/op
c.s.b.cartesian.CartesianMapBenchmark.Kombi_cartesianProduct_Maps 7 avgt 10 1050.295 ± 20.054 us/op
c.s.b.cartesian.CartesianMapBenchmark.Kombi_cartesianProduct_Maps_keepingOrder 5 avgt 10 19.619 ± 2.603 us/op
c.s.b.cartesian.CartesianMapBenchmark.Kombi_cartesianProduct_Maps_keepingOrder 7 avgt 10 1212.077 ± 139.650 us/op
c.s.b.combination.CombinationBenchmark.Kombi_combinations_list 11 avgt 10 216.704 ± 3.633 us/op
c.s.b.combination.CombinationBenchmark.Kombi_combinations_list 19 avgt 10 77641.630 ± 1561.054 us/op
c.s.b.combination.CombinationBenchmark.Kombi_combinations_map 11 avgt 10 467.513 ± 2.014 us/op
c.s.b.combination.CombinationBenchmark.Kombi_combinations_map 19 avgt 10 170390.506 ± 3108.922 us/op
```
Comparing performance with Guava(microseconds per generation, less is better):
