{"id":13694622,"url":"https://github.com/pivovarit/parallel-collectors","last_synced_at":"2026-05-01T07:05:07.689Z","repository":{"id":34088885,"uuid":"166660333","full_name":"pivovarit/parallel-collectors","owner":"pivovarit","description":"Parallel Collectors is a toolkit easing parallel collection processing in Java using Stream API.","archived":false,"fork":false,"pushed_at":"2026-04-25T05:21:11.000Z","size":6484,"stargazers_count":674,"open_issues_count":7,"forks_count":63,"subscribers_count":22,"default_branch":"main","last_synced_at":"2026-04-25T07:23:32.235Z","etag":null,"topics":["hacktoberfest","parallel-streams","parallelism","stream-api","virtual-threads"],"latest_commit_sha":null,"homepage":"https://pcollectors.pivovarit.com/","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/pivovarit.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.MD","contributing":"CONTRIBUTING.md","funding":".github/FUNDING.yml","license":"LICENSE.md","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":".github/CODEOWNERS","security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null},"funding":{"github":"pivovarit","buy_me_a_coffee":"pivovarit"}},"created_at":"2019-01-20T12:46:03.000Z","updated_at":"2026-04-25T05:21:14.000Z","dependencies_parsed_at":"2024-01-05T21:25:18.992Z","dependency_job_id":"52178f49-f2d8-4c7d-bd94-283b3dd8ed69","html_url":"https://github.com/pivovarit/parallel-collectors","commit_stats":{"total_commits":936,"total_committers":7,"mean_commits":"133.71428571428572","dds":0.3023504273504274,"last_synced_commit":"b4e8283d26bbd70aaf40a3b609dacd55fa052e10"},"previous_names"
:[],"tags_count":69,"template":false,"template_full_name":null,"purl":"pkg:github/pivovarit/parallel-collectors","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pivovarit%2Fparallel-collectors","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pivovarit%2Fparallel-collectors/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pivovarit%2Fparallel-collectors/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pivovarit%2Fparallel-collectors/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/pivovarit","download_url":"https://codeload.github.com/pivovarit/parallel-collectors/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pivovarit%2Fparallel-collectors/sbom","scorecard":{"id":708850,"data":{"date":"2025-08-18","repo":{"name":"github.com/pivovarit/parallel-collectors","commit":"a67ca154ef81eeec667c1e7bea2cf8d5b3cf6a66"},"scorecard":{"version":"v5.2.1-41-g40576783","commit":"40576783fda6698350fcbbeaea760ff827433034"},"score":5.2,"checks":[{"name":"Code-Review","score":0,"reason":"Found 0/15 approved changesets -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project requires human code review before pull requests (aka merge requests) are merged.","url":"https://github.com/ossf/scorecard/blob/40576783fda6698350fcbbeaea760ff827433034/docs/checks.md#code-review"}},{"name":"Maintained","score":10,"reason":"30 commit(s) and 0 issue activity found in the last 90 days -- score normalized to 10","details":null,"documentation":{"short":"Determines if the project is \"actively maintained\".","url":"https://github.com/ossf/scorecard/blob/40576783fda6698350fcbbeaea760ff827433034/docs/checks.md#maintained"}},{"name":"Dangerous-Workflow","score":10,"reason":"no dangerous workflow patterns detected","details":null,"documentation":{"short":"Determines if the 
project's GitHub Action workflows avoid dangerous patterns.","url":"https://github.com/ossf/scorecard/blob/40576783fda6698350fcbbeaea760ff827433034/docs/checks.md#dangerous-workflow"}},{"name":"Binary-Artifacts","score":10,"reason":"no binaries found in the repo","details":null,"documentation":{"short":"Determines if the project has generated executable (binary) artifacts in the source repository.","url":"https://github.com/ossf/scorecard/blob/40576783fda6698350fcbbeaea760ff827433034/docs/checks.md#binary-artifacts"}},{"name":"Token-Permissions","score":0,"reason":"detected GitHub workflow tokens with excessive permissions","details":["Warn: no topLevel permission defined: .github/workflows/build.yml:1","Warn: no topLevel permission defined: .github/workflows/pitest.yml:1","Warn: no topLevel permission defined: .github/workflows/release.yml:1","Info: no jobLevel write permissions found"],"documentation":{"short":"Determines if the project's workflows follow the principle of least privilege.","url":"https://github.com/ossf/scorecard/blob/40576783fda6698350fcbbeaea760ff827433034/docs/checks.md#token-permissions"}},{"name":"CII-Best-Practices","score":0,"reason":"no effort to earn an OpenSSF best practices badge detected","details":null,"documentation":{"short":"Determines if the project has an OpenSSF (formerly CII) Best Practices Badge.","url":"https://github.com/ossf/scorecard/blob/40576783fda6698350fcbbeaea760ff827433034/docs/checks.md#cii-best-practices"}},{"name":"Security-Policy","score":0,"reason":"security policy file not detected","details":["Warn: no security policy file detected","Warn: no security file to analyze","Warn: no security file to analyze","Warn: no security file to analyze"],"documentation":{"short":"Determines if the project has published a security policy.","url":"https://github.com/ossf/scorecard/blob/40576783fda6698350fcbbeaea760ff827433034/docs/checks.md#security-policy"}},{"name":"License","score":10,"reason":"license file 
detected","details":["Info: project has a license file: LICENSE.md:0","Info: FSF or OSI recognized license: Apache License 2.0: LICENSE.md:0"],"documentation":{"short":"Determines if the project has defined a license.","url":"https://github.com/ossf/scorecard/blob/40576783fda6698350fcbbeaea760ff827433034/docs/checks.md#license"}},{"name":"Fuzzing","score":0,"reason":"project is not fuzzed","details":["Warn: no fuzzer integrations found"],"documentation":{"short":"Determines if the project uses fuzzing.","url":"https://github.com/ossf/scorecard/blob/40576783fda6698350fcbbeaea760ff827433034/docs/checks.md#fuzzing"}},{"name":"Pinned-Dependencies","score":0,"reason":"dependency not pinned by hash detected -- score normalized to 0","details":["Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/build.yml:22: update your workflow using https://app.stepsecurity.io/secureworkflow/pivovarit/parallel-collectors/build.yml/master?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/build.yml:24: update your workflow using https://app.stepsecurity.io/secureworkflow/pivovarit/parallel-collectors/build.yml/master?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/build.yml:38: update your workflow using https://app.stepsecurity.io/secureworkflow/pivovarit/parallel-collectors/build.yml/master?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/build.yml:40: update your workflow using https://app.stepsecurity.io/secureworkflow/pivovarit/parallel-collectors/build.yml/master?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/build.yml:56: update your workflow using https://app.stepsecurity.io/secureworkflow/pivovarit/parallel-collectors/build.yml/master?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/build.yml:58: update your workflow using 
https://app.stepsecurity.io/secureworkflow/pivovarit/parallel-collectors/build.yml/master?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/pitest.yml:21: update your workflow using https://app.stepsecurity.io/secureworkflow/pivovarit/parallel-collectors/pitest.yml/master?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/pitest.yml:23: update your workflow using https://app.stepsecurity.io/secureworkflow/pivovarit/parallel-collectors/pitest.yml/master?enable=pin","Warn: third-party GitHubAction not pinned by hash: .github/workflows/pitest.yml:33: update your workflow using https://app.stepsecurity.io/secureworkflow/pivovarit/parallel-collectors/pitest.yml/master?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/release.yml:19: update your workflow using https://app.stepsecurity.io/secureworkflow/pivovarit/parallel-collectors/release.yml/master?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/release.yml:24: update your workflow using https://app.stepsecurity.io/secureworkflow/pivovarit/parallel-collectors/release.yml/master?enable=pin","Info:   0 out of  10 GitHub-owned GitHubAction dependencies pinned","Info:   0 out of   1 third-party GitHubAction dependencies pinned"],"documentation":{"short":"Determines if the project has declared and pinned the dependencies of its build process.","url":"https://github.com/ossf/scorecard/blob/40576783fda6698350fcbbeaea760ff827433034/docs/checks.md#pinned-dependencies"}},{"name":"Signed-Releases","score":-1,"reason":"no releases found","details":null,"documentation":{"short":"Determines if the project cryptographically signs release artifacts.","url":"https://github.com/ossf/scorecard/blob/40576783fda6698350fcbbeaea760ff827433034/docs/checks.md#signed-releases"}},{"name":"Branch-Protection","score":-1,"reason":"internal error: error during branchesHandler.setup: internal error: githubv4.Query: 
Resource not accessible by integration","details":null,"documentation":{"short":"Determines if the default and release branches are protected with GitHub's branch protection settings.","url":"https://github.com/ossf/scorecard/blob/40576783fda6698350fcbbeaea760ff827433034/docs/checks.md#branch-protection"}},{"name":"Packaging","score":10,"reason":"packaging workflow detected","details":["Info: Project packages its releases by way of GitHub Actions.: .github/workflows/build.yml:50"],"documentation":{"short":"Determines if the project is published as a package that others can easily download, install, easily update, and uninstall.","url":"https://github.com/ossf/scorecard/blob/40576783fda6698350fcbbeaea760ff827433034/docs/checks.md#packaging"}},{"name":"Vulnerabilities","score":10,"reason":"0 existing vulnerabilities detected","details":null,"documentation":{"short":"Determines if the project has open, known unfixed vulnerabilities.","url":"https://github.com/ossf/scorecard/blob/40576783fda6698350fcbbeaea760ff827433034/docs/checks.md#vulnerabilities"}},{"name":"SAST","score":0,"reason":"SAST tool is not run on all commits -- score normalized to 0","details":["Warn: 0 commits out of 30 are checked with a SAST tool"],"documentation":{"short":"Determines if the project uses static code 
analysis.","url":"https://github.com/ossf/scorecard/blob/40576783fda6698350fcbbeaea760ff827433034/docs/checks.md#sast"}}]},"last_synced_at":"2025-08-22T07:33:09.573Z","repository_id":34088885,"created_at":"2025-08-22T07:33:09.574Z","updated_at":"2025-08-22T07:33:09.574Z"},"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32487746,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-30T13:12:12.517Z","status":"online","status_checked_at":"2026-05-01T02:00:05.856Z","response_time":64,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["hacktoberfest","parallel-streams","parallelism","stream-api","virtual-threads"],"created_at":"2024-08-02T17:01:35.907Z","updated_at":"2026-05-01T07:05:07.682Z","avatar_url":"https://github.com/pivovarit.png","language":"Java","funding_links":["https://github.com/sponsors/pivovarit","https://buymeacoffee.com/pivovarit"],"categories":["Java","Uncategorized","Projects"],"sub_categories":["Uncategorized","Functional Programming"],"readme":"# Java Stream API Virtual-Threads-enabled Parallel Collectors\nOvercoming limitations of standard Parallel 
Streams\n\n[![ci](https://github.com/pivovarit/parallel-collectors/actions/workflows/ci.yml/badge.svg?branch=main)](https://github.com/pivovarit/parallel-collectors/actions/workflows/ci.yml)\n[![pitest](https://github.com/pivovarit/parallel-collectors/actions/workflows/pitest.yml/badge.svg?branch=main)](http://pivovarit.github.io/parallel-collectors/pitest)\n[![Maven Central Version](https://img.shields.io/maven-central/v/com.pivovarit/parallel-collectors)](https://central.sonatype.com/artifact/com.pivovarit/parallel-collectors/versions)\n[![javadoc](https://javadoc.io/badge2/com.pivovarit/parallel-collectors/4.0.0/javadoc.svg)](https://javadoc.io/doc/com.pivovarit/parallel-collectors/4.0.0)\n\n![](docs/pc.png)\n\n[![Stargazers over time](https://starchart.cc/pivovarit/parallel-collectors.svg?variant=adaptive)](https://starchart.cc/pivovarit/parallel-collectors)\n\nParallel Collectors is a toolkit that eases parallel collection processing in Java using the Stream API without the limitations imposed by standard Parallel Streams.\n\n    list.stream()\n      .collect(parallel(i -\u003e blockingOp(i), toList()))\n        .orTimeout(1000, MILLISECONDS)\n        .thenAcceptAsync(System.out::println, executor)\n        .thenRun(() -\u003e System.out.println(\"Finished!\"));\n      \nThey are:\n- lightweight, defaulting to Virtual Threads (an alternative to Project Reactor for scenarios where a lighter solution is preferred)\n- powerful (the combined power of Stream API and `CompletableFuture`s, allowing for timeout specification, composition with other `CompletableFuture`s, and asynchronous processing)\n- configurable (flexibility with customizable `Executor`s and _parallelism_ levels)\n- non-blocking (eliminates the need to block the calling thread while awaiting results)\n- short-circuiting (if one of the operations raises an exception, the remaining tasks will get interrupted)  \n- non-invasive (they are just custom implementations of `Collector` interface, no magic 
inside, zero-dependencies, no Stream API internals hacking)\n- versatile (enables easy integration with existing Stream API `Collectors`)\n\n### Used by\n\n- [Jenkins JUnit Plugin](https://github.com/jenkinsci/junit-plugin) — the official Jenkins plugin for publishing JUnit test results\n- [LinkedIn Avro Util](https://github.com/linkedin/avro-util) — LinkedIn's utilities for working across multiple Apache Avro versions\n\n### Maven Dependencies\n\n#### JDK 21+:\n\n    \u003cdependency\u003e\n        \u003cgroupId\u003ecom.pivovarit\u003c/groupId\u003e\n        \u003cartifactId\u003eparallel-collectors\u003c/artifactId\u003e\n        \u003cversion\u003e4.0.0\u003c/version\u003e\n    \u003c/dependency\u003e\n\n#### JDK 8+:\n\n    \u003cdependency\u003e\n        \u003cgroupId\u003ecom.pivovarit\u003c/groupId\u003e\n        \u003cartifactId\u003eparallel-collectors\u003c/artifactId\u003e\n        \u003cversion\u003e2.6.1\u003c/version\u003e\n    \u003c/dependency\u003e\n\n##### Gradle\n\n#### JDK 21+:\n\n    implementation 'com.pivovarit:parallel-collectors:4.0.0'\n\n#### JDK 8+:\n\n    implementation 'com.pivovarit:parallel-collectors:2.6.1'\n\n## Philosophy\n\nParallel Collectors are intentionally unopinionated, leaving responsibility to users for:\n\n- Proper configuration of provided `Executor`s and their lifecycle management\n- Choosing appropriate parallelism levels\n- Ensuring the tool is applied in the right context\n\nReview the API documentation before deploying in production.\n\n## Why This Exists?\n\nThe goal is to use the Stream API without inheriting the limitations of parallel streams, especially for I/O-heavy or structured workloads.\n\nJava's built-in parallelization story is geared toward CPU-bound workloads - `parallelStream()` runs everything on the shared ForkJoinPool, which makes it a poor fit for blocking I/O, remote calls, database access, or anything that can stall a worker thread. 
Once that pool is saturated, everything else using it slows down as well.\n\nThis library fills that gap. It keeps the Stream API model but replaces the execution strategy:\n- user-provided executors instead of the common pool\n- virtual-thread defaults for low-overhead concurrency\n- classification and batching for further scheduling fine-tuning\n- `CompletableFuture` integration so you can work asynchronously and apply timeouts, callbacks, or composition naturally\n\n## Basic API\n\nThe main entry point is the `com.pivovarit.collectors.ParallelCollectors` class - which follows the convention established by `java.util.stream.Collectors` and features static factory methods returning custom `java.util.stream.Collector` implementations spiced up with parallel processing capabilities.\n\nBy default, collectors use Virtual Threads, but you can optionally provide a custom `Executor` instance for more control. When using a custom `Executor`, you are responsible for its lifecycle management.\n\nAll parallel collectors are one-off and must not be reused.\n\n**Important:** `parallel(mapper)` returns `CompletableFuture\u003cStream\u003cR\u003e\u003e`, not `CompletableFuture\u003cList\u003cR\u003e\u003e`. If you want a `List`, pass a downstream collector explicitly: `parallel(mapper, toList())`. The same applies to `parallelBy`.\n\n## Choosing the Right Collector\n\n```mermaid\nflowchart TD\n\nA[Are you ok blocking the caller thread while waiting for processing to finish?] 
--\u003e|No| B[Use ParallelCollectors.parallel]\nA --\u003e|Yes| C{Does the order of elements matter?}\n\nC --\u003e|Yes| D[\"Use ParallelCollectors.parallelToStream with c -\u003e c.ordered()\"]\nC --\u003e|No| E[Use ParallelCollectors.parallelToStream]\n```\n\n`ParallelCollectors.parallel` family returns `CompletableFuture` while `ParallelCollectors.parallelToStream` family returns `Stream`.\n\nAdditionally, you can customize:\n- a custom `Executor` (defaults to Virtual Threads)\n- a custom parallelism level\n- batching via the `batching()` configurer option\n- grouping by key via `parallelBy(...)` / `parallelToStreamBy(...)` methods\n- ordered streaming via the `ordered()` configurer option (streaming collectors only)\n- a custom downstream `Collector` (`ParallelCollectors.parallel` only)\n- executor decoration via `executorDecorator()` to wrap the resolved executor\n- task decoration via `taskDecorator()` to wrap each individual task\n\nAll configuration is done via the `CollectingConfigurer` (for `parallel`/`parallelBy`) or `StreamingConfigurer` (for `parallelToStream`/`parallelToStreamBy`) passed as a `Consumer`:\n\n    list.stream()\n      .collect(parallel(i -\u003e foo(i), c -\u003e c\n        .executor(executor)\n        .parallelism(4)\n        .batching(),\n      toList()));\n\n#### Batching Collectors\nWhen you use non-batching parallel collectors, **every input element is turned into an individual task** submitted to an `ExecutorService`. If you have 1000 elements, you end up submitting 1000 tasks.\nEven if you only have two threads processing them, both threads hammer the same task queue, repeatedly competing for the next piece of work. That competition creates contention, and overall overhead.\n\nThis behaviour resembles a primitive form of **work-stealing**, where each worker repeatedly tries to grab the next available task. 
**Work-stealing is great in scenarios where task durations vary significantly**, since it keeps faster workers busy, **but it's not free**.\n\nHowever, if the processing time for all subtasks is similar, it might be better to distribute tasks in batches to avoid excessive contention.\n\nWithout batching:\n\n```\nThread 1: [] [] [] [] [] [] [] [] [] [] [] ... (500 tiny tasks)\nThread 2: [] [] [] [] [] [] [] [] [] [] [] ... (500 tiny tasks)\n```\n\nWith batching:\n```\nThread 1: [--------------------------------------------------] (1 large task)\nThread 2: [--------------------------------------------------] (1 large task)\n```\n\nThe difference in performance for lightweight tasks can be enormous:\n\n```plain\nBenchmark                              Mode  Cnt      Score     Error  Units\nBatchedVsNonBatchedBenchmark.batch    thrpt    5  41558.548 ± 959.057  ops/s\nBatchedVsNonBatchedBenchmark.normal   thrpt    5    254.869 ±   5.667  ops/s\n```\n\nBatching can be enabled via the `batching()` configurer option:\n\n    list.stream()\n      .collect(parallel(i -\u003e foo(i), c -\u003e c.parallelism(4).batching(), toList()));\n\n#### Normal\n\n![](docs/flamegraph_normal.png)\n\n#### Batched\n\n![](docs/flamegraph_batched.png)\n\n#### Grouping Collectors\n\nThe `parallelBy(...)` and `parallelToStreamBy(...)` methods allow you to classify input elements by a key and process each group in parallel. 
Each group is guaranteed to be processed on a single thread, and results are returned as `Group\u003cK, R\u003e` entries:\n\n    CompletableFuture\u003cStream\u003cGroup\u003cString, String\u003e\u003e\u003e result = tasks.stream()\n      .collect(parallelBy(Task::groupId, t -\u003e compute(t)));\n\n    CompletableFuture\u003cList\u003cGroup\u003cString, String\u003e\u003e\u003e result = tasks.stream()\n      .collect(parallelBy(Task::groupId, t -\u003e compute(t), toList()));\n\nThe `Group\u003cK, V\u003e` record provides `key()` and `values()` accessors, plus a `map()` method for transforming values while preserving the grouping key.\n\n#### Decorators\n\nTwo decorator options let you add cross-cutting behavior without replacing the executor:\n\n**`executorDecorator(UnaryOperator\u003cExecutor\u003e)`** wraps the resolved executor (the virtual-thread default or a custom one) and returns a replacement. It is invoked once per collector, before any tasks are submitted. This is a natural fit for intercepting every `execute()` call, for example to plug in a monitoring layer.\n\nThe returned executor must not drop or discard tasks — doing so will cause the collector to wait indefinitely for results that will never arrive.\n\n    list.stream()\n      .collect(parallel(i -\u003e foo(i), c -\u003e c\n        .executorDecorator(exec -\u003e task -\u003e {\n            metrics.incrementAndGet();\n            exec.execute(task);\n        }),\n      toList()));\n\n**`taskDecorator(UnaryOperator\u003cRunnable\u003e)`** wraps each individual task before it is handed to the executor. Unlike the executor decorator, it runs on the worker thread and is re-applied for every element. 
This makes it the right tool for propagating thread-local context (MDC, OpenTelemetry spans, `SecurityContext`) into worker threads:\n\n    var snapshot = MDC.getCopyOfContextMap();\n\n    list.stream()\n      .collect(parallel(i -\u003e foo(i), c -\u003e c\n        .taskDecorator(task -\u003e () -\u003e {\n            MDC.setContextMap(snapshot);\n            try {\n                task.run();\n            } finally {\n                MDC.clear();\n            }\n        }),\n      toList()));\n\nBoth decorators can be combined and each may be specified at most once per configurer.\n\n### Leveraging CompletableFuture\n\nParallel Collectors expose results wrapped in `CompletableFuture` instances, which provides great flexibility and the possibility of working with them in a non-blocking fashion:\n\n    CompletableFuture\u003cList\u003cString\u003e\u003e result = list.stream()\n      .collect(parallel(i -\u003e foo(i), toList()));\n\nThis makes it possible to conveniently apply callbacks and compose with other `CompletableFuture`s:\n\n    list.stream()\n      .collect(parallel(i -\u003e foo(i), toSet()))\n      .thenAcceptAsync(System.out::println, otherExecutor)\n      .thenRun(() -\u003e System.out.println(\"Finished!\"));\n\nOr just `join()` if you just want to block the calling thread and wait for the result:\n\n    List\u003cString\u003e result = list.stream()\n      .collect(parallel(i -\u003e foo(i), toList()))\n      .join();\n\nWhat's more, since JDK9, [you can even provide your own timeout easily](https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/util/concurrent/CompletableFuture.html#orTimeout(long,java.util.concurrent.TimeUnit)).\n      \n## Examples\n\n##### 1. Apply `i -\u003e foo(i)` in parallel using Virtual Threads and collect to `List`\n\n    CompletableFuture\u003cList\u003cString\u003e\u003e result = list.stream()\n      .collect(parallel(i -\u003e foo(i), toList()));\n\n##### 2. 
Apply `i -\u003e foo(i)` in parallel on a custom `Executor` with max parallelism of 4 and collect to `Set`\n\n    Executor executor = ...\n\n    CompletableFuture\u003cSet\u003cString\u003e\u003e result = list.stream()\n      .collect(parallel(i -\u003e foo(i), c -\u003e c\n        .executor(executor)\n        .parallelism(4),\n      toSet()));\n\n##### 3. Apply `i -\u003e foo(i)` in parallel with batching and collect to `LinkedList`\n\n    CompletableFuture\u003cList\u003cString\u003e\u003e result = list.stream()\n      .collect(parallel(i -\u003e foo(i), c -\u003e c.parallelism(4).batching(),\n        toCollection(LinkedList::new)));\n\n##### 4. Apply `i -\u003e foo(i)` in parallel and stream results in completion order\n\n    list.stream()\n      .collect(parallelToStream(i -\u003e foo(i)))\n      .forEach(i -\u003e ...);\n\n##### 5. Apply `i -\u003e foo(i)` in parallel and stream results in the original order\n\n    list.stream()\n      .collect(parallelToStream(i -\u003e foo(i), c -\u003e c.ordered()))\n      .forEach(i -\u003e ...);\n\n##### 6. Classify and process elements in parallel by group\n\n    CompletableFuture\u003cStream\u003cGroup\u003cString, String\u003e\u003e\u003e result = tasks.stream()\n      .collect(parallelBy(Task::groupId, t -\u003e compute(t)));\n\n##### 7. Apply `i -\u003e foo(i)` in parallel with full configuration\n\n    Executor executor = ...\n\n    CompletableFuture\u003cList\u003cString\u003e\u003e result = list.stream()\n      .collect(parallel(i -\u003e foo(i), c -\u003e c\n        .executor(executor)\n        .parallelism(64)\n        .batching(),\n      toList()));\n\n##### 8. 
Propagate MDC context into worker threads via `taskDecorator`\n\n    var snapshot = MDC.getCopyOfContextMap();\n\n    CompletableFuture\u003cList\u003cString\u003e\u003e result = list.stream()\n      .collect(parallel(i -\u003e foo(i), c -\u003e c\n        .taskDecorator(task -\u003e () -\u003e {\n            MDC.setContextMap(snapshot);\n            try {\n                task.run();\n            } finally {\n                MDC.clear();\n            }\n        }),\n      toList()));\n\n##### 9. Instrument every task submission via `executorDecorator`\n\n    var submitted = new AtomicInteger();\n\n    CompletableFuture\u003cList\u003cString\u003e\u003e result = list.stream()\n      .collect(parallel(i -\u003e foo(i), c -\u003e c\n        .executorDecorator(exec -\u003e task -\u003e {\n            submitted.incrementAndGet();\n            exec.execute(task);\n        }),\n      toList()));\n\n## Rationale\n\nStream API is a great tool for collection processing, especially if you need to parallelize the execution of CPU-intensive tasks, for example:\n\n    public static void parallelSetAll(int[] array, IntUnaryOperator generator) {\n        Objects.requireNonNull(generator);\n        IntStream.range(0, array.length).parallel().forEach(i -\u003e { array[i] = generator.applyAsInt(i); });\n    }\n    \n**However, Parallel Streams execute tasks on a shared `ForkJoinPool` instance**.\n \nUnfortunately, it's not the best choice for running blocking operations even when using `ManagedBlocker` - [as explained here by Tagir Valeev](https://stackoverflow.com/a/37518272/2229438) - this could easily lead to the saturation of the common pool, and to a performance degradation of everything that uses it.\n\nFor example:\n\n    List\u003cString\u003e result = list.parallelStream()\n      .map(i -\u003e foo(i)) // runs implicitly on ForkJoinPool.commonPool()\n      .toList();\n\nTo avoid such problems, **the solution is to isolate blocking tasks** and run them on a separate thread 
pool... but there's a catch.\n\n**Sadly, Streams can only run parallel computations on the common `ForkJoinPool`**, which effectively restricts their applicability to CPU-bound jobs.\n\nHowever, there's a trick that allows running parallel Streams in a custom `ForkJoinPool` instance... but it's not considered reliable (and can still induce oversubscription issues while competing with the common pool for resources).\n\n\u003e Note, however, that this technique of submitting a task to a fork-join pool to run the parallel stream in that pool is an implementation \"trick\" and is not guaranteed to work. Indeed, the threads or thread pool that is used for the execution of parallel streams is unspecified. By default, the common fork-join pool is used, but in different environments, different thread pools might end up being used.\n\nSays [Stuart Marks on StackOverflow](https://stackoverflow.com/questions/28985704/parallel-stream-from-a-hashset-doesnt-run-in-parallel/29272776#29272776).\n\nNot to mention that this approach was seriously flawed before JDK 10 - if a `Stream` was targeted at another pool, splitting would still adhere to the parallelism of the common pool rather than that of the targeted pool [[JDK-8190974]](https://bugs.openjdk.java.net/browse/JDK-8190974).\n\n### Dependencies\n\nNone - the library is implemented using core Java libraries.\n\n### Limitations\n\n- The upstream `Stream` is always evaluated as a whole, even if the following operation is short-circuiting.\nThis means that none of these collectors should be used with infinite streams. 
The design of the `Collector` API imposes this limitation.\n\n- Never use Parallel Collectors with an `Executor` whose `RejectedExecutionHandler` discards tasks - this might result in a deadlock.\n\n### Good Practices\n\n- Consider providing reasonable timeouts for `CompletableFuture`s so that you don't block for an unreasonably long time when something goes wrong [(how-to)](https://docs.oracle.com/en/java/javase/11/docs/api/java.base/java/util/concurrent/CompletableFuture.html#orTimeout(long,java.util.concurrent.TimeUnit))\n- Name your thread pools - it makes debugging easier\n- Limit the size of the work queue of your thread pool [(source)](https://mechanical-sympathy.blogspot.com/2012/05/apply-back-pressure-when-overloaded.html)\n- Limit the level of parallelism [(source)](https://mechanical-sympathy.blogspot.com/2012/05/apply-back-pressure-when-overloaded.html)\n- A no-longer-used `ExecutorService` should be shut down to allow reclamation of its resources\n- Keep in mind that `CompletableFuture#then(Apply|Combine|Compose|Run|Accept)` might be executed by the calling thread. If this is not suitable, use `CompletableFuture#then(Apply|Combine|Compose|Run|Accept)Async` instead, and provide a custom _Executor_ instance.\n\n## Words of Caution\n\nWhile Parallel Collectors and Virtual Threads make parallelization easy, that doesn't always mean parallelization is the best choice. Platform threads are resource-intensive, and parallelism comes with a cost.\n\nBefore opting for parallel processing, consider addressing the root cause through alternatives like DB-level JOIN statements, batching, data reorganization, or... 
simply selecting a more suitable API method.\n\n----\nSee [CHANGELOG.MD](https://github.com/pivovarit/parallel-collectors/blob/main/CHANGELOG.MD) for a complete version history.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpivovarit%2Fparallel-collectors","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpivovarit%2Fparallel-collectors","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpivovarit%2Fparallel-collectors/lists"}