{"id":36694178,"url":"https://github.com/noleme/noleme-flow","last_synced_at":"2026-01-12T11:26:19.825Z","repository":{"id":37982730,"uuid":"351253606","full_name":"noleme/noleme-flow","owner":"noleme","description":"A library enabling DAG structuring of data processing programs such as ETLs","archived":false,"fork":false,"pushed_at":"2025-12-13T13:49:56.000Z","size":333,"stargazers_count":17,"open_issues_count":12,"forks_count":4,"subscribers_count":3,"default_branch":"master","last_synced_at":"2025-12-15T06:08:02.120Z","etag":null,"topics":["dag","dataflow","etl-pipeline","java","workflow","workflow-engine"],"latest_commit_sha":null,"homepage":"","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/noleme.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2021-03-24T23:41:01.000Z","updated_at":"2025-12-13T13:50:00.000Z","dependencies_parsed_at":"2023-02-10T07:01:45.724Z","dependency_job_id":"4d67462d-243f-4093-9bc4-646150bc1ff6","html_url":"https://github.com/noleme/noleme-flow","commit_stats":null,"previous_names":[],"tags_count":12,"template":false,"template_full_name":null,"purl":"pkg:github/noleme/noleme-flow","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/noleme%2Fnoleme-flow","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/noleme%2Fnoleme-flow/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/noleme%2Fnoleme-flow/releases","manifests
_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/noleme%2Fnoleme-flow/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/noleme","download_url":"https://codeload.github.com/noleme/noleme-flow/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/noleme%2Fnoleme-flow/sbom","scorecard":{"id":693474,"data":{"date":"2025-08-11","repo":{"name":"github.com/noleme/noleme-flow","commit":"b745d8e2917f78280a3fd499c7619f3ab91d6991"},"scorecard":{"version":"v5.2.1-40-gf6ed084d","commit":"f6ed084d17c9236477efd66e5b258b9d4cc7b389"},"score":4.7,"checks":[{"name":"Dangerous-Workflow","score":10,"reason":"no dangerous workflow patterns detected","details":null,"documentation":{"short":"Determines if the project's GitHub Action workflows avoid dangerous patterns.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#dangerous-workflow"}},{"name":"Maintained","score":0,"reason":"0 commit(s) and 0 issue activity found in the last 90 days -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project is \"actively maintained\".","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#maintained"}},{"name":"Code-Review","score":0,"reason":"Found 0/3 approved changesets -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project requires human code review before pull requests (aka merge requests) are merged.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#code-review"}},{"name":"Token-Permissions","score":0,"reason":"detected GitHub workflow tokens with excessive permissions","details":["Info: jobLevel 'actions' permission set to 'read': .github/workflows/codeql.yml:16","Info: jobLevel 'contents' permission set to 'read': .github/workflows/codeql.yml:17","Warn: no topLevel permission defined: 
.github/workflows/codeql.yml:1","Warn: no topLevel permission defined: .github/workflows/maven-build.yml:1","Warn: no topLevel permission defined: .github/workflows/maven-checkstyle.yml:1","Warn: no topLevel permission defined: .github/workflows/maven-code-coverage.yml:1","Warn: no topLevel permission defined: .github/workflows/maven-publish.yml:1","Info: no jobLevel write permissions found"],"documentation":{"short":"Determines if the project's workflows follow the principle of least privilege.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#token-permissions"}},{"name":"Binary-Artifacts","score":10,"reason":"no binaries found in the repo","details":null,"documentation":{"short":"Determines if the project has generated executable (binary) artifacts in the source repository.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#binary-artifacts"}},{"name":"Pinned-Dependencies","score":0,"reason":"dependency not pinned by hash detected -- score normalized to 0","details":["Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/codeql.yml:29: update your workflow using https://app.stepsecurity.io/secureworkflow/noleme/noleme-flow/codeql.yml/master?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/codeql.yml:31: update your workflow using https://app.stepsecurity.io/secureworkflow/noleme/noleme-flow/codeql.yml/master?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/codeql.yml:37: update your workflow using https://app.stepsecurity.io/secureworkflow/noleme/noleme-flow/codeql.yml/master?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/codeql.yml:51: update your workflow using https://app.stepsecurity.io/secureworkflow/noleme/noleme-flow/codeql.yml/master?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/codeql.yml:64: 
update your workflow using https://app.stepsecurity.io/secureworkflow/noleme/noleme-flow/codeql.yml/master?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/maven-build.yml:21: update your workflow using https://app.stepsecurity.io/secureworkflow/noleme/noleme-flow/maven-build.yml/master?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/maven-build.yml:23: update your workflow using https://app.stepsecurity.io/secureworkflow/noleme/noleme-flow/maven-build.yml/master?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/maven-build.yml:30: update your workflow using https://app.stepsecurity.io/secureworkflow/noleme/noleme-flow/maven-build.yml/master?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/maven-checkstyle.yml:21: update your workflow using https://app.stepsecurity.io/secureworkflow/noleme/noleme-flow/maven-checkstyle.yml/master?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/maven-checkstyle.yml:23: update your workflow using https://app.stepsecurity.io/secureworkflow/noleme/noleme-flow/maven-checkstyle.yml/master?enable=pin","Warn: third-party GitHubAction not pinned by hash: .github/workflows/maven-checkstyle.yml:27: update your workflow using https://app.stepsecurity.io/secureworkflow/noleme/noleme-flow/maven-checkstyle.yml/master?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/maven-code-coverage.yml:21: update your workflow using https://app.stepsecurity.io/secureworkflow/noleme/noleme-flow/maven-code-coverage.yml/master?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/maven-code-coverage.yml:23: update your workflow using https://app.stepsecurity.io/secureworkflow/noleme/noleme-flow/maven-code-coverage.yml/master?enable=pin","Warn: third-party GitHubAction not pinned by hash: .github/workflows/maven-code-coverage.yml:29: 
update your workflow using https://app.stepsecurity.io/secureworkflow/noleme/noleme-flow/maven-code-coverage.yml/master?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/maven-publish.yml:9: update your workflow using https://app.stepsecurity.io/secureworkflow/noleme/noleme-flow/maven-publish.yml/master?enable=pin","Warn: GitHub-owned GitHubAction not pinned by hash: .github/workflows/maven-publish.yml:11: update your workflow using https://app.stepsecurity.io/secureworkflow/noleme/noleme-flow/maven-publish.yml/master?enable=pin","Info:   0 out of  14 GitHub-owned GitHubAction dependencies pinned","Info:   0 out of   2 third-party GitHubAction dependencies pinned"],"documentation":{"short":"Determines if the project has declared and pinned the dependencies of its build process.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#pinned-dependencies"}},{"name":"CII-Best-Practices","score":0,"reason":"no effort to earn an OpenSSF best practices badge detected","details":null,"documentation":{"short":"Determines if the project has an OpenSSF (formerly CII) Best Practices Badge.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#cii-best-practices"}},{"name":"Security-Policy","score":0,"reason":"security policy file not detected","details":["Warn: no security policy file detected","Warn: no security file to analyze","Warn: no security file to analyze","Warn: no security file to analyze"],"documentation":{"short":"Determines if the project has published a security policy.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#security-policy"}},{"name":"Fuzzing","score":0,"reason":"project is not fuzzed","details":["Warn: no fuzzer integrations found"],"documentation":{"short":"Determines if the project uses 
fuzzing.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#fuzzing"}},{"name":"License","score":10,"reason":"license file detected","details":["Info: project has a license file: LICENSE:0","Info: FSF or OSI recognized license: MIT License: LICENSE:0"],"documentation":{"short":"Determines if the project has defined a license.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#license"}},{"name":"Signed-Releases","score":-1,"reason":"no releases found","details":null,"documentation":{"short":"Determines if the project cryptographically signs release artifacts.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#signed-releases"}},{"name":"Packaging","score":10,"reason":"packaging workflow detected","details":["Info: Project packages its releases by way of GitHub Actions.: .github/workflows/maven-publish.yml:6"],"documentation":{"short":"Determines if the project is published as a package that others can easily download, install, easily update, and uninstall.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#packaging"}},{"name":"Branch-Protection","score":-1,"reason":"internal error: error during branchesHandler.setup: internal error: githubv4.Query: Resource not accessible by integration","details":null,"documentation":{"short":"Determines if the default and release branches are protected with GitHub's branch protection settings.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#branch-protection"}},{"name":"Vulnerabilities","score":10,"reason":"0 existing vulnerabilities detected","details":null,"documentation":{"short":"Determines if the project has open, known unfixed 
vulnerabilities.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#vulnerabilities"}},{"name":"SAST","score":8,"reason":"SAST tool detected but not run on all commits","details":["Info: SAST configuration detected: CodeQL","Warn: 16 commits out of 28 are checked with a SAST tool"],"documentation":{"short":"Determines if the project uses static code analysis.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#sast"}}]},"last_synced_at":"2025-08-22T02:54:40.636Z","repository_id":37982730,"created_at":"2025-08-22T02:54:40.636Z","updated_at":"2025-08-22T02:54:40.636Z"},"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28338971,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-12T10:58:46.209Z","status":"ssl_error","status_checked_at":"2026-01-12T10:58:42.742Z","response_time":98,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["dag","dataflow","etl-pipeline","java","workflow","workflow-engine"],"created_at":"2026-01-12T11:26:19.207Z","updated_at":"2026-01-12T11:26:19.818Z","avatar_url":"https://github.com/noleme.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Noleme Flow\n\n[![Maven 
Build](https://github.com/noleme/noleme-flow/actions/workflows/maven-build.yml/badge.svg?branch=master)](https://github.com/noleme/noleme-flow/actions/workflows/maven-build.yml)\n[![Maven Central Repository](https://maven-badges.herokuapp.com/maven-central/com.noleme/noleme-flow/badge.svg)](https://maven-badges.herokuapp.com/maven-central/com.noleme/noleme-flow)\n[![javadoc](https://javadoc.io/badge2/com.noleme/noleme-flow/javadoc.svg)](https://javadoc.io/doc/com.noleme/noleme-flow)\n[![coverage](https://codecov.io/gh/noleme/noleme-flow/branch/master/graph/badge.svg?token=Y9FD38RLDE)](https://codecov.io/gh/noleme/noleme-flow)\n![GitHub](https://img.shields.io/github/license/noleme/noleme-flow)\n[![FOSSA Status](https://app.fossa.com/api/projects/git%2Bgithub.com%2Fnoleme%2Fnoleme-flow.svg?type=shield)](https://app.fossa.com/projects/git%2Bgithub.com%2Fnoleme%2Fnoleme-flow?ref=badge_shield)\n\nThis library provides features enabling DAG structuring of data processing programs such as ETLs.\n\n_Note: This library is considered as \"in beta\" and as such significant API changes may occur without prior warning._\n\n_Note (2025-12): Looking back on this project, I remain convinced that there are quite a few \"nuggets of truth\" in this library's design ; I got bogged down when attempting to fix some of its core internals, but its composition model is still something I find myself longing for in most of my projects. I plan to restore it and work on its shortcomings in a hopefully not too far future, possibly by relying on concurrency APIs provided by newer JDKs, and most likely with a more marked focus on observability features_\n\n## I. Installation\n\nAdd the following in your `pom.xml`:\n\n```xml\n\u003cdependency\u003e\n    \u003cgroupId\u003ecom.noleme\u003c/groupId\u003e\n    \u003cartifactId\u003enoleme-flow\u003c/artifactId\u003e\n    \u003cversion\u003e0.18.1\u003c/version\u003e\n\u003c/dependency\u003e\n```\n\n## II. 
Notes on Structure and Design\n\nThe core idea behind this library is to structure the program as a DAG of \"actions\" to perform, then compile that graph into a concrete runnable instance whose properties depend on the implementation.\n\nThese actions can be of three different types, mirroring an ETL process:\n\n* `Extractor` for introducing data into the flow; these would typically be connectors for a variety of data sources\n* `Transformer` for manipulating data and returning an altered version of it, or new data inferred from the input\n* `Loader` for dumping data out of the flow\n\nAdditionally, there are two actions related to stream flows:\n\n* `Generator` for gradually introducing data into the flow; these can iterate through a `Collection`, read off an `InputStream` or generate data on the fly\n* `Accumulator` for doing the reverse operation; these accumulate all outputs from a stream flow and continue with a standard (i.e. non-stream) flow\n\nOnce a DAG has been defined, a `FlowCompiler` is responsible for transforming the DAG representation into a runnable instance, a `FlowRuntime`.\n\nAt the time of this writing, there are two available implementations:\n\n* the serial `PipelineRuntime`, which runs one node after another, making sure each one can satisfy its input\n* the parallel `ParallelRuntime`, which attempts to run in parallel any node that can satisfy its input\n\nOnce a `FlowRuntime` has been produced, we can simply `run` it.\n\n_TODO_\n\n## III. 
Usage\n\nFirst, let us start from the end and have a look at what it can look like \"in practice\".\n\nStarting with a CSV located at `data/my.csv` such that:\n\n```csv\nkey,value,metadata,flag\n0,234,interesting,false\n1,139,not_interesting,false\n3,982,interesting,true\n4,389,interesting,false\n5,093,not_interesting,false\n```\n\nBelow is a flow that leverages [tablesaw](https://github.com/jtablesaw/tablesaw) to parse this local CSV, perform some transformations, and dump the result back on the filesystem.\n\n```java\nvar flow = Flow\n    .from(new FileStreamer(), \"data/my.csv\")\n    .pipe(new TablesawCSVParser(tableProperties)) //tableProperties is a tablesaw-specific configuration class, don't mind it\n    .pipe(table -\u003e table.where(t -\u003e t.stringColumn(\"metadata\").isEqualTo(\"interesting\")))\n    .pipe(table -\u003e table.where(t -\u003e t.booleanColumn(\"flag\").isFalse()))\n    .sink(new TablesawCSVWrite(\"data/my-filtered.csv\"))\n;\n```\n\nThe overarching goal for `noleme-flow` is to have a simple yet flexible API that can enable both:\n* simple scenarios like this one, where ease of use and not being locked in by a heavy ecosystem are paramount: `noleme-flow` aims to remain first and foremost a lightweight library enabling quick drafts\n* intermediate and complex scenarios joining multiple data sources, where `noleme-flow`'s main goal shifts towards enabling better code reuse, with the help of [noleme-vault](https://github.com/noleme/noleme-vault) for configuration management, by making it easy to bundle flow sequences for reuse into larger flow graphs\n\nImplementations mentioned above can be found over at [noleme-flow-connectors](https://github.com/noleme/noleme-flow-connectors).\n\nGoing back, here is a very basic example of a pipeline we could create:\n\n```java\n/* We initialize a flow */\nvar flow = Flow\n    .from(() -\u003e 1)\n    .pipe(i -\u003e i + 1)\n    .pipe(i -\u003e i * 2)\n    .sink(System.out::println)\n;\n\n/* We run 
it as a Pipeline */\nFlow.runAsPipeline(flow);\n```\n\nWhich, upon running, should print `4`.\n\nAnother example:\n\n```java\n/* We initialize a flow */\nvar flow = Flow\n    .from(() -\u003e 2)\n    .pipe(i -\u003e i * 2)\n;\n\n/* We branch the flow in two branches */\nvar branchA = flow.pipe(i -\u003e i * i);\nvar branchB = flow.pipe(i -\u003e i * 5);\n\n/* We join the two branches and collect the end result */\nvar recipient = branchA\n    .join(branchB, Integer::sum)\n    .pipe(i -\u003e i * 2)\n    .collect()\n;\n\nvar output = Flow.runAsPipeline(flow);\n\nSystem.out.println(output.get(recipient));\n```\n\nUpon running, this should print `72` (`2*(((2*2)^2)+((2*2)*5))`).\n\nNow a final example with a stream flow going on:\n\n```java\n/* Let's have a \"standard\" flow doing its thing */\nvar branch = Flow\n    .from(() -\u003e 2)\n    .pipe(i -\u003e i + 1)\n;\n\n/* Create a \"stream\" flow from a list of integers */\nvar flow = Flow\n    .from(() -\u003e List.of(0, 1, 2, 3, 4, 5, 6, 7, 8, 9))\n    .stream(IterableGenerator::new)\n    .pipe(i -\u003e i * i)\n    .join(branch, (f, b) -\u003e f * b) /* All values in the main flow will be multiplied by the output from the branch flow */\n    .accumulate(values -\u003e values.stream() /* Once the generator is exhausted and all stream nodes have run, we gather the output integers and sum them; note that accumulation is optional (you could also end the stream with a sink) */\n        .reduce(Integer::sum)\n        .orElseThrow(() -\u003e new AccumulationException(\"Could not sum data.\"))\n    )\n    .pipe(i -\u003e i + 1) /* After the accumulation step, the flow is back to being a \"standard\" flow so we can queue further transformations */\n    .sink(System.out::println)\n;\n\nFlow.runAsPipeline(flow);\n```\n\nUpon running, this should print `856`.\n\nNote that `noleme-flow` itself doesn't provide any `Generator` implementation, but the `IterableGenerator` class mentioned above is part of the 
`noleme-flow-connect-commons` library ([over there](https://github.com/noleme/noleme-flow-connectors)).\n\nOther features that will need to be documented include:\n\n* the complete set of DAG building methods (including alternate flavours of `from`, `stream`, as well as `driftSink`, `after` and the generic `into`)\n* control-flow with partial DAG interruption (`interrupt` and `interruptIf`, `nonFatal` helpers)\n* runtime input management (dynamic `from` and the `Input` component) \n* runtime output management, sampling/collection features (`collect`, `sample` and the `Output` component)\n* stream flows and parallelization (`setMaxParallelism` and implementation-specific considerations)\n* flow slices (`SourceSlice`, `PipeSlice` and `SinkSlice`) for flow DAG fragments code reuse\n* `ParallelRuntime` service executor lifecycle and other considerations\n* DAG node naming for debugging purposes (appears in traces, logs)\n\n_TODO_\n\n## IV. Dev Installation\n\nThis project will require you to have the following:\n\n* Java 11+\n* Git (versioning)\n* Maven (dependency resolving, publishing and packaging) \n\n\n## License\n[![FOSSA Status](https://app.fossa.com/api/projects/git%2Bgithub.com%2Fnoleme%2Fnoleme-flow.svg?type=large)](https://app.fossa.com/projects/git%2Bgithub.com%2Fnoleme%2Fnoleme-flow?ref=badge_large)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnoleme%2Fnoleme-flow","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnoleme%2Fnoleme-flow","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnoleme%2Fnoleme-flow/lists"}