{"id":42309746,"url":"https://github.com/proddata/plugin-transform","last_synced_at":"2026-01-27T11:15:52.967Z","repository":{"id":330418833,"uuid":"1122128205","full_name":"proddata/plugin-transform","owner":"proddata","description":"Typed Map/Transform plugin for Kestra: declarative record mapping, casting, and derived fields with Ion support.","archived":false,"fork":false,"pushed_at":"2025-12-25T20:37:29.000Z","size":229,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-12-26T22:44:18.848Z","etag":null,"topics":["kestra"],"latest_commit_sha":null,"homepage":"","language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/proddata.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":"AGENTS.md","dco":null,"cla":null}},"created_at":"2025-12-24T06:34:38.000Z","updated_at":"2025-12-25T20:37:32.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/proddata/plugin-transform","commit_stats":null,"previous_names":["proddata/plugin-transform"],"tags_count":null,"template":false,"template_full_name":"kestra-io/plugin-template","purl":"pkg:github/proddata/plugin-transform","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/proddata%2Fplugin-transform","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/proddata%2Fplugin-transform/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/proddata%2Fplugin-transform/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/proddata%2Fplugin-transform/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/proddata","download_url":"https://codeload.github.com/proddata/plugin-transform/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/proddata%2Fplugin-transform/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28812372,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-27T07:41:26.337Z","status":"ssl_error","status_checked_at":"2026-01-27T07:41:08.776Z","response_time":168,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["kestra"],"created_at":"2026-01-27T11:15:52.293Z","updated_at":"2026-01-27T11:15:52.955Z","avatar_url":"https://github.com/proddata.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Kestra Transform Plugin\n\nTyped, streaming-friendly transform tasks for Kestra using Amazon Ion.\n\n## Tasks\n- `io.kestra.plugin.transform.Select`: align N inputs by position + optional filter + projection\n- `io.kestra.plugin.transform.Map`: project/rename/cast fields\n- `io.kestra.plugin.transform.Unnest`: explode array fields into rows\n- `io.kestra.plugin.transform.Filter`: keep/drop records by boolean expression\n- `io.kestra.plugin.transform.Aggregate`: group-by and typed aggregates\n- `io.kestra.plugin.transform.Zip`: merge multiple record streams by position\n\n## Input and output\nMost tasks accept:\n- `from`: in-memory records (map/list/Ion) or a storage URI (`kestra://...`)\n\nMulti-input tasks:\n- `Zip`: `inputs` (list)\n- `Select`: `inputs` (list)\n\nOutput mode (`output`) supports:\n- `AUTO` (default): STORE if `from` is a storage URI, else RECORDS\n- `RECORDS`: emit `outputs.records`\n- `STORE`: write newline-delimited Ion to internal storage and emit `outputs.uri`\n\nExperimental output format (`outputFormat`):\n- `TEXT` (default): newline-delimited Ion text\n- `BINARY`: Ion binary output (only readable by transform tasks; use TEXT as final step)\n\n## Expression language\n- Field access: `user.address.city` or `user[\"first name\"]`\n- Arrays: `items[].price`\n- Comparisons: `\u003e \u003c == != \u003e= \u003c=`\n- Boolean: `\u0026\u0026 || !`\n- Functions: `sum`, `count`, `min`, `max`, `avg`, `first`, `last`, `coalesce`, `concat`, `toInt`, `toDecimal`, `toString`, `toBoolean`, `parseTimestamp`\n\nCheat sheet: `docs/EXPRESSION_CHEATSHEET.md`\n\n## Type casting\n- `type` is optional in Map/Aggregate definitions.\n- If `type` is omitted, the evaluated Ion value is passed through.\n- `TIMESTAMP` supports Ion timestamps and ISO-8601 strings (`parseTimestamp` can convert strings).\n\n## Tasks (reference + examples)\n\n### Select\n```yaml\ntype: io.kestra.plugin.transform.Select\n```\nUse this when you want \"Zip + optional Filter + Map\" in one streaming operator: align inputs, filter rows, and project/cast output fields.\n\nCommon config:\n- `inputs`: list of inputs (1+)\n- `where`: optional boolean expression (supports `$1`, `$2`, ...)\n- `fields`: Map-style definitions (shorthand or `{ expr, type, optional }`)\n- `keepInputFields`: optional list of input indices to copy into output when `fields` is set (e.g. `[1]` keeps only `$1` fields)\n- `onLengthMismatch`: `FAIL | SKIP` (when inputs are different lengths)\n- `onError`: `FAIL | SKIP | KEEP` (KEEP emits the original merged row)\n\nExample: enrich orders with scores and output typed columns\n```yaml\n- id: select\n  type: io.kestra.plugin.transform.Select\n  inputs:\n    - \"{{ outputs.orders.values.records }}\"\n    - \"{{ outputs.scores.values.records }}\"\n  where: $1.amount \u003e 100 \u0026\u0026 $2.score \u003e 0.8\n  fields:\n    order_id:\n      expr: $1.order_id\n      type: INT\n    amount:\n      expr: $1.amount\n      type: DECIMAL\n    score:\n      expr: $2.score\n      type: DECIMAL\n  output: RECORDS\n```\n\n### Map\n```yaml\ntype: io.kestra.plugin.transform.Map\n```\nUse this when you want to normalize records into a typed schema: rename fields, compute derived fields, and cast values (without scripts).\n\nCommon config:\n- `fields`: shorthand `field: expr` or full `{ expr, type, optional }`\n- `keepOriginalFields`: keep input fields not mapped by target name\n- `dropNulls`: drop null fields from output\n- `onError`: `FAIL | SKIP | NULL` (NULL sets failing fields to null)\n\nExample: normalize API records into typed columns\n```yaml\n- id: normalize\n  type: io.kestra.plugin.transform.Map\n  from: \"{{ outputs.fetch.records }}\"\n\n  fields:\n    customer_id:\n      expr: user.id\n      type: STRING\n    created_at:\n      expr: createdAt\n      type: TIMESTAMP\n    total:\n      expr: sum(items[].price)\n      type: DECIMAL\n\n  keepOriginalFields: false\n  dropNulls: true\n  onError: SKIP\n```\nNote: `keepOriginalFields` keeps input fields not mapped by name; mapping `a_new: a` still keeps the original `a`.\n\n### Unnest\n```yaml\ntype: io.kestra.plugin.transform.Unnest\n```\nUse this when you want to explode an array field into multiple rows (one per element), similar to \"UNNEST\" in SQL.\n\nCommon config:\n- `path`: array path to explode (e.g. `items[]`)\n- `as`: field name that receives the element value\n- `keepOriginalFields`: keep original fields except the exploded array field\n\nExample: explode items into one row per item\n```yaml\n- id: explode_items\n  type: io.kestra.plugin.transform.Unnest\n  from: \"{{ outputs.fetch.records }}\"\n  path: items[]\n  as: item\n```\n\n### Filter\n```yaml\ntype: io.kestra.plugin.transform.Filter\n```\nUse this when you want to keep/drop records based on a boolean expression, like a SQL `WHERE`.\n\nCommon config:\n- `where`: boolean expression evaluated per record\n- `onError`: `FAIL | SKIP | KEEP` (KEEP keeps the record if `where` fails)\n\nExample: keep only expensive items\n```yaml\n- id: expensive_items\n  type: io.kestra.plugin.transform.Filter\n  from: \"{{ outputs.explode_items.records }}\"\n  where: item.price \u003e 10\n```\n\n### Aggregate\n```yaml\ntype: io.kestra.plugin.transform.Aggregate\n```\nUse this when you want typed group-by aggregates (count/sum/min/max) without exporting to a database.\n\nCommon config:\n- `groupBy`: list of fields that form the group key\n- `aggregates`: Map-style definitions `{ expr, type, optional }` (type optional)\n\nExample: compute per-customer totals\n```yaml\n- id: totals\n  type: io.kestra.plugin.transform.Aggregate\n  from: \"{{ outputs.normalize.records }}\"\n  groupBy:\n    - customer_id\n    - country\n  aggregates:\n    order_count:\n      expr: count()\n      type: INT\n    total_spent:\n      expr: sum(total_spent)\n      type: DECIMAL\n    last_order_at:\n      expr: max(created_at)\n      type: TIMESTAMP\n  onError: FAIL\n```\n\n### Zip\n```yaml\ntype: io.kestra.plugin.transform.Zip\n```\nUse this when you have multiple sources already aligned by row order and want to merge them positionally (record i with record i).\n\nCommon config:\n- `inputs`: list of inputs (2+)\n- `onConflict`: `FAIL | LEFT | RIGHT` when two inputs have the same field name\n\nExample: merge two record streams by row position\n```yaml\n- id: zip\n  type: io.kestra.plugin.transform.Zip\n  inputs:\n    - \"{{ outputs.left.values.records }}\"\n    - \"{{ outputs.right.values.records }}\"\n  onConflict: RIGHT\n```\n\n## Examples index\n- `examples/api_to_typed_records.yml`: normalize API output into typed fields\n- `examples/http_download_transform.yml`: download products, unnest, map, and store\n- `examples/dummyjson_products_flow.yml`: unnest products, filter, map\n- `examples/dummyjson_carts_flow.yml`: compute max product total per cart and filter\n- `examples/dummyjson_users_flow.yml`: unnest users, map, filter\n- `examples/aggregate_totals.yml`: group and aggregate totals\n- `examples/zip_basic.yml`: zip record streams by position\n\nMore flows live in `examples/`.\n\n## Migration notes\nSee `docs/UPGRADE.md` for breaking changes (options flattening, renamed fields).\n\n## Development\nPrerequisites:\n- Java 21\n- Docker\n\nRun tests:\n```sh\n./gradlew test\n```\n\nRun Kestra locally with the plugin:\n```sh\n./gradlew shadowJar \u0026\u0026 docker build -t kestra-custom . \u0026\u0026 docker run --rm -p 8080:8080 kestra-custom server local\n```\n\n## Benchmarks\nOpt-in benchmarks live in `src/test/java/io/kestra/plugin/transform/BenchTest.java`.\n\nExamples:\n```sh\n./gradlew test --tests io.kestra.plugin.transform.BenchTest -Dbench=true -Dbench.records=10000,100000 -Dbench.format=text\n./gradlew test --tests io.kestra.plugin.transform.BenchTest -Dbench=true -Dbench.records=10000,100000 -Dbench.format=binary\n```\n\nReports are written to `build/bench/report.md`.\n\n## Documentation\nKestra docs: https://kestra.io/docs  \nPlugin developer guide: https://kestra.io/docs/plugin-developer-guide\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fproddata%2Fplugin-transform","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fproddata%2Fplugin-transform","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fproddata%2Fplugin-transform/lists"}