{"id":19651275,"url":"https://github.com/embulk/embulk-filter-calcite","last_synced_at":"2025-04-28T16:31:23.243Z","repository":{"id":49502946,"uuid":"83230209","full_name":"embulk/embulk-filter-calcite","owner":"embulk","description":null,"archived":false,"fork":false,"pushed_at":"2021-06-16T11:51:05.000Z","size":226,"stargazers_count":14,"open_issues_count":5,"forks_count":5,"subscribers_count":12,"default_branch":"master","last_synced_at":"2025-04-20T00:47:51.899Z","etag":null,"topics":["calcite","embulk","embulk-filter-plugin"],"latest_commit_sha":null,"homepage":null,"language":"Java","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/embulk.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2017-02-26T18:16:30.000Z","updated_at":"2021-06-16T11:44:26.000Z","dependencies_parsed_at":"2022-09-01T21:01:39.770Z","dependency_job_id":null,"html_url":"https://github.com/embulk/embulk-filter-calcite","commit_stats":null,"previous_names":[],"tags_count":9,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/embulk%2Fembulk-filter-calcite","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/embulk%2Fembulk-filter-calcite/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/embulk%2Fembulk-filter-calcite/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/embulk%2Fembulk-filter-calcite/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/embulk","download_url":"https://codeload.github.com/embulk/embulk-filter-calcite/tar.gz/refs/heads/master","host":{"name":"G
itHub","url":"https://github.com","kind":"github","repositories_count":251345918,"owners_count":21574806,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["calcite","embulk","embulk-filter-plugin"],"created_at":"2024-11-11T15:05:55.546Z","updated_at":"2025-04-28T16:31:22.599Z","avatar_url":"https://github.com/embulk.png","language":"Java","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Apache Calcite filter plugin for Embulk\n\n[![Build Status](https://github.com/embulk/embulk-filter-calcite/workflows/Build%20and%20test/badge.svg)](https://github.com/embulk/embulk-filter-calcite/actions?query=workflow%3A%22Build+and+test%22)\n\n## Overview\n\n* **Plugin type**: filter\n\nThis plugin lets users transform rows flexibly with SQL queries that they specify.\n\n## Architecture\n\nThis plugin runs SQL queries against the Pages received from the input plugin and sends the query results to the next filter or output plugin as modified Pages. It uses [Apache Calcite](https://calcite.apache.org/), which provides a foundation for building databases and enables executing SQL queries against custom storage through a [custom adapter](https://calcite.apache.org/docs/tutorial.html). The plugin registers a Page storage adapter with Apache Calcite and then executes SQL queries against Pages via the JDBC driver that Calcite provides.\n\nHere is an example Embulk config for this plugin:\n\n```yaml\nfilters:\n  - type: calcite\n    query: SELECT * FROM $PAGES\n```\n\nUsers define a `SELECT` query with the `query` option in the filter config section. 
`$PAGES` represents the Pages that the input plugin creates and sends. The schema of `$PAGES` is the Embulk input schema. The output schema of the plugin, on the other hand, is built from the metadata of the query result. Embulk types are converted into Apache Calcite types internally. The type mapping between Embulk and Apache Calcite is as follows.\n\n| Embulk type | Apache Calcite type |      JDBC type      |\n| ----------- | ------------------- | ------------------- |\n| boolean     | BOOLEAN             | java.lang.Boolean   |\n| long        | BIGINT              | java.lang.Long      |\n| double      | DOUBLE              | java.lang.Double    |\n| timestamp   | TIMESTAMP           | java.sql.Timestamp  |\n| string      | VARCHAR             | java.lang.String    |\n| json        | VARCHAR             | java.lang.String    |\n\nData types by Apache Calcite: https://calcite.apache.org/docs/reference.html#data-types\n\n## Configuration\n\n- **query**: SQL to run (string, required)\n- **default_timezone**: The time zone used for the JDBC connection properties and the Calcite engine. This option is one of the [JDBC connect parameters](https://calcite.apache.org/docs/adapter.html#jdbc-connect-string-parameters) provided by Apache Calcite. Any of java.util.TimeZone's [AvailableIDs](http://docs.oracle.com/javase/7/docs/api/java/util/TimeZone.html#getAvailableIDs) can be specified. (string, default: 'UTC')\n- **options**: Extra JDBC properties. See [JDBC connect parameters](https://calcite.apache.org/docs/adapter.html#jdbc-connect-string-parameters). 
(hash, default: {})\n\n\n## Example\n\nThis config removes rows whose id is neither 1 nor 2 from Pages.\n```yaml\nfilters:\n  - type: calcite\n    query: SELECT * FROM $PAGES WHERE id IN (1, 2)\n```\n\nThe following example uses the LIKE operator to remove rows that do not match the specified pattern from Pages.\n```yaml\nfilters:\n  - type: calcite\n    query: SELECT * FROM $PAGES WHERE message LIKE '%EMBULK%'\n```\n\nThis config adds a new column whose value is the concatenation of two string columns.\n```yaml\nfilters:\n  - type: calcite\n    query: SELECT first_name || last_name AS name, * FROM $PAGES\n```\n\nThis config adds a new column populated by the CURRENT_TIMESTAMP function.\n```yaml\nfilters:\n  - type: calcite\n    query: SELECT CURRENT_TIMESTAMP, * FROM $PAGES\n    default_timezone: 'America/Los_Angeles'\n```\n\nSQL language provided by Apache Calcite: https://calcite.apache.org/docs/reference.html\n\n## Build\n\n```\n$ ./gradlew gem\n```\n\n## Release\n\n```\n$ ./gradlew gemPush\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fembulk%2Fembulk-filter-calcite","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fembulk%2Fembulk-filter-calcite","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fembulk%2Fembulk-filter-calcite/lists"}