{"id":49844055,"url":"https://github.com/ydb-platform/ydb-sqlglot-plugin","last_synced_at":"2026-05-14T08:44:48.483Z","repository":{"id":330176659,"uuid":"1121852048","full_name":"ydb-platform/ydb-sqlglot-plugin","owner":"ydb-platform","description":null,"archived":false,"fork":false,"pushed_at":"2026-05-06T08:51:29.000Z","size":231,"stargazers_count":2,"open_issues_count":1,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-06T10:44:55.482Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ydb-platform.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":"AGENTS.md","dco":null,"cla":null}},"created_at":"2025-12-23T16:56:49.000Z","updated_at":"2026-05-05T14:07:36.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/ydb-platform/ydb-sqlglot-plugin","commit_stats":null,"previous_names":["ydb-platform/ydb-sqlglot-plugin"],"tags_count":7,"template":false,"template_full_name":null,"purl":"pkg:github/ydb-platform/ydb-sqlglot-plugin","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ydb-platform%2Fydb-sqlglot-plugin","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ydb-platform%2Fydb-sqlglot-plugin/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ydb-platform%2Fydb-sqlglot-plugin/releases","manifests_url":"https://repos.ecosyste.ms/ap
i/v1/hosts/GitHub/repositories/ydb-platform%2Fydb-sqlglot-plugin/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ydb-platform","download_url":"https://codeload.github.com/ydb-platform/ydb-sqlglot-plugin/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ydb-platform%2Fydb-sqlglot-plugin/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33017709,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-13T13:14:54.681Z","status":"online","status_checked_at":"2026-05-14T02:00:06.663Z","response_time":57,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-05-14T08:44:47.682Z","updated_at":"2026-05-14T08:44:48.471Z","avatar_url":"https://github.com/ydb-platform.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# ydb-sqlglot-plugin\n\nYDB dialect plugin for [sqlglot](https://github.com/tobymao/sqlglot) — bidirectional transpilation between YDB/YQL and any SQL dialect.\n\n## Installation\n\n```bash\npip install ydb-sqlglot-plugin\n```\n\n## Usage\n\nAfter installing the package, the `ydb` dialect is available in sqlglot automatically — no extra imports needed:\n\n```python\nimport sqlglot\n\n# Any dialect → YDB\nresult = sqlglot.transpile(\"SELECT * FROM users WHERE id = 1\", read=\"mysql\", write=\"ydb\")[0]\n# → SELECT * FROM `users` WHERE id = 1\n\n# YDB → any 
dialect\nresult = sqlglot.transpile(\"$t = (SELECT id FROM users); SELECT * FROM $t AS t\", read=\"ydb\", write=\"postgres\")[0]\n# → WITH t AS (SELECT id FROM users) SELECT * FROM t AS t\n```\n\n## What the plugin does\n\n### Any SQL → YDB\n\n#### Table names\n\nDatabase-qualified names are rewritten to the YDB path format and wrapped in backticks:\n\n```sql\n-- input\nSELECT * FROM analytics.events\n\n-- output\nSELECT * FROM `analytics/events`\n```\n\n#### CTEs → YDB variables\n\n```sql\n-- input\nWITH active AS (SELECT * FROM users WHERE status = 'active')\nSELECT * FROM active\n\n-- output\n$active = (SELECT * FROM `users` WHERE status = 'active');\n\nSELECT * FROM $active AS active\n```\n\n#### Subquery decorrelation\n\nCorrelated subqueries (which YQL does not support) are rewritten as JOINs:\n\n```sql\n-- input\nSELECT id, (SELECT MAX(amount) FROM orders WHERE orders.user_id = users.id) AS max_order\nFROM users\n\n-- output\nSELECT users.id AS id, _u_0._u_2 AS max_order\nFROM `users`\nLEFT JOIN (\n    SELECT MAX(amount) AS _u_2, user_id AS _u_1\n    FROM `orders`\n    WHERE TRUE\n    GROUP BY user_id AS _u_1\n) AS _u_0 ON users.id = _u_0._u_1\n```\n\nThe same rewriting applies to `EXISTS`, `IN (subquery)`, and `ANY/ALL` subqueries.\n\n#### GROUP BY aliases\n\nYDB accepts aliases directly inside `GROUP BY` items. The generator uses this\nform for grouped columns so later clauses and decorrelated subqueries can refer\nto a stable grouping name:\n\n```sql\n-- input\nSELECT user_id, COUNT(*) FROM events GROUP BY user_id\n\n-- output\nSELECT user_id, COUNT(*) FROM `events` GROUP BY user_id AS user_id\n```\n\nIf a grouped column is selected under a generated alias, the `GROUP BY` item uses\nthat alias as well:\n\n```sql\nSELECT user_id AS _u_1, COUNT(*) FROM `events` GROUP BY user_id AS _u_1\n```\n\nPositional `GROUP BY` references are expanded before generation. 
When a\npositional reference points to a constant expression, the grouping item is\nremoved because YDB rejects grouping by constants.\n\n---\n\n### YDB → any SQL\n\nThe plugin parses YDB/YQL back into sqlglot's AST, enabling round-trips, YDB-to-YDB transformations, and transpilation to other dialects.\n\n#### Supported YQL constructs\n\n| Construct | Example |\n|---|---|\n| `$variable` references | `SELECT * FROM $t AS t` |\n| `Module::Function()` | `DateTime::GetYear(ts)` |\n| `DECLARE $p AS Type` | `DECLARE $p AS Int32` |\n| `FLATTEN [LIST\\|DICT\\|OPTIONAL] BY ...` / `FLATTEN COLUMNS` | `FROM t FLATTEN LIST BY col AS item`, `FROM t FLATTEN BY (a, b)`, `FROM t FLATTEN COLUMNS` |\n| `Optional\u003cT\u003e` / `T?` | `CAST(x AS Optional\u003cUtf8\u003e)` |\n| Container types | `CAST(x AS List\u003cInt32\u003e)`, `Dict\u003cUtf8, Int64\u003e`, `Set\u003cUtf8\u003e`, `Tuple\u003cInt32, Utf8\u003e` |\n| `ASSUME ORDER BY` | `SELECT * FROM t ASSUME ORDER BY id` |\n| `GROUP BY expr AS alias` / `GROUP COMPACT BY` | `SELECT v, COUNT(*) FROM t GROUP BY v AS v` |\n| `LEFT ONLY JOIN` | `SELECT * FROM a LEFT ONLY JOIN b USING (id)` |\n| `* WITHOUT (...)` projections | `SELECT b.* WITHOUT (b.id) FROM t AS b` |\n| Named expressions | `$t = (SELECT 1 AS x)` |\n| Lambda expressions | `($x, $y?) 
-\u003e ($x + COALESCE($y, 0))`, `($y) -\u003e { $p = \"x\"; RETURN $p \\|\\| $y }` |\n| YQL struct literals | `AsList(\u003c|user_id: \"u1\", description: NULL|\u003e)` |\n| `IN COMPACT` | `WHERE key IN COMPACT $values` |\n| `PRAGMA` | `PRAGMA AnsiImplicitCrossJoin` |\n| Table-valued functions | `SELECT * FROM AS_TABLE($Input) AS k` |\n| Table source options and index views | ``FROM `t` WITH TabletId='...'``, ``FROM `t` VIEW PRIMARY KEY v`` |\n| Function-valued expressions | `$grep(x)`, `DateTime::Format(\"%Y-%m-%d\")(ts)`, `Interval(\"P7D\")` |\n\nTable names without backticks are accepted on input; the generator always produces backtick-quoted output.\n\nThe parser also tolerates case variants that appear in real YQL dumps, such as\n`set\u003cUtf8\u003e`, `Tuple\u003cInt32, Utf8\u003e?`, and lowercase `return` in lambda blocks.\n\n#### CTEs reassembly\n\nYDB-style named expressions are automatically reassembled into standard `WITH` CTEs when targeting other dialects:\n\n```python\nydb_sql = \"$t = (SELECT 1 AS x); SELECT * FROM $t AS t\"\nparse_one(ydb_sql, dialect=\"ydb\").sql(dialect=\"postgres\")\n# → WITH t AS (SELECT 1 AS x) SELECT * FROM t AS t\n```\n\n---\n\n### Column lineage\n\nBecause YDB SQL is fully parsed into sqlglot's AST, column-level lineage works out of the box:\n\n```python\nfrom sqlglot.lineage import lineage\n\nnode = lineage(\"total\", \"$orders = (SELECT user_id, amount FROM orders); SELECT SUM(amount) AS total FROM $orders AS o\", dialect=\"ydb\")\nfor dep in node.walk():\n    print(dep.name, \"→\", dep.source)\n```\n\n---\n\n## Function reference\n\nFunctions below are recognized by sqlglot as standard SQL expressions and translated to their YQL equivalents. 
Dialect-specific functions that sqlglot does not parse into typed AST nodes are **passed through unchanged** — see [Limitations](#limitations).\n\n### Date / time\n\n| Input | YQL output |\n|---|---|\n| `DATE_TRUNC('day', x)` | `DATE(x)` |\n| `DATE_TRUNC('week', x)` | `DateTime::MakeDate(DateTime::StartOfWeek(x))` |\n| `DATE_TRUNC('month', x)` | `DateTime::MakeDate(DateTime::StartOfMonth(x))` |\n| `DATE_TRUNC('quarter', x)` | `DateTime::MakeDate(DateTime::StartOfQuarter(x))` |\n| `DATE_TRUNC('year', x)` | `DateTime::MakeDate(DateTime::StartOfYear(x))` |\n| `EXTRACT(WEEK FROM x)` | `DateTime::GetWeekOfYear(x)` |\n| `EXTRACT(MONTH FROM x)` | `DateTime::GetMonth(x)` |\n| `EXTRACT(YEAR FROM x)` | `DateTime::GetYear(x)` |\n| `CURRENT_TIMESTAMP` | `CurrentUtcTimestamp()` |\n| `STR_TO_DATE(str, fmt)` / `TO_DATE(str, fmt)` | `DateTime::MakeTimestamp(DateTime::Parse(fmt)(str))` |\n| `DATE_ADD(x, INTERVAL n MONTH)` | `DateTime::MakeDate(DateTime::ShiftMonths(x, n))` |\n| `DATE_ADD(x, INTERVAL n YEAR)` | `DateTime::MakeDate(DateTime::ShiftYears(x, n))` |\n| `DATE_ADD(x, INTERVAL n DAY)` | `x + DateTime::IntervalFromDays(n)` |\n| `DATE_ADD(x, INTERVAL n HOUR)` | `x + DateTime::IntervalFromHours(n)` |\n| `DATE_ADD(x, INTERVAL n MINUTE)` | `x + DateTime::IntervalFromMinutes(n)` |\n| `DATE_ADD(x, INTERVAL n SECOND)` | `x + DateTime::IntervalFromSeconds(n)` |\n| `DATE_SUB(x, INTERVAL n ...)` | same as `DATE_ADD` with `−` |\n| `INTERVAL n DAY` (literal) | `DateTime::IntervalFromDays(n)` |\n| `INTERVAL n HOUR` (literal) | `DateTime::IntervalFromHours(n)` |\n| `INTERVAL n MINUTE` (literal) | `DateTime::IntervalFromMinutes(n)` |\n| `INTERVAL n SECOND` (literal) | `DateTime::IntervalFromSeconds(n)` |\n| `Interval(\"P7D\")` (YQL input) | passed through unchanged |\n| `dateDiff('minute', a, b)` | `(CAST(b AS Int64) - CAST(a AS Int64)) / 60000000` |\n| `dateDiff('hour', a, b)` | `(CAST(b AS Int64) - CAST(a AS Int64)) / 3600000000` |\n| `dateDiff('day', a, b)` | `(CAST(b AS Int64) - CAST(a 
AS Int64)) / 86400000000` |\n| `dateDiff('week', a, b)` | `(CAST(b AS Int64) - CAST(a AS Int64)) / 604800000000` |\n\n\u003e **Note on `dateDiff`:** YDB stores `Timestamp` as microseconds since epoch. The formula above gives exact integer units assuming both arguments are `Timestamp`. Results for `Date`-typed columns will differ.\n\n### Strings\n\n| Input | YQL output |\n|---|---|\n| `CONCAT(a, b, ...)` | `a \\|\\| b \\|\\| ...` |\n| `UPPER(x)` | `Unicode::ToUpper(x)` |\n| `LOWER(x)` | `Unicode::ToLower(x)` |\n| `LENGTH(x)` / `CHAR_LENGTH(x)` | `Unicode::GetLength(x)` |\n| `POSITION(sub IN x)` / `STRPOS(x, sub)` | `Find(x, sub)` |\n| `STRING_TO_ARRAY(x, delim)` | `String::SplitToList(x, delim)` |\n| `ARRAY_TO_STRING(arr, delim)` | `String::JoinFromList(arr, delim)` |\n\n### Arrays / collections\n\n| Input | YQL output |\n|---|---|\n| `ARRAY(v1, v2, ...)` | `AsList(v1, v2, ...)` |\n| `ARRAY_LENGTH(x)` / `ARRAY_SIZE(x)` | `ListLength(x)` |\n| `ARRAY_FILTER(arr, x -\u003e cond)` | `ListFilter(arr, ($x) -\u003e (cond))` |\n| `ARRAY_ANY(arr, x -\u003e cond)` | `ListHasItems(ListFilter(arr, ($x) -\u003e (cond)))` |\n| `ARRAY_AGG(x)` | `AGGREGATE_LIST(x)` |\n| `UNNEST(x)` | `FLATTEN BY x` |\n\nLambda expressions are represented with sqlglot's standard `exp.Lambda` AST node.\nWhen a source dialect parses lambdas, the YDB generator emits YQL lambda syntax:\n\n```sql\n-- DuckDB input\nSELECT list_filter(arr, x -\u003e x \u003e 0) FROM t\n\n-- YDB output\nSELECT ListFilter(arr, ($x) -\u003e ($x \u003e 0)) FROM `t`\n```\n\nYDB input also supports documented YQL lambda forms, including optional\narguments and block bodies with local named expressions:\n\n```sql\n($x, $y?) 
-\u003e ($x + COALESCE($y, 0));\n($y) -\u003e { $prefix = \"x\"; RETURN $prefix || $y; };\n```\n\nClickHouse `ARRAY JOIN` and simple `arrayJoin(...)` projections, and PostgreSQL\n`LATERAL unnest(...)`, are converted to YDB `FLATTEN BY` when the operation is\ndirectly tied to the source table.\n\n### Conditional / math\n\n| Input | YQL output |\n|---|---|\n| `NULLIF(x, y)` | `IF(x = y, NULL, x)` |\n| `ROUND(x, n)` | `Math::Round(x, -n)` |\n| `COUNT()` *(zero-argument form)* | `COUNT(*)` |\n\n### JSON\n\n| Input | YQL output |\n|---|---|\n| `jsonb_col @\u003e value` (PostgreSQL) | `Yson::Contains(jsonb_col, value)` |\n\nYDB JSON functions are parsed and round-tripped, including `PASSING`,\n`RETURNING`, wrapper modes, and `ON EMPTY` / `ON ERROR` clauses:\n\n```sql\nJSON_VALUE(payload, '$.value + $delta' PASSING 1 AS delta RETURNING Int64 DEFAULT 0 ON EMPTY ERROR ON ERROR)\nJSON_QUERY(payload, '$.items' WITH CONDITIONAL ARRAY WRAPPER NULL ON EMPTY ERROR ON ERROR)\nJSON_EXISTS(payload, '$.items[$Index]' PASSING 0 AS \"Index\" FALSE ON ERROR)\n```\n\nJSON paths can contain quoted keys, for example\n`JSON_EXISTS(item_result, \"$.'P_008 device playback test'\")`.\n\n---\n\n## Type mapping\n\n### Standard SQL → YDB\n\n| SQL type | YDB type |\n|---|---|\n| `TINYINT` | `Int8` |\n| `SMALLINT` | `Int16` |\n| `INT` / `INTEGER` | `Int32` |\n| `BIGINT` | `Int64` |\n| `FLOAT` | `Float` |\n| `DOUBLE` / `DOUBLE PRECISION` | `Double` |\n| `DECIMAL(p, s)` | `Decimal(p, s)` |\n| `BOOLEAN` / `BIT` | `Uint8` |\n| `TIMESTAMP` | `Timestamp` |\n| `VARCHAR` / `NVARCHAR` / `CHAR` / `TEXT` | `Utf8` |\n| `BLOB` / `BINARY` / `VARBINARY` | `String` |\n\n### YDB types → standard SQL\n\n| YDB type | Standard SQL | Postgres | ClickHouse |\n|---|---|---|---|\n| `Utf8` | `TEXT` | `TEXT` | `String` |\n| `String` | `BLOB` | `BYTEA` | `String` |\n| `Int32` | `INT` | `INT` | `Int32` |\n| `Int64` | `BIGINT` | `BIGINT` | `Int64` |\n| `Optional\u003cT\u003e` | `T` (nullable) | `T` | `Nullable(T)` |\n| 
`List\u003cT\u003e` | `LIST\u003cT\u003e` | `LIST\u003cT\u003e` | `Array(T)` |\n| `Dict\u003cK,V\u003e` | `MAP\u003cK,V\u003e` | `MAP\u003cK,V\u003e` | `Map(K,V)` |\n| `Tuple\u003cT1,T2\u003e` | `STRUCT\u003c...\u003e` | `STRUCT\u003c...\u003e` | `Tuple(T1,T2)` |\n\n---\n\n## Limitations\n\n### Dialect-specific functions\n\nFunctions that sqlglot does not parse into typed AST nodes are passed through unchanged and must be replaced manually. Common examples from ClickHouse: `now()`, `today()`, `parseDateTimeBestEffort()`, `toDate()`, `toFloat64()`, `toString()`, `countDistinct()`, `groupArray()`.\n\n### Correlated subqueries in DML\n\nCorrelated subqueries inside `UPDATE` or `INSERT` statements cannot be automatically decorrelated — YDB does not support them natively, and rewriting requires knowledge of the table's primary key. Rewrite manually using a `$variable`:\n\n```sql\n-- not supported (will raise an error)\nUPDATE t SET col = (SELECT val FROM other WHERE other.id = t.id)\n\n-- workaround\n$vals = (SELECT id, val FROM other);\nUPDATE t SET col = (SELECT val FROM $vals WHERE id = t.id)\n```\n\nCorrelated subqueries inside `SELECT` are handled automatically via JOIN rewriting.\n\n### `dateDiff` with month granularity\n\n`dateDiff('month', a, b)` has no exact equivalent in YDB because months have variable length. 
Use `DateTime::ShiftMonths` for date arithmetic instead.\n\n### YDB-specific types in other dialects\n\n`Uint8`/`Uint16`/`Uint32`/`Uint64` and YDB-specific container types (`Struct\u003c...\u003e`, `Variant\u003c...\u003e`, `Enum\u003c...\u003e`) do not have direct equivalents in standard SQL and are passed through as-is when targeting other dialects.\n\n---\n\n## Development\n\n```bash\ngit clone https://github.com/ydb-platform/ydb-sqlglot-plugin.git\ncd ydb-sqlglot-plugin\npython -m venv .venv \u0026\u0026 source .venv/bin/activate\npip install -e \".[dev]\"\npython -m pytest tests/\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fydb-platform%2Fydb-sqlglot-plugin","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fydb-platform%2Fydb-sqlglot-plugin","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fydb-platform%2Fydb-sqlglot-plugin/lists"}