{"id":50419038,"url":"https://github.com/tobilg/quacklake","last_synced_at":"2026-05-31T07:30:51.704Z","repository":{"id":360504177,"uuid":"1250388843","full_name":"tobilg/quacklake","owner":"tobilg","description":"A DuckLake data catalog based on quack, deployed to Cloudflare","archived":false,"fork":false,"pushed_at":"2026-05-26T16:57:57.000Z","size":1687,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-26T18:27:11.909Z","etag":null,"topics":["cloudflare","data-catalog","duckdb","ducklake","durable-objects","quack"],"latest_commit_sha":null,"homepage":"","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/tobilg.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-05-26T15:25:08.000Z","updated_at":"2026-05-26T16:58:01.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/tobilg/quacklake","commit_stats":null,"previous_names":["tobilg/quacklake"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/tobilg/quacklake","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tobilg%2Fquacklake","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tobilg%2Fquacklake/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tobilg%2Fquacklake/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub
/repositories/tobilg%2Fquacklake/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/tobilg","download_url":"https://codeload.github.com/tobilg/quacklake/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tobilg%2Fquacklake/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33723548,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-05-31T02:00:06.040Z","response_time":95,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cloudflare","data-catalog","duckdb","ducklake","durable-objects","quack"],"created_at":"2026-05-31T07:30:50.813Z","updated_at":"2026-05-31T07:30:51.690Z","avatar_url":"https://github.com/tobilg.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"![Quacklake](images/quacklake-compressed.png)\n\n# quacklake\n\n`quacklake` is a Cloudflare Workers / Durable Objects service that speaks DuckDB's experimental Quack HTTP protocol and stores DuckLake catalog metadata in Durable Object SQLite storage.\n\nThe Worker exposes one public Quack endpoint at `/quack`. Clients authenticate by sending a JWT as the Quack auth string. 
A valid JWT resolves to one catalog Durable Object and one normalized principal, then quacklake applies the catalog's server-side authorization policy before executing catalog SQL.\n\n## Status\n\nThis is an alpha implementation. It is useful for protocol integration work, local Worker tests, and R2-backed DuckLake metadata smoke tests. It is not a full DuckDB server.\n\nImplemented:\n\n- Quack binary request/response transport through `POST /quack`.\n- `CONNECTION_REQUEST`, `PREPARE_REQUEST`, `FETCH_REQUEST`, `APPEND_REQUEST`, and `DISCONNECT_MESSAGE`.\n- JWT-only catalog authentication.\n- First-party HS256 quacklake JWT credentials.\n- Third-party OIDC JWT verification through configured providers and JWKS.\n- Catalog auth mappings that select a catalog for verified OIDC principals.\n- Catalog auth policies that authorize SQL and append requests before execution.\n- SQLite-backed query execution with DuckDB-style compatibility rewrites.\n- Planned R2-backed DuckLake `DATA_PATH` assignment and enforcement per catalog.\n- R2-backed DuckLake file discovery for orphan cleanup.\n- Optional trusted-client R2 data leases for catalogs created with `dataAccessMode: \"trusted_client\"`.\n- Result materialization into Quack `DataChunk`s using `@quack-protocol/sdk`.\n- Basic explicit transaction emulation with snapshot restore on `ROLLBACK`.\n- Worker integration tests through the published `@quack-protocol/sdk` client.\n- OpenAPI v3 Admin API document at `GET /api-docs`.\n\nNot implemented as full DuckDB semantics:\n\n- Complete DuckDB SQL parser or optimizer behavior.\n- Arbitrary DuckDB functions and table functions.\n- Cross-session transactional conflict detection.\n- Complete DuckLake test-suite coverage.\n- OPA/Rego policy execution.\n- A server-side data gateway. 
Trusted-client leases grant raw R2 object access under the catalog data path and do not enforce row or column policy at the storage layer.\n\n## Guides\n\nThe README is intentionally a short project entry point. Detailed operational docs live in `guides/`:\n\n- [Getting Started Guide](./guides/getting-started.md): simplest production-style Cloudflare deployment with one R2 bucket, first-party JWT auth, default `catalog_only` access, and a DuckDB CLI smoke query.\n- [Authn/Authz Guide](./guides/authn-authz.md): JWT-only authentication, first-party credentials, OIDC providers, catalog mappings, catalog policies, policy cookbook, explain output, and troubleshooting.\n- [Cognito End-To-End Guide](./guides/cognito-e2e.md): Cognito user-pool setup, group-based permission profiles, catalog mapping, row and column policy examples, and end-user DuckLake querying.\n- [Microsoft Entra ID End-To-End Guide](./guides/entraid-e2e.md): Entra app registration, group and app-role permission profiles, catalog mapping, row and column policy examples, and end-user DuckLake querying.\n- [Quack, DuckLake, And R2 Guide](./guides/quack-ducklake.md): DuckDB Quack secrets, SDK usage, DuckLake attachment, planned R2 `DATA_PATH` enforcement, trusted-client R2 leases, R2 bucket listing, diagnostics, and file inventory endpoints.\n- [Local Development And Configuration Guide](./guides/local-development.md): dependencies, Wrangler configuration, local secrets, development commands, local Worker health checks, and OpenAPI discovery.\n\nThe machine-readable Admin API reference is served by a running Worker:\n\n```sh\ncurl http://localhost:8787/api-docs\n```\n\n## Architecture\n\nThere are two Durable Object classes:\n\n- `CatalogRegistry`: global registry for catalog ids, first-party credential metadata, OIDC providers, catalog auth mappings, and catalog auth policies.\n- `QuackCatalogObject`: one SQLite-backed Durable Object database per catalog.\n\nRequest flow:\n\n1. 
A client sends a Quack `CONNECTION_REQUEST` with a JWT auth string.\n2. The Worker asks `CatalogRegistry` to verify and resolve the JWT.\n3. The JWT resolves to one catalog id, one `QuackCatalogObject`, one normalized principal, and the current catalog policy version.\n4. The catalog object opens a session and stores the auth context.\n5. The Worker signs `{ catalogId, sessionId }` into the public Quack `connection_id`.\n6. Later Quack messages include that signed connection id and route directly to the catalog Durable Object.\n7. `PREPARE_REQUEST` and `APPEND_REQUEST` are authorized against the stored session principal and policy before execution.\n\n## Project Layout\n\n- `src/index.ts`: Worker fetch handler, `/quack`, CORS, `/api-docs`, and `/admin/*` routes.\n- `src/openapi.ts`: OpenAPI v3 Admin API document.\n- `src/registry.ts`: catalog, credential, OIDC provider, mapping, and policy registry Durable Object.\n- `src/auth.ts`: shared authentication, mapping, policy, and session auth types.\n- `src/authz.ts`: SQL classification and internal policy evaluator.\n- `src/catalog.ts`: Quack protocol Durable Object and per-session auth enforcement.\n- `src/sql-compat.ts`: SQL execution orchestration, session state, schema tracking, transactions, and result chunking.\n- `src/ducklake-metadata.ts`: DuckLake-specific catalog query and migration shims that SQLite cannot execute directly.\n- `src/sql-rewrite.ts`: DuckDB-to-SQLite SQL text rewrites and column-definition parsing.\n- `src/sql-names.ts`: schema-qualified identifier normalization helpers.\n- `src/sql-types.ts`: shared SQL execution, result, schema, and transaction snapshot types.\n- `src/quack-values.ts`: value and logical type conversion between SQLite and Quack.\n- `src/crypto.ts`: signed connection ids and constant-time comparisons.\n- `test/quack-worker.test.ts`: Worker integration tests through `QuackClient`.\n- `test/auth.test.ts`: JWT, OIDC, mapping, policy, protocol, OpenAPI, and explain tests.\n- 
`test/authz.test.ts`: SQL authorization classifier and policy evaluator tests.\n- `test/file-listing.test.ts`: R2/file-listing helper tests.\n- `test/quack-values.test.ts`: Quack value and logical type conversion tests.\n- `guides/`: focused operator and developer guides.\n- `scripts/create-jwt.sh`: creates a first-party personal JWT and installs a broad personal catalog policy.\n- `scripts/setup-cognito.sh`: creates Cognito user-pool resources for OIDC smoke tests and registration.\n- `scripts/register-cognito-idp.sh`: registers Cognito as a quacklake OIDC provider and installs group-based mapping/policy rules.\n- `wrangler.example.jsonc`: tracked Worker, Durable Object, R2, migration, and runtime configuration template.\n\n## Quick Start\n\nInstall dependencies:\n\n```sh\npnpm install\n```\n\nCreate `.dev.vars` for local development:\n\n```dotenv\nADMIN_TOKEN=admin-test-token\nQUACKLAKE_JWT_SECRET=replace-with-long-random-local-jwt-secret\nCONNECTION_SIGNING_SECRET=replace-with-long-random-local-signing-secret\n```\n\nRun checks:\n\n```sh\npnpm run typecheck\npnpm run test\npnpm run test:coverage\n```\n\nRun the Worker locally:\n\n```sh\npnpm run dev\n```\n\nHealth check:\n\n```sh\ncurl http://localhost:8787/\n```\n\nExpected shape:\n\n```json\n{\n  \"name\": \"quacklake\",\n  \"protocol\": \"quack\",\n  \"endpoint\": \"/quack\",\n  \"apiDocs\": \"/api-docs\"\n}\n```\n\nCreate a local catalog and first-party JWT credential:\n\n```sh\ncurl -s -X POST http://localhost:8787/admin/catalogs \\\n  -H 'Authorization: Bearer admin-test-token' \\\n  -H 'Content-Type: application/json' \\\n  -d '{\"catalogId\":\"default\",\"scopes\":[\"catalog.admin\"]}'\n```\n\nInstall a permissive bootstrap policy for local setup:\n\n```sh\ncurl -s -X PUT http://localhost:8787/admin/catalogs/default/auth-policy \\\n  -H 'Authorization: Bearer admin-test-token' \\\n  -H 'Content-Type: application/json' \\\n  -d '{\"version\":1,\"defaultEffect\":\"allow\",\"rules\":[]}'\n```\n\nFor 
production policies, OIDC, and troubleshooting, use [Authn/Authz Guide](./guides/authn-authz.md).\n\n## Configuration Summary\n\n`wrangler.example.jsonc` shows the Worker configuration shape. Copy it to `wrangler.jsonc` and edit the local copy before running Wrangler commands.\n\nImportant runtime vars:\n\n- `QUACK_FETCH_ROWS_PER_CHUNK`: default `1024`.\n- `QUACK_FETCH_CHUNKS_PER_BATCH`: default `12`.\n- `QUACKLAKE_JWT_ISSUER`: first-party JWT issuer, default `quacklake`.\n- `QUACKLAKE_JWT_AUDIENCE`: first-party JWT audience, default `quacklake:quack`.\n- `QUACKLAKE_JWT_DEFAULT_TTL_SECONDS`: first-party credential lifetime, default one year.\n- `DUCKLAKE_R2_BINDINGS`: JSON map from DuckLake bucket name to Worker R2 binding name, for example `{\"\u003cbucket-name\u003e\":\"DUCKLAKE_R2\"}`. Every usable DuckLake data bucket must also appear in `wrangler.jsonc` `r2_buckets`.\n\nTrusted-client lease vars, only needed when using `dataAccessMode: \"trusted_client\"`:\n\n- `DUCKLAKE_R2_DATA_LEASE_TTL_SECONDS`: trusted-client R2 data lease TTL, clamped to 30-120 seconds. 
Default `60`.\n- `R2_ACCOUNT_ID`: Cloudflare account id used when locally signing R2 temporary credentials.\n- `R2_ENDPOINT`: S3-compatible R2 endpoint, for example `https://\u003caccount-id\u003e.r2.cloudflarestorage.com`.\n\nRuntime secrets:\n\n- `ADMIN_TOKEN`: bearer secret required for every `/admin/*` route.\n- `QUACKLAKE_JWT_SECRET`: HS256 signing key for first-party quacklake JWT credentials.\n- `CONNECTION_SIGNING_SECRET`: HMAC secret used to sign Quack connection ids.\n- `R2_ACCESS_KEY_ID`: parent R2 S3 access key id used only when issuing trusted-client data leases.\n- `R2_SECRET_ACCESS_KEY`: parent R2 S3 secret access key used only when issuing trusted-client data leases.\n\nFor a deployed Worker using the default name:\n\n```sh\npnpm exec wrangler secret put ADMIN_TOKEN --name quacklake\npnpm exec wrangler secret put QUACKLAKE_JWT_SECRET --name quacklake\npnpm exec wrangler secret put CONNECTION_SIGNING_SECRET --name quacklake\n```\n\nIf you enable `dataAccessMode: \"trusted_client\"` for any catalog, also set the parent R2 S3 credentials used for local temporary-credential signing:\n\n```sh\npnpm exec wrangler secret put R2_ACCESS_KEY_ID --name quacklake\npnpm exec wrangler secret put R2_SECRET_ACCESS_KEY --name quacklake\n```\n\nThe value passed to admin calls, including `scripts/create-jwt.sh --admin-token`, must exactly match the deployed `ADMIN_TOKEN` secret.\n\nSee [Local Development And Configuration Guide](./guides/local-development.md) for local and deployed secret setup.\n\n## Client Usage Summary\n\nUse a JWT as the Quack secret token value:\n\nUse the `core_nightly` DuckDB extension builds for `quack` and `ducklake`; those\nbuilds contain bugfixes required by the current quacklake workflows.\n\n```sql\nFORCE INSTALL quack FROM core_nightly;\nFORCE INSTALL ducklake FROM core_nightly;\nLOAD quack;\nLOAD ducklake;\n\nCREATE OR REPLACE SECRET quacklake_catalog (\n  TYPE quack,\n  TOKEN '\u003cjwt\u003e',\n  SCOPE 
'quack:\u003cworker-host\u003e:443'\n);\n```\n\n`POST /admin/catalogs` assigns the catalog a planned `DATA_PATH` of `r2://\u003cbucket\u003e/catalogs/\u003ccatalogId\u003e/` and returns `ducklake.secretSql` and `ducklake.attachSql` for copy/paste bootstrap. `ducklake.secretSql` contains the one-time-visible JWT and must be treated as secret material.\n\nFor DuckLake, create a separate storage secret scoped to the planned bucket and prefix:\n\n```sql\nCREATE OR REPLACE SECRET lake_r2 (\n  TYPE s3,\n  PROVIDER config,\n  KEY_ID '\u003cr2-access-key-id\u003e',\n  SECRET '\u003cr2-secret-access-key\u003e',\n  ENDPOINT '\u003caccount-id\u003e.r2.cloudflarestorage.com',\n  URL_STYLE 'path',\n  REGION 'auto',\n  SCOPE 'r2://\u003cbucket\u003e/catalogs/\u003ccatalogId\u003e/'\n);\n\nATTACH 'ducklake:quack:\u003cworker-host\u003e:443' AS lake (\n  DATA_PATH 'r2://\u003cbucket\u003e/catalogs/\u003ccatalogId\u003e/'\n);\n```\n\nManual storage secrets are still the default `catalog_only` setup. For trusted clients, create the catalog with `dataAccessMode: \"trusted_client\"` and call `POST /catalog/data-lease` with the same catalog JWT to receive short-lived R2 credentials for the planned catalog `DATA_PATH`.\n\nFor server-side DuckLake maintenance paths such as `read_blob()` orphan discovery, and for validating trusted-client lease paths, the Worker also needs an R2 bucket binding mapped through `DUCKLAKE_R2_BINDINGS`. 
See [Quack, DuckLake, And R2 Guide](./guides/quack-ducklake.md) for Worker R2 binding setup, client storage secrets, trusted-client leases, R2 bucket listing, diagnostics, and file inventory examples.\n\n## Notes\n\n- Keep one catalog id per independent DuckLake `DATA_PATH`.\n- Additional credentials for a catalog are credential rotations or app-specific credentials; they do not create a new metadata store.\n- Signed connection ids depend on `CONNECTION_SIGNING_SECRET`; rotating it invalidates all active client sessions.\n- First-party credentials depend on `QUACKLAKE_JWT_SECRET`; rotating it requires credential reissue.\n- OPA/Rego is intentionally not implemented in v1, but the internal explain input/output shape is OPA-compatible enough to support a future OPA Wasm backend.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftobilg%2Fquacklake","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftobilg%2Fquacklake","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftobilg%2Fquacklake/lists"}