{"id":51121813,"url":"https://github.com/tegmentum/ducklink","last_synced_at":"2026-06-25T03:30:31.447Z","repository":{"id":366721628,"uuid":"1066727202","full_name":"tegmentum/ducklink","owner":"tegmentum","description":null,"archived":false,"fork":false,"pushed_at":"2026-06-23T02:42:56.000Z","size":5116,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-06-23T04:20:21.584Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/tegmentum.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":"AGENTS.md","dco":null,"cla":null}},"created_at":"2025-09-29T21:54:49.000Z","updated_at":"2026-06-23T02:43:00.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/tegmentum/ducklink","commit_stats":null,"previous_names":["tegmentum/ducklink"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/tegmentum/ducklink","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tegmentum%2Fducklink","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tegmentum%2Fducklink/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tegmentum%2Fducklink/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tegmentum%2Fducklink/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/tegmentum","download_url":"https://codeload.github.com/tegmentum/ducklink/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tegmentum%2Fducklink/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34758773,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-25T02:00:05.521Z","response_time":101,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-06-25T03:30:30.549Z","updated_at":"2026-06-25T03:30:31.433Z","avatar_url":"https://github.com/tegmentum.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cp align=\"center\"\u003e\n  \u003cimg src=\"docs/assets/ducklink_logo.png\" alt=\"DuckLink\" width=\"360\"\u003e\n\u003c/p\u003e\n\n# DuckDB WebAssembly Components\n\nThis repository contains a pair of WebAssembly components that wrap the DuckDB C API (`libduckdb`) and expose it through the Wasm component model.\n\n- `ducklink-core`: Implements the `duckdb:component/database` world and provides structured access to DuckDB connections and SQL execution.\n- `ducklink-cli`: Implements the `wasi:cli/run` world and offers a WASI-native command line interface that mirrors the behaviour of the native DuckDB shell while delegating database access through the component interface.\n\nBoth components are intended to run in preview2-capable runtimes such as `wasmtime 16.0+`.\n\n## Extension catalog\n\nThe repo also ships **111 component extensions** (254 SQL functions) — Rust\n`wasm32-wasip2` components implementing the `duckdb:extension` WIT world, loadable\nat runtime with `LOAD \u003cname\u003e` and verified by `tooling/smoke.py`. They span text\n\u0026 NLP, encodings, crypto, aggregates (bloom/minhash/count-min sketches), and gated\nnetwork (dns/http). See **[CATALOG.md](CATALOG.md)** for the full index\n(regenerate with `python3 tooling/gen-catalog.py`; verify integrity with\n`python3 tooling/verify-catalog.py`).\n\n## Repository layout\n\n```\nwit/\n  core/                Shared database interface definitions\n  standalone/          WASI-oriented worlds (standalone DB + CLI)\n  browser/             Browser-oriented database world\ncrates/\n  libduckdb-sys/       bindgen-based bindings to the DuckDB C API\n  ducklink-core/ Component implementation of the DuckDB API\n  ducklink-cli/ WASI CLI component built on top of the exported API\nscripts/\n  build-libduckdb-wasm.sh  Helper for cross-compiling DuckDB to wasm32-wasi\ncmake/toolchains/\n  wasi-sdk.cmake       Toolchain file for building DuckDB with wasi-sdk\n```\n\n## Prerequisites\n\n1. **DuckDB source** at `DUCKDB_SOURCE_DIR` (e.g. `~/src/duckdb`). A shallow clone is sufficient:\n   ```bash\n   git clone https://github.com/duckdb/duckdb.git ~/src/duckdb\n   ```\n2. **wasi-sdk** (tested with 33.0; exception handling requires \u003e= 33) with `WASI_SDK_PREFIX` pointing at the installation root. A predownloaded copy lives under `external/wasi-sdk-33.0-\u003cplatform\u003e`; point the variable there if you do not have a global install.\n3. **Rust tooling**:\n   - `rustup target add wasm32-wasi`\n   - `cargo install cargo-component`\n4. **wit-bindgen tooling** (included automatically by `cargo-component`).\n\nNetwork access is required only when fetching DuckDB or installing the toolchain.\n\n## Building `libduckdb` for wasm32-wasi\n\nThe component links against a statically built `libduckdb` compiled for `wasm32-wasi`. Use the helper script to cross-compile the library:\n\n```bash\nexport DUCKDB_SOURCE_DIR=~/src/duckdb\nexport WASI_SDK_PREFIX=\"$(pwd)/external/wasi-sdk-33.0-arm64-macos\"\nexport WASI_TARGET_TRIPLE=wasm32-wasip2\nexport WASM_EXTENSIONS=json  # defaults to json if unset; comma‑separate to add more later\nscripts/build-libduckdb-wasm.sh\n```\n\nThe script places `libduckdb-wasi.a` under `artifacts/`. Afterwards set the following environment variables so the Rust build can locate the headers and the archive:\n\n```bash\nexport DUCKDB_INCLUDE_DIR=\"$DUCKDB_SOURCE_DIR/src/include\"\nexport DUCKDB_STATIC_LIB=\"$(pwd)/artifacts/libduckdb-wasi.a\"\n```\n\n### Browser-oriented static library\n\nFor the browser component you will need a DuckDB archive compiled for the appropriate `wasm32-unknown-unknown` (or equivalent) target. Once built, point `DUCKDB_STATIC_LIB` at that archive and use the `make core-browser` target to produce `ducklink_core.wasm` with the `browser` feature enabled.\n\n## Building the components\n\nCompile both components using the make targets (they call `cargo component` under the hood):\n\n```bash\nmake\n```\n\nIndividual targets are also available:\n\n```bash\nmake core\nmake ducklink-cli\n\n# Build the browser-oriented core (requires a browser-compatible DuckDB static archive)\nmake core-browser BROWSER_TARGET=wasm32-unknown-unknown\n```\n\nThe resulting component binaries are generated in `target/wasm32-wasi/release/`:\n\n- `ducklink_core.wasm`\n- `ducklink_cli.wasm`\n\n## Developing component extensions\n\nExtensions live under `extensions/\u003cname\u003e-component`, register imperatively in\n`load()` against the `duckdb:extension` world, and are tracked by the tooling in\n`tooling/` + `registry/` (mirrors `~/git/sqlite-wasm`'s system). The full\nroadmap is in [PLAN-duckdb-extensions.md](PLAN-duckdb-extensions.md).\n\n```bash\n# Scaffold a skeleton (consults tooling/compat-registry.json for crate status,\n# registers the workspace member, and cargo-checks that it compiles):\nmake ext-scaffold NAME=myext CRATE=base32,bs58\n\n# Edit extensions/myext-component/src/lib.rs + smoke.sql, then build + smoke:\nmake ext NAME=myext-component\n\n# Seed assertions from current output, review, and re-run to assert:\npython3 tooling/smoke.py --seed-expected myext\npython3 tooling/smoke.py myext\n\nmake ext-smoke-all        # smoke every extension\nmake ext-list-broken      # crates flagged un-buildable on wasm32-wasip2\npython3 tooling/t-status.py   # tooling-improvement items from build experience\n```\n\nExtensions load through the **native host runner** (`ducklink`); the\nwac-composed standalone CLI links a no-op loader stub and cannot instantiate\nthem. `isin` (hand-rolled) and `baseN` (crate-backed) are worked examples. See\n[docs/component-extension-guide.md](docs/component-extension-guide.md) for the\ncapability surface and packaging details.\n\n## Using the components\n\n### Component worlds\n\n- `wit/core/duckdb-core.wit` defines the shared `duckdb:component/database` interface implemented by the core component.\n- `wit/standalone/duckdb-standalone.wit` exports the database world for WASI runtimes, while `wit/standalone/duckdb-cli.wit` wires in the CLI experience on top of it.\n- `wit/browser/duckdb-browser.wit` will back the browser-friendly component variant, sharing the same database surface but relying on host-provided storage and networking.\n\n### Direct database access\n\nInstantiate the database component with a runtime that supports the component model. For example, using `wasmtime`:\n\n```bash\nwasmtime component run target/wasm32-wasi/release/ducklink_core.wasm --dir .\n```\n\nPre-open directories that contain database files (e.g. `--dir .`) so the component can access them via WASI.\n\n### CLI component\n\nThe CLI component imports the database world and exposes a `wasi:cli` entry point. To run it with wasmtime you can compose the CLI and core components using the [`wac`](https://github.com/bytecodealliance/wac) tool:\n\n```bash\n# Install the wac CLI once\ncargo install wac-cli\n\n# Compose the CLI + core component pair\nwac plug target/wasm32-wasip2/release/ducklink_cli.wasm \\\n  --plug target/wasm32-wasip2/release/ducklink_core.wasm \\\n  -o artifacts/duckdb-cli.wasm\n\n# Execute a query (grant directory access for any on-disk database file)\nwasmtime run artifacts/duckdb-cli.wasm --dir . -- :memory: -c \"select 42;\"\n```\n\nFor quick validation there is also a helper script that performs the `wac plug`\nstep and executes a simple query:\n\n```bash\nscripts/smoke-cli.sh\n```\n\nThe script accepts optional environment variables (`SQL`, `DB_PATH`, `EXTRA_WASMTIME_FLAGS`, `EXTENSIONS`)\nto tailor the smoke test.\nFor example, set `EXTENSIONS=\"sample_extension\"` to pass `--load-extension sample_extension`\nto the CLI before the query runs.\n\nThe CLI supports:\n\n- Connecting to a database file or running purely in-memory (`ducklink_cli.wasm :memory:`)\n- Executing a single command via `-c \"SQL\"`\n- Preloading componentized extensions via `--load-extension \u003cname\u003e` (repeat for multiple extensions); this issues a `LOAD \u003cname\u003e` statement before user SQL runs\n- Interactive REPL with `.help`, `.exit`, and `.quit`\n\nResult sets are rendered in a text table that mirrors the native DuckDB shell.\n\n### WIT packages\n\nAll WIT interfaces live under `wit/` at the repository root. That directory\nvendors the WASI Preview 2 packages at version `0.2.6` (the latest preview\nsupported by Wasmtime `37.0.2`), along with the DuckDB-specific packages. The\ncrate-local copies under `crates/*/wit/` are generated from this canonical tree\nvia `scripts/sync-core-wit.sh` and `scripts/sync-cli-wit.sh`. Always edit the WIT\nfiles in `wit/` first, then re-run the sync scripts to propagate changes before\nbuilding.\n\nExternal extensions can depend on the definitions in `wit/duckdb-extension/`\nto stay in sync with the host runtime without having to vendor their own copies\nof the extension interfaces.\n\n### Native host runner\n\nThe `ducklink-host` crate provides a reusable Wasmtime runner that composes the CLI\nand core components along with the componentized extension loader. Build and execute it via:\n\n```bash\ncargo run -p ducklink-host --bin ducklink -- -- duckdb-cli :memory: -c \"select 42 as answer;\"\n```\n\nAdditional directories can be exposed to the CLI with `--dir /host/path::/guest/path`, and\ncustom component artifacts can be supplied with `--core-component` / `--cli-component`. The\nhost automatically preopens the current working directory as `.` so relative database paths\ncontinue to work.\n\n### Extension components\n\nDuckDB’s extension loader is in the process of resolving WebAssembly components from `artifacts/extensions/`. When an extension registers itself with the core component, the name is sanitized to `[A-Za-z0-9_-]` and mapped to `\u003cname\u003e.wasm` inside that directory. As the loader matures, dropping a compiled extension there will allow `LOAD \u003cname\u003e` to instantiate it through the preview2 runtime rather than the native shared-library path.\n\nThis repository ships a minimal sample extension under `extensions/sample-extension-component/` that exercises the component interface. You can build and validate it end-to-end via:\n\n```bash\nmake smoke-extension\n```\n\nThe target runs the `ducklink-host` test `load_sample_extension_component`, which:\n\n1. Builds the sample extension (if it is not already present).\n2. Copies the resulting component to `artifacts/extensions/sample_extension.wasm`.\n3. Instantiates it with Wasmtime using the preview2 bindings and asserts that `load()` returns the expected metadata.\n\n## Testing\n\nCurrently the project does not ship a full integration test suite because executing the components requires a preview2 runtime plus a wasm32-wasi build of DuckDB. Manual smoke testing can be done after building:\n\n```bash\nwasmtime component run artifacts/duckdb-cli.wasm --dir . -- in_memory_db.duckdb -c \"select 42 as answer;\"\n```\n\nThere are also convenience targets:\n\n```bash\nmake smoke-cli            # :memory: query via scripts/smoke-cli.sh\nmake smoke-cli-disk       # same but forces an on-disk temp database\nmake sample-extension     # builds the sample component and copies it to artifacts/extensions/\nmake smoke-extension      # runs Cargo test to build + load the sample extension component\n```\n\nTo validate the preview2 filesystem adapter against real storage outside of `make`, set `ON_DISK_SMOKE=1` when running `scripts/smoke-cli.sh`; the helper will create a temporary on-disk database, grant Wasmtime access to that directory, and delete it after the query completes.\n\nContinuous smoke coverage runs in CI via `.github/workflows/smoke-tests.yml`, which builds the components and executes both the in-memory and on-disk runs of `scripts/smoke-cli.sh` on every push and pull request.\n\n### Running CI locally with act\n\nUntil hosted Actions are available (public repo / billing), the same workflow can\nrun locally with [nektos/act](https://github.com/nektos/act) in Docker:\n\n```bash\nbrew install act          # one-time (Docker must be running)\nmake ci-local             # runs .github/workflows/smoke-tests.yml\nscripts/ci-local.sh -l    # list jobs without running\n```\n\n`.actrc` maps `ubuntu-latest` to `catthehacker/ubuntu:act-latest` and enables\n`--reuse` so caches persist between runs. The wasi-sdk download in the workflow\nis architecture-aware (`x86_64`/`arm64`), so it runs natively under act on Apple\nsilicon as well as on GitHub's x86_64 runners. The first run is slow (it pulls\nthe runner image, compiles the component tooling, and builds the patched DuckDB\narchive); afterwards the cached archive makes runs fast.\n\n## Database interface\n\nBeyond `execute` / `open-stream`, the `database` interface exposes:\n\n- **Prepared statements** — `prepare(conn, sql)` returns a reusable\n  `prepared-statement` resource; `execute(params)` binds positional parameters\n  (`$1`, `$2`, ...) and runs it, rebinding from scratch each call.\n- **Configuration** — `open-with-config(path, options)` opens a database applying\n  `(name, value)` options (e.g. `access_mode`, `default_order`, `max_memory`).\n- **Arrow** — `query-arrow(conn, sql)` returns the result as an Arrow IPC stream\n  (`list\u003cu8\u003e`), decodable by any Arrow implementation (apache-arrow in JS,\n  arrow-rs in Rust). Zero-copy is not possible across the component boundary, so\n  buffers are serialized once into IPC bytes.\n\n## Next steps\n\n- Flesh out remaining CLI scripting parity with the native shell\n- Resolve GitHub Actions billing so the smoke-tests workflow can run\n\n## Acknowledgments\n\nThis project owes a clear debt to [Simon Willison](https://simonwillison.net/)\nand [`sqlite-utils`](https://sqlite-utils.datasette.io/). The extension catalog,\nthe scaffold → smoke → feedback tooling loop, and much of the CLI ergonomics here\nfollow the patterns Simon established with `sqlite-utils` and the wider Datasette\necosystem for making a database pleasant to extend and script from the command\nline. Many of the component extensions also mirror utilities first popularized in\nthat ecosystem. Thank you.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftegmentum%2Fducklink","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftegmentum%2Fducklink","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftegmentum%2Fducklink/lists"}