{"id":34801048,"url":"https://github.com/lance-format/lance-duckdb","last_synced_at":"2026-01-27T12:15:05.883Z","repository":{"id":308984060,"uuid":"1034604295","full_name":"lance-format/lance-duckdb","owner":"lance-format","description":"The lance extensions for DuckDB enable reading and writing of lance tables.","archived":false,"fork":false,"pushed_at":"2025-12-29T16:14:39.000Z","size":3077,"stargazers_count":35,"open_issues_count":5,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-12-31T03:08:35.404Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/lance-format.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":"AGENTS.md","dco":null,"cla":null}},"created_at":"2025-08-08T16:59:58.000Z","updated_at":"2025-12-29T16:23:41.000Z","dependencies_parsed_at":"2025-08-09T04:26:03.030Z","dependency_job_id":null,"html_url":"https://github.com/lance-format/lance-duckdb","commit_stats":null,"previous_names":["lancedb/lance-duckdb","lance-format/lance-duckdb"],"tags_count":4,"template":false,"template_full_name":null,"purl":"pkg:github/lance-format/lance-duckdb","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lance-format%2Flance-duckdb","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lance-format%2Flance-duckdb/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lance-format%2
Flance-duckdb/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lance-format%2Flance-duckdb/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/lance-format","download_url":"https://codeload.github.com/lance-format/lance-duckdb/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lance-format%2Flance-duckdb/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28400124,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-13T14:36:09.778Z","status":"ssl_error","status_checked_at":"2026-01-13T14:35:19.697Z","response_time":56,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-12-25T11:38:39.116Z","updated_at":"2026-01-13T20:43:54.548Z","avatar_url":"https://github.com/lance-format.png","language":"C++","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Lance DuckDB Extension\n\n[Lance](https://github.com/lance-format/lance/) is a modern columnar data format optimized for ML/AI workloads, with native cloud storage support. 
This extension will make `Lance` the best file/table/lakehouse format on DuckDB.\n\n## Install\n\n### Install from DuckDB Community Extensions (recommended)\n\nIf you just want to use the extension, install it directly from DuckDB's community extensions repository:\n\n```sql\nINSTALL lance FROM community;\nLOAD lance;\n\nSELECT *\n  FROM 'path/to/dataset.lance'\n  LIMIT 1;\n```\n\nSee DuckDB's extension page for `lance` for the latest release details: https://duckdb.org/community_extensions/extensions/lance\n\n### Build from source (development)\n\nThis repository focuses on source builds for development and CI.\n\n1. Initialize submodules:\n\n```bash\ngit submodule update --init --recursive\n```\n\n2. Build:\n\n```bash\nGEN=ninja make release\n```\n\n3. Load the extension from a standalone DuckDB binary (local builds typically require unsigned extensions):\n\n```bash\nduckdb -unsigned -c \"LOAD 'build/release/extension/lance/lance.duckdb_extension'; SELECT 1;\"\n```\n\n## Usage\n\n- Full SQL reference: [`docs/sql.md`](./docs/sql.md)\n- Cloud storage reference: [`docs/cloud.md`](./docs/cloud.md)\n\n### Query a Lance dataset\n\n```sql\n-- local file\nSELECT *\n  FROM 'path/to/dataset.lance'\n  LIMIT 10;\n-- s3\nSELECT *\n  FROM 's3://bucket/path/to/dataset.lance'\n  LIMIT 10;\n```\n\nTo access object store URIs (e.g. `s3://...`), configure a `TYPE LANCE` secret (see [`docs/cloud.md`](./docs/cloud.md)).\n\n```sql\nCREATE SECRET (\n  TYPE LANCE,\n  PROVIDER credential_chain,\n  SCOPE 's3://bucket/'\n);\n\nSELECT *\n  FROM 's3://bucket/path/to/dataset.lance'\n  LIMIT 10;\n```\n\n### Write a Lance dataset\n\nUse DuckDB's `COPY ... 
TO ...` to materialize query results as a Lance dataset.\n\n```sql\n-- Create/overwrite a Lance dataset from a query\nCOPY (\n  SELECT 1::BIGINT AS id, 'a'::VARCHAR AS s\n  UNION ALL\n  SELECT 2::BIGINT AS id, 'b'::VARCHAR AS s\n) TO 'path/to/out.lance' (FORMAT lance, mode 'overwrite');\n\n-- Read it back via the replacement scan\nSELECT count(*) FROM 'path/to/out.lance';\n\n-- Append more rows to an existing dataset\nCOPY (\n  SELECT 3::BIGINT AS id, 'c'::VARCHAR AS s\n) TO 'path/to/out.lance' (FORMAT lance, mode 'append');\n\n-- Optionally create an empty dataset (schema only)\nCOPY (\n  SELECT 1::BIGINT AS id, 'x'::VARCHAR AS s\n  LIMIT 0\n) TO 'path/to/empty.lance' (FORMAT lance, mode 'overwrite', write_empty_file true);\n```\n\nTo write to `s3://...` paths, configure a `TYPE LANCE` secret for that scope (see [`docs/cloud.md`](./docs/cloud.md)).\n\n```sql\nCREATE SECRET (\n  TYPE LANCE,\n  PROVIDER credential_chain,\n  SCOPE 's3://bucket/'\n);\n\nCOPY (SELECT 1 AS id) TO 's3://bucket/path/to/out.lance' (FORMAT lance, mode 'overwrite');\n```\n\n### Create a Lance dataset via `CREATE TABLE` (directory namespace)\n\nWhen you `ATTACH` a directory as a Lance namespace, you can create new datasets using `CREATE TABLE` (schema-only)\nor `CREATE TABLE AS SELECT` (CTAS). 
The dataset is written to `\u003cnamespace_root\u003e/\u003ctable_name\u003e.lance`.\n\n```sql\nATTACH 'path/to/dir' AS lance_ns (TYPE LANCE);\n\n-- Schema-only (creates an empty dataset)\nCREATE TABLE lance_ns.main.my_empty (id BIGINT, s VARCHAR);\n\n-- CTAS (writes query results)\nCREATE TABLE lance_ns.main.my_dataset AS\n  SELECT 1::BIGINT AS id, 'a'::VARCHAR AS s\n  UNION ALL\n  SELECT 2::BIGINT AS id, 'b'::VARCHAR AS s;\n\nSELECT count(*) FROM lance_ns.main.my_dataset;\n```\n\n### Vector search\n\n```sql\n-- Search a vector column, returning distances in `_distance` (smaller is closer)\nSELECT id, label, _distance\nFROM lance_vector_search('path/to/dataset.lance', 'vec', [0.1, 0.2, 0.3, 0.4]::FLOAT[4],\n                         k = 5, prefilter = true)\nORDER BY _distance ASC;\n```\n\nSee the SQL reference for full parameter documentation: [docs/sql.md#search](docs/sql.md#search).\n\n### Full-text search (FTS)\n\n```sql\n-- Search a text column, returning BM25-like scores in `_score`\nSELECT id, text, _score\nFROM lance_fts('path/to/dataset.lance', 'text', 'puppy', k = 10, prefilter = true)\nORDER BY _score DESC;\n```\n\nSee the SQL reference for full parameter documentation: [docs/sql.md#search](docs/sql.md#search).\n\n### Hybrid search (vector + FTS)\n\n```sql\n-- Combine vector and text scores, returning `_hybrid_score` in addition to `_distance` / `_score`\nSELECT id, _hybrid_score, _distance, _score\nFROM lance_hybrid_search('path/to/dataset.lance',\n                         'vec', [0.1, 0.2, 0.3, 0.4]::FLOAT[4],\n                         'text', 'puppy',\n                         k = 10, prefilter = false,\n                         alpha = 0.5, oversample_factor = 4)\nORDER BY _hybrid_score DESC;\n```\n\nSee the SQL reference for full parameter documentation: [docs/sql.md#search](docs/sql.md#search).\n\n## Contributing\n\nIssues and PRs are welcome. 
High-impact areas include pushdown, parallelism/performance, type coverage, and better diagnostics.\n\n## License\n\nApache License 2.0.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flance-format%2Flance-duckdb","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flance-format%2Flance-duckdb","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flance-format%2Flance-duckdb/lists"}