{"id":14247113,"url":"https://github.com/duckdb/pg_duckdb","last_synced_at":"2025-05-14T05:12:02.822Z","repository":{"id":252101944,"uuid":"775009948","full_name":"duckdb/pg_duckdb","owner":"duckdb","description":"DuckDB-powered Postgres for high performance apps \u0026 analytics.","archived":false,"fork":false,"pushed_at":"2025-05-09T15:07:37.000Z","size":5494,"stargazers_count":2216,"open_issues_count":57,"forks_count":101,"subscribers_count":23,"default_branch":"main","last_synced_at":"2025-05-10T23:04:01.194Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/duckdb.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2024-03-20T15:46:12.000Z","updated_at":"2025-05-10T14:27:27.000Z","dependencies_parsed_at":"2024-09-15T23:28:55.348Z","dependency_job_id":"eafbbb77-48f6-4ee7-b0b8-48e08affe2db","html_url":"https://github.com/duckdb/pg_duckdb","commit_stats":{"total_commits":319,"total_committers":19,"mean_commits":"16.789473684210527","dds":0.7084639498432601,"last_synced_commit":"3b34beeaca0aca6aea2a4f1707991f8ad3b154a9"},"previous_names":["duckdb/pg_duckdb"],"tags_count":4,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/duckdb%2Fpg_duckdb","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/duckdb%2Fpg_duckdb/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/duckdb%2Fpg_duckdb/releases
","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/duckdb%2Fpg_duckdb/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/duckdb","download_url":"https://codeload.github.com/duckdb/pg_duckdb/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254076850,"owners_count":22010611,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-21T23:00:41.905Z","updated_at":"2025-05-14T05:11:57.811Z","avatar_url":"https://github.com/duckdb.png","language":"C++","readme":"\u003cp align=\"center\"\u003e\n    \u003cpicture\u003e\n        \u003csource media=\"(prefers-color-scheme: dark)\" srcset=\"logo-dark.svg\"\u003e\n        \u003cimg width=\"800\" src=\"logo-light.svg\" alt=\"pg_duckdb logo\" /\u003e\n    \u003c/picture\u003e\n\u003c/p\u003e\n\u003cp align=\"center\"\u003e\n\t0.3.0 release is here 🎉\u003cbr /\u003e\n\tPlease \u003ca href=\"#try-it-out\"\u003etry\u003c/a\u003e it out!\n\u003c/p\u003e\n\n# pg_duckdb: Official Postgres extension for DuckDB\n\npg_duckdb is a Postgres extension that embeds DuckDB's columnar-vectorized analytics engine and features into Postgres. We recommend using pg_duckdb to build high performance analytics and data-intensive applications.\n\npg_duckdb was developed in collaboration with our partners, [Hydra][] and [MotherDuck][].\n\n## Try It Out\n\nAn easy way to try pg_duckdb is using the Hydra python package. 
Try now locally or deploy to the cloud:\n\n```\npip install hydra-cli\nhydra\n```\n\n## Features\n\nSee our [official documentation][docs] for further details.\n\n- `SELECT` queries executed by the DuckDB engine can directly read Postgres tables. (If you only query Postgres tables you need to run `SET duckdb.force_execution TO true`, see the **IMPORTANT** section below for details)\n\t- Able to read [data types](https://www.postgresql.org/docs/current/datatype.html) that exist in both Postgres and DuckDB. The following data types are supported: numeric, character, binary, date/time, boolean, uuid, json, domain, and arrays.\n\t- If DuckDB cannot support the query for any reason, execution falls back to Postgres.\n- Read and Write support for object storage (AWS S3, Azure, Cloudflare R2, or Google GCS):\n\t- Read parquet, CSV and JSON files:\n\t\t- `SELECT * FROM read_parquet('s3://bucket/file.parquet')`\n\t\t- `SELECT r['id'], r['name'] FROM read_csv('s3://bucket/file.csv') r`\n\t\t- `SELECT count(*) FROM read_json('s3://bucket/file.json')`\n\t\t- You can pass globs and arrays to these functions, just like in DuckDB\n\t- Enable the DuckDB Iceberg extension using `SELECT duckdb.install_extension('iceberg')` and read Iceberg files with `iceberg_scan`.\n\t- Enable the DuckDB Delta extension using `SELECT duckdb.install_extension('delta')` and read Delta files with `delta_scan`.\n\t- Write a query — or an entire table — to parquet in object storage.\n\t\t- `COPY (SELECT foo, bar FROM baz) TO 's3://...'`\n\t\t- `COPY table TO 's3://...'`\n\t\t- Read and write to Parquet format in a single query\n\n\t\t\t```sql\n\t\t\tCOPY (\n\t\t\t\tSELECT count(*), r['name']\n\t\t\t\tFROM read_parquet('s3://bucket/file.parquet') r\n\t\t\t\tGROUP BY name\n\t\t\t\tORDER BY count DESC\n\t\t\t) TO 's3://bucket/results.parquet';\n\t\t\t```\n- Read and Write support for data stored in MotherDuck\n- Query and `JOIN` data in object storage/MotherDuck with Postgres tables, views, and 
materialized views.\n- Create temporary tables in DuckDB's columnar storage format using `CREATE TEMP TABLE ... USING duckdb`.\n- Install DuckDB extensions using `SELECT duckdb.install_extension('extension_name');`\n- Toggle DuckDB execution on/off with a setting:\n\t- `SET duckdb.force_execution = true|false`\n- Cache remote objects locally for faster execution using `SELECT duckdb.cache('path', 'type');` where\n\t- 'path' is an HTTPFS/S3/GCS/R2 remote object\n\t- 'type' specifies the remote object type: 'parquet' or 'csv'\n\n## Installation\n\n### Docker\n\nDocker images are [available on Dockerhub](https://hub.docker.com/r/pgduckdb/pgduckdb) and are based on the official Postgres image. Use of this image is [the same as the Postgres image](https://hub.docker.com/_/postgres/). For example, you can run the image directly:\n\n```shell\ndocker run -d -e POSTGRES_PASSWORD=duckdb pgduckdb/pgduckdb:16-main\n```\n\nAnd with MotherDuck, you only need a [MotherDuck access token][md-access-token] and then it is as simple as:\n```shell\n$ export MOTHERDUCK_TOKEN=\u003cyour personal MD token\u003e\n$ docker run -d -e POSTGRES_PASSWORD=duckdb -e MOTHERDUCK_TOKEN pgduckdb/pgduckdb:16-main\n```\n\nOr you can use the Docker Compose file in this repo:\n\n```shell\ngit clone https://github.com/duckdb/pg_duckdb \u0026\u0026 cd pg_duckdb \u0026\u0026 docker compose up -d\n```\n\nOnce started, connect to the database using psql:\n\n```shell\npsql postgres://postgres:duckdb@127.0.0.1:5432/postgres\n# Or if using docker compose\ndocker compose exec db psql\n```\n\nFor other uses, see our [Docker specific README][docker readme].\n\n[docker readme]: https://github.com/duckdb/pg_duckdb/blob/main/docker/README.md\n\n### pgxman (apt)\n\nPre-built apt binaries are [available via pgxman](https://pgx.sh/pg_duckdb). 
After installation, you will need to add pg_duckdb to `shared_preload_libraries` and create the extension.\n\n```shell\npgxman install pg_duckdb\n```\n\nNote: due to the use of `shared_preload_libraries`, pgxman's container support is not currently compatible with pg_duckdb.\n\n### Compile from source\n\nTo build pg_duckdb, you need:\n\n* Postgres 14-17\n* Ubuntu 22.04-24.04 or macOS\n* Standard set of build tools for building Postgres extensions\n* [Build tools that are required to build DuckDB](https://duckdb.org/docs/dev/building/build_instructions)\n\nTo build and install, run:\n\n```sh\nmake install\n```\n\nAdd `pg_duckdb` to the `shared_preload_libraries` in your `postgresql.conf` file:\n\n```ini\nshared_preload_libraries = 'pg_duckdb'\n```\n\nNext, create the `pg_duckdb` extension:\n\n```sql\nCREATE EXTENSION pg_duckdb;\n```\n\n**IMPORTANT:** DuckDB execution is usually enabled automatically when needed. It's enabled whenever you use DuckDB functions (such as `read_csv`), when you query DuckDB tables, and when running `COPY table TO 's3://...'`. However, if you want queries that only touch Postgres tables to use DuckDB execution, you need to run `SET duckdb.force_execution TO true`. This feature is _opt-in_ to avoid breaking existing queries. To avoid doing that for every session, you can configure it for a certain user by running `ALTER USER my_analytics_user SET duckdb.force_execution TO true`.\n\n## Getting Started\n\nSee our [official documentation][docs] for more usage information.\n\npg_duckdb relies on DuckDB's vectorized execution engine to read and write data to an object storage bucket (AWS S3, Azure, Cloudflare R2, or Google GCS) and/or MotherDuck. 
The following two sections describe how to get started with these destinations.\n\n### Object storage bucket (AWS S3, Azure, Cloudflare R2, or Google GCS)\n\nQuerying data stored in Parquet, CSV, JSON, Iceberg and Delta formats can be done with `read_parquet`, `read_csv`, `read_json`, `iceberg_scan` and `delta_scan`, respectively.\n\n1. Add a credential to enable DuckDB's httpfs support.\n\n\t```sql\n\t-- Session Token is Optional\n\tINSERT INTO duckdb.secrets\n\t(type, key_id, secret, session_token, region)\n\tVALUES ('S3', 'access_key_id', 'secret_access_key', 'session_token', 'us-east-1');\n\t```\n\n2. Copy data directly to your bucket - no ETL pipeline!\n\n\t```sql\n\tCOPY (SELECT user_id, item_id, price, purchased_at FROM purchases)\n\tTO 's3://your-bucket/purchases.parquet';\n\t```\n\n3. Perform analytics on your data.\n\n\t```sql\n\tSELECT SUM(r['price']) AS total, r['item_id']\n\tFROM read_parquet('s3://your-bucket/purchases.parquet') r\n\tGROUP BY item_id\n\tORDER BY total DESC\n\tLIMIT 100;\n\t```\n\nNote: for Azure, you can store a secret using the `connection_string` parameter as follows:\n```sql\nINSERT INTO duckdb.secrets\n(type, connection_string)\nVALUES ('Azure', '\u003cyour connection string\u003e');\n```\n\nNote: writes to Azure are not yet supported; please see [the current discussion](duckdb/duckdb_azure#44) for more information.\n\n### Connect with MotherDuck\n\n`pg_duckdb` also integrates with [MotherDuck][md].\nTo enable this support you first need to [generate an access token][md-access-token].\nThen you can enable it by simply using the `enable_motherduck` convenience method:\n\n```sql\n-- If not provided, the token will be read from the `motherduck_token` environment variable\n-- If not provided, the default MD database name is `my_db`\nSELECT duckdb.enable_motherduck('\u003coptional token\u003e', '\u003coptional MD database name\u003e');\n```\n\nRead more [here][md-docs] about MotherDuck integration.\n\nYou can now create tables in the MotherDuck 
database by using the `duckdb` [Table Access Method][tam] like this:\n```sql\nCREATE TABLE orders(id bigint, item text, price NUMERIC(10, 2)) USING duckdb;\nCREATE TABLE users_md_copy USING duckdb AS SELECT * FROM users;\n```\n\n[tam]: https://www.postgresql.org/docs/current/tableam.html\n[md-docs]: docs/motherduck.md\n\nAny tables that you already had in MotherDuck are automatically available in Postgres. Since DuckDB and MotherDuck allow accessing multiple databases from a single connection and Postgres does not, we map database+schema in DuckDB to a schema name in Postgres.\n\nThis is done in the following way:\n1. Each schema in your default MotherDuck database (see above on how to specify which database is the default) is simply merged with the Postgres schema of the same name.\n2. Except for the `main` DuckDB schema in your default database, which is merged with the Postgres `public` schema.\n3. Tables in other databases are put into dedicated DuckDB-only schemas. These schemas are of the form `ddb$\u003cduckdb_db_name\u003e$\u003cduckdb_schema_name\u003e` (including the literal `$` characters).\n4. Except for the `main` schema in those other databases. 
That schema should be accessed using the shorter name `ddb$\u003cdb_name\u003e` instead.\n\nAn example of each of these cases is shown below:\n\n```sql\nINSERT INTO my_table VALUES (1, 'abc'); -- inserts into my_db.main.my_table\nINSERT INTO your_schema.tab1 VALUES (1, 'abc'); -- inserts into my_db.your_schema.tab1\nSELECT COUNT(*) FROM ddb$my_shared_db.aggregated_order_data; -- reads from my_shared_db.main.aggregated_order_data\nSELECT COUNT(*) FROM ddb$sample_data$hn.hacker_news; -- reads from sample_data.hn.hacker_news\n```\n\n[md]: https://motherduck.com/\n[md-access-token]: https://motherduck.com/docs/key-tasks/authenticating-and-connecting-to-motherduck/authenticating-to-motherduck/#authentication-using-an-access-token\n\n## Roadmap\n\nPlease see the [project milestones][milestones] for upcoming planned tasks and features.\n\n## Contributing\n\npg_duckdb was developed in collaboration with our partners, [Hydra][] and [MotherDuck][]. We look forward to their continued contributions and leadership.\n\n[Hydra][] is a Y Combinator-backed database company, focused on DuckDB-Powered Postgres for app developers.\n\n[MotherDuck][] is the cloud-based data warehouse that extends the power of DuckDB.\n\nWe welcome all contributions big and small:\n\n- [Vote on or suggest features][discussions] for our roadmap.\n- [Open a PR][prs].\n- [Submit a feature request or bug report][issues].\n- [Improve the docs][docs].\n\n## Resources\n\n- [Read the pg_duckdb documentation][docs].\n- Please see the [project milestones][milestones] for upcoming planned tasks and features.\n- [GitHub Issues][issues] for bug reports\n- [Join the DuckDB Discord community](https://discord.duckdb.org/) then chat in [the #pg_duckdb channel](https://discord.com/channels/909674491309850675/1289177578237857802).\n\n[milestones]: https://github.com/duckdb/pg_duckdb/milestones\n[discussions]: https://github.com/duckdb/pg_duckdb/discussions\n[prs]: https://github.com/duckdb/pg_duckdb/pulls\n[issues]: 
https://github.com/duckdb/pg_duckdb/issues\n[Hydra]: https://hydra.so/\n[Motherduck]: https://motherduck.com/\n[docs]: https://github.com/duckdb/pg_duckdb/tree/main/docs\n","funding_links":[],"categories":["C++","Client-Server Setups","\u003ca name=\"C%2B%2B\"\u003e\u003c/a\u003eC++"],"sub_categories":["Web Clients (WebAssembly)"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fduckdb%2Fpg_duckdb","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fduckdb%2Fpg_duckdb","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fduckdb%2Fpg_duckdb/lists"}