{"id":20001154,"url":"https://github.com/apecloud/myduckserver","last_synced_at":"2025-05-15T23:08:03.839Z","repository":{"id":261362041,"uuid":"845992583","full_name":"apecloud/myduckserver","owner":"apecloud","description":"Unified MySQL, Postgres \u0026 FlightSQL Server, Powered by DuckDB.","archived":false,"fork":false,"pushed_at":"2025-01-17T09:04:02.000Z","size":5942,"stargazers_count":423,"open_issues_count":36,"forks_count":19,"subscribers_count":10,"default_branch":"main","last_synced_at":"2025-04-08T11:08:03.377Z","etag":null,"topics":["analytics","arrow","business-analytics","business-intelligence","columnar-storage","data-engineering","data-science","database","duckdb","htap","mariadb","mysql","olap","pandas","parquet","polars","postgres","replication","sql","zero-etl"],"latest_commit_sha":null,"homepage":"https://myduck.io/TODO","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/apecloud.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-08-22T10:28:34.000Z","updated_at":"2025-04-08T02:30:39.000Z","dependencies_parsed_at":"2024-11-20T13:34:58.561Z","dependency_job_id":"0bf4af5f-e8e8-4f59-88cb-cfabbb5d7fe4","html_url":"https://github.com/apecloud/myduckserver","commit_stats":null,"previous_names":["apecloud/myduckserver"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/apecloud%2Fmyduckserver","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/apecloud%2Fmyduckserver/tags","releas
es_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/apecloud%2Fmyduckserver/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/apecloud%2Fmyduckserver/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/apecloud","download_url":"https://codeload.github.com/apecloud/myduckserver/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254436949,"owners_count":22070947,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["analytics","arrow","business-analytics","business-intelligence","columnar-storage","data-engineering","data-science","database","duckdb","htap","mariadb","mysql","olap","pandas","parquet","polars","postgres","replication","sql","zero-etl"],"created_at":"2024-11-13T05:16:51.717Z","updated_at":"2025-05-15T23:07:57.788Z","avatar_url":"https://github.com/apecloud.png","language":"Go","readme":"\u003ch1 style=\"display: flex; align-items: center;\"\u003e\n    \u003cimg width=\"50\" alt=\"duck under dolphin\" style=\"margin-right: 0.2em\" src=\"logo/myduck-logo.png\"\u003e\n    \u003cspan\u003eMyDuck Server\u003c/span\u003e\n\u003c/h1\u003e\n\n**MyDuck Server** unlocks serious power for your MySQL \u0026 Postgres analytics. Imagine the simplicity of (MySQL|Postgres)’s familiar interface fused with the raw analytical speed of [DuckDB](https://duckdb.org/). 
Now you can supercharge your analytical queries with DuckDB’s lightning-fast OLAP engine, all while using the tools and dialect you know.\n\n\u003ch1 style=\"display: flex; align-items: center;\"\u003e\n    \u003cimg alt=\"duck under dolphin\" style=\"margin-right: 0.2em\" src=\"logo/MyDuck.svg\"\u003e\n\u003c/h1\u003e\n\n## 📑 Table of Contents\n\n- [Why MyDuck](#-why-myduck-)\n- [Key Features](#-key-features)\n- [Performance](#-performance)\n- [Getting Started](#-getting-started)\n  - [Prerequisites](#prerequisites)\n  - [Installation](#installation)\n  - [Usage](#usage)\n  - [Replicating Data](#replicating-data)\n  - [Connecting to Cloud MySQL \u0026 Postgres](#connecting-to-cloud-mysql--postgres)\n  - [HTAP Setup](#htap-setup)\n  - [Customizing the Docker Container](#customizing-the-docker-container)\n  - [Query Parquet Files](#query-parquet-files)\n  - [Already Using DuckDB?](#already-using-duckdb)\n  - [Backup and Restore with Object Storage](#backup-and-restore-with-object-storage)\n  - [LLM Integration](#llm-integration)\n  - [Access from Python](#access-from-python)\n- [Roadmap](#-roadmap)\n- [Contributing](#-contributing)\n- [Acknowledgements](#-acknowledgements)\n- [License](#-license)\n\n## ❓ Why MyDuck ❓\n\nWhile MySQL and Postgres are the most popular open-source databases for OLTP, their performance in analytics often falls short. DuckDB, on the other hand, is built for fast, embedded analytical processing. 
MyDuck Server lets you enjoy DuckDB's high-speed analytics without leaving the (MySQL|Postgres) ecosystem.\n\nWith MyDuck Server, you can:\n\n- **Set up an isolated, fast, and real-time replica** dedicated to ad-hoc analytics, batch jobs, and LLM-generated queries, without exhausting or corrupting your primary database 🔥\n- **Accelerate existing MySQL \u0026 Postgres analytics** to new heights through DuckDB's high-speed engine with minimal changes 🚀\n- **Enable richer \u0026 faster connectivity** between modern data manipulation \u0026 analysis tools and your MySQL \u0026 Postgres data 🛠️\n- **Go beyond MySQL \u0026 Postgres syntax** with DuckDB's advanced SQL features to expand your analytics potential 🦆\n- **Run DuckDB in server mode** to share a DuckDB instance with your team or among your applications 🌩️\n- **Build HTAP systems** by combining (MySQL|Postgres) for transactions with MyDuck for analytics 🔄\n- and much more! See below for a full list of feature highlights.\n\nMyDuck Server isn't here to replace MySQL \u0026 Postgres; it's here to help MySQL \u0026 Postgres users do more with their data. This open-source project provides a convenient way to integrate high-speed analytics into your workflow while embracing the flexibility and efficiency of DuckDB.\n\n## ✨ Key Features\n\n\n- **Blazing Fast OLAP with DuckDB**: MyDuck stores data in DuckDB, an OLAP-optimized database known for lightning-fast analytical queries. DuckDB enables MyDuck to execute queries up to 1000x faster than traditional MySQL \u0026 Postgres setups, making complex analytics practical where they were previously infeasible.\n\n- **MySQL-Compatible Interface**: MyDuck implements the MySQL wire protocol and understands MySQL syntax, allowing you to connect with any MySQL client and run MySQL-style SQL. 
MyDuck automatically translates your queries and executes them in DuckDB.\n\n- **Postgres-Compatible Interface**: MyDuck implements the Postgres wire protocol, enabling you to send DuckDB SQL directly using any Postgres client. Since DuckDB's SQL dialect [closely resembles PostgreSQL](https://duckdb.org/docs/sql/dialect/postgresql_compatibility.html), you can speed up existing Postgres queries with minimal changes.\n\n- **Raw DuckDB Power**: MyDuck provides full access to DuckDB's analytical capabilities through raw DuckDB SQL, including [friendly SQL syntax](https://duckdb.org/docs/sql/dialect/friendly_sql.html), [advanced aggregates](https://duckdb.org/docs/sql/functions/aggregates), [remote data source access](https://duckdb.org/docs/data/data_sources), [nested data types](https://duckdb.org/docs/sql/data_types/overview#nested--composite-types), and more.\n\n- **Zero-ETL**: Simply start replication and begin querying! MyDuck can function as a MySQL replica or Postgres standby, replicating data from your primary server in real-time. It works like standard MySQL \u0026 Postgres replication - using MySQL's `START REPLICA` or Postgres' `CREATE SUBSCRIPTION` commands, eliminating the need for complex ETL pipelines.\n\n- **Consistent and Efficient Replication**: Thanks to DuckDB's [solid ACID support](https://duckdb.org/2024/09/25/changing-data-with-confidence-and-acid.html), we've carefully managed transaction boundaries in the replication stream to ensure a **consistent data view** — you'll never see dirty data mid-transaction. Plus, MyDuck's **transaction batching** collects updates from multiple transactions and applies them to DuckDB in batches, significantly reducing write overhead (since DuckDB isn’t designed for high-frequency OLTP writes).\n\n- **HTAP Architecture Support**: MyDuck works well with database proxy tools to enable hybrid transactional/analytical processing setups. 
You can route DML operations to (MySQL|Postgres) and analytical queries to MyDuck, creating a powerful HTAP architecture that combines the best of both worlds.\n\n- **Bulk Upload \u0026 Download**: MyDuck supports fast bulk data loading from the client side with the standard MySQL `LOAD DATA LOCAL INFILE` command or the  PostgreSQL `COPY FROM STDIN` command. You can also extract data from MyDuck using the PostgreSQL `COPY TO STDOUT` command.\n\n- **End-to-End Columnar IO**: In addition to the traditional row-oriented data transfer in MySQL \u0026 Postgres protocol, MyDuck can also send query results and receive data uploads in columnar format, which can be significantly faster for high-volume data. This is implemented on top of the standard Postgres `COPY` protocol with extended columnar format support, e.g., `COPY ... TO STDOUT (FORMAT parquet | arrow)`, allowing you to use the standard Postgres client library to interact with MyDuck in an optimized way.\n\n- **Standalone Mode**: MyDuck can run in standalone mode without replication. In this mode, it is a drop-in replacement for (MySQL|Postgres), but with a DuckDB heart. You can `CREATE TABLE`, transactionally `INSERT`, `UPDATE`, and `DELETE` data, and run blazingly fast `SELECT` queries.\n\n- **DuckDB in Server Mode**: If you aren't interested in MySQL \u0026 Postgres but just want to share a DuckDB instance with your team or among your applications, MyDuck is also a great solution. You can deploy MyDuck to a server, connect to it with the Postgres client library in your favorite programming language, and start running DuckDB SQL queries directly.\n\n- **Seamless Integration with Dump \u0026 Copy Utilities**: MyDuck plays well with modern MySQL \u0026 Postgres data migration tools, especially the [MySQL Shell](https://dev.mysql.com/doc/mysql-shell/en/) and [pg_dump](https://www.postgresql.org/docs/current/app-pgdump.html). 
For MySQL, you can load data into MyDuck in parallel from a MySQL Shell dump, or leverage the Shell’s `copy-instance` utility to copy a consistent snapshot of your running MySQL server to MyDuck. For Postgres, MyDuck can load data from a `pg_dump` archive.\n\n## 📊 Performance\n\nTypical OLAP queries can run **up to 1000x faster** with MyDuck Server compared to MySQL \u0026 Postgres alone, especially on large datasets. Under the hood, it's just DuckDB doing what it does best: processing analytical queries at lightning speed. You are welcome to run your own benchmarks and prepare to be amazed! Alternatively, you can refer to well-known benchmarks like the [ClickBench](https://benchmark.clickhouse.com/) and [H2O.ai db-benchmark](https://duckdblabs.github.io/db-benchmark/) to see how DuckDB performs against other databases and data science tools. Also remember that DuckDB has robust support for transactions, JOINs, and [larger-than-memory query processing](https://duckdb.org/2024/07/09/memory-management.html), which are unavailable in many competing systems and tools.\n\n## 🏃‍♂️ Getting Started\n\n### Prerequisites\n\n- **Docker** (recommended) for setting up MyDuck Server quickly.\n- MySQL or PostgreSQL CLI clients for connecting and testing your setup.\n\n### Installation\n\nGet a standalone MyDuck Server up and running in minutes using Docker:\n\n```bash\ndocker run -p 13306:3306 -p 15432:5432 apecloud/myduckserver:latest\n```\n\nThis setup exposes:\n\n- **Port 13306** for MySQL wire protocol connections.\n- **Port 15432** for PostgreSQL wire protocol connections, allowing direct DuckDB SQL.\n\n### Usage\n\n#### Connecting via MySQL client\n\nConnect using any MySQL client to run MySQL-style SQL queries:\n\n```bash\nmysql -h127.0.0.1 -P13306 -uroot\n```\n\n\u003e [!NOTE]\n\u003e MySQL CLI clients version 9.0 and above are not yet supported on macOS. 
Consider `brew install mysql-client@8.4`.\n\n#### Connecting via PostgreSQL client\n\nFor full analytical power, connect to the Postgres port and run DuckDB SQL queries directly:\n\n```bash\npsql -h 127.0.0.1 -p 15432 -U postgres\n```\n\n### Replicating Data\n\nWe have integrated a setup tool in the Docker image that helps replicate data from your primary (MySQL|Postgres) server to MyDuck Server. The tool is available via the `SETUP_MODE` environment variable. In `REPLICA` mode, the container will start MyDuck Server, dump a snapshot of your primary (MySQL|Postgres) server, and start replicating data in real time.\n\n\u003e [!NOTE]\n\u003e Supported primary database versions: MySQL\u003e=8.0 and PostgreSQL\u003e=13. In addition to the default settings,\n\u003e logical replication must be enabled for PostgreSQL by setting `wal_level=logical`.\n\u003e For MySQL, GTID-based replication (`gtid_mode=ON` and `enforce_gtid_consistency=ON`) is recommended but not required.\n\n```bash\ndocker run -d --name myduck \\\n  -p 13306:3306 \\\n  -p 15432:5432 \\\n  --env=SETUP_MODE=REPLICA \\\n  --env=SOURCE_DSN=\"\u003cpostgres|mysql\u003e://\u003cuser\u003e:\u003cpassword\u003e@\u003chost\u003e:\u003cport\u003e/\u003cdbname\u003e\" \\\n  apecloud/myduckserver:latest\n```\n\n`SOURCE_DSN` specifies the connection string to the primary database server, which can be either MySQL or PostgreSQL.\n\n- **MySQL Primary:** Use the `mysql` URI scheme, e.g.,  \n  `--env=SOURCE_DSN=mysql://root:password@example.com:3306`\n\n- **PostgreSQL Primary:** Use the `postgres` URI scheme, e.g.,  \n  `--env=SOURCE_DSN=postgres://postgres:password@example.com:5432/db01`\n\n\u003e [!NOTE]\n\u003e To replicate from a server running on the host machine, use `host.docker.internal` as the hostname instead of `localhost` or `127.0.0.1`. 
On Linux, you must also add `--add-host=host.docker.internal:host-gateway` to the `docker run` command.\n\n### Connecting to Cloud MySQL \u0026 Postgres\n\nMyDuck Server supports setting up replicas from common cloud-based MySQL \u0026 Postgres offerings. For more information, please refer to the [replica setup guide](docs/tutorial/replica-setup-rds.md).\n\n### HTAP Setup\n\nWith MyDuck's powerful analytics capabilities, you can create a hybrid transactional/analytical processing system where high-frequency data writes are directed to a standard MySQL or Postgres instance, while analytical queries are handled by a MyDuck Server instance. Follow our HTAP setup instructions to easily set up an HTAP demonstration:\n\n* Provisioning a MySQL HTAP cluster based on [ProxySQL](docs/tutorial/mysql-htap-proxysql-setup.md) or [MariaDB MaxScale](docs/tutorial/mysql-htap-maxscale-setup.md).\n* Provisioning a PostgreSQL HTAP cluster based on [Pgpool-II](docs/tutorial/pg-htap-pgpool-setup.md).\n\n### Customizing the Docker Container\n\nTo rename the default database, pass the `DEFAULT_DB` environment variable to the Docker container:\n\n```bash\ndocker run -d -p 13306:3306 -p 15432:5432 \\\n    --env=DEFAULT_DB=mydbname \\\n    apecloud/myduckserver:latest\n```\n\n\nTo set the superuser password, pass the `SUPERUSER_PASSWORD` environment variable to the Docker container:\n\n```bash\ndocker run -d -p 13306:3306 -p 15432:5432 \\\n    --env=SUPERUSER_PASSWORD=mysecretpassword \\\n    apecloud/myduckserver:latest\n```\n\n\nTo initialize MyDuck Server with custom SQL statements, mount your `.sql` file to either `/docker-entrypoint-initdb.d/mysql/` or `/docker-entrypoint-initdb.d/postgres/` inside the Docker container, depending on the SQL dialect you're using.\n\nFor example:\n```bash\n# Execute `init.sql` via MySQL protocol\ndocker run -d -p 13306:3306 --name=myduck \\\n    -v ./init.sql:/docker-entrypoint-initdb.d/mysql/init.sql \\\n    apecloud/myduckserver:latest\n\n# Execute 
`init.sql` via PostgreSQL protocol\ndocker run -d -p 15432:5432 --name=myduck \\\n    -v ./init.sql:/docker-entrypoint-initdb.d/postgres/init.sql \\\n    apecloud/myduckserver:latest\n```\n\n### Query Parquet Files\n\nLooking to load Parquet files into MyDuck Server and start querying? Follow our [Parquet file loading guide](docs/tutorial/load-parquet-files.md) for easy setup.\n\n### Already Using DuckDB?\n\nAlready have a DuckDB file? You can seamlessly bootstrap MyDuck Server with it. See our [DuckDB file bootstrapping guide](docs/tutorial/bootstrap.md) for more details.\n\n### Managing Multiple Databases\n\nEasily manage multiple databases in MyDuck Server, same as Postgres. For step-by-step instructions and detailed guidance, check out our [Database Management Guide](docs/tutorial/manage-multiple-databases.md).\n\n### Backup and Restore with Object Storage\n\nTo back up and restore your databases inside MyDuck Server using object storage, refer to our [backup and restore guide](docs/tutorial/backup-restore.md) for detailed instructions.\n\n### LLM Integration\n\nMyDuck Server can be integrated with LLM applications via the [Model Context Protocol (MCP)](https://modelcontextprotocol.io/introduction). Follow the [MCP integration guide](docs/tutorial/mcp.md) to set up MyDuck Server as an external data source for LLMs.\n\n### Access from Python\n\nMyDuck Server can be seamlessly accessed from the Python data science ecosystem. Follow the [Python integration guide](docs/tutorial/pg-python-data-tools.md) to connect to MyDuck Server from Python and export data to PyArrow, pandas, and Polars. Additionally, check out the [Ibis integration guide](docs/tutorial/connect-with-ibis-setup.md) for using the [Ibis](https://ibis-project.org/) dataframe API to query MyDuck Server directly.\n\n## 🎯 Roadmap\n\nWe have big plans for MyDuck Server! Here are some of the features we’re working on:\n\n- [x] Arrow Flight SQL.\n- [x] Multiple DB.\n- [ ] Authentication.\n- [ ] ...and more! 
We’re always looking for ways to make MyDuck Server better. If you have a feature request, please let us know by [opening an issue](https://github.com/apecloud/myduckserver/issues/new).\n\n## 🏡 Join the Community\n\nLet's connect on [Discord](https://discord.gg/9MC5cgw5YK) to discuss requirements, address issues, and share user experiences.\n\n## 💡 Contributing\n\nMyDuck Server is open source, and we’d love your help to keep it growing! Check out our [CONTRIBUTING.md](CONTRIBUTING.md) for ways to get involved. From bug reports to feature requests, all contributions are welcome!\n\n## 💗 Acknowledgements\n\nMyDuck Server is built on top of a collection of amazing open-source projects, notably:\n- [DuckDB](https://duckdb.org/) - The fast in-process analytical database that powers MyDuck Server.\n- [go-mysql-server](https://github.com/dolthub/go-mysql-server) - The outstanding MySQL server implementation in Go maintained by [DoltHub](https://www.dolthub.com/team) that MyDuck Server is built on. 
We also draw significant inspiration from [Dolt](https://github.com/dolthub/dolt) and [Doltgres](https://github.com/dolthub/doltgres).\n- [Vitess](https://vitess.io/) - Provides the MySQL replication stream used in MyDuck Server.\n- [go-duckdb](https://github.com/marcboeker/go-duckdb) - An excellent Go driver for DuckDB that works seamlessly.\n- [SQLGlot](https://github.com/tobymao/sqlglot) - The ultimate SQL transpiler.\n\nWe are grateful to the developers and contributors of these projects for their hard work and dedication to open-source software.\n\n## 📝 License\n\nMyDuck Server is released under the [Apache License 2.0](LICENSE).\n","funding_links":[],"categories":["Client-Server Setups"],"sub_categories":["Web Clients (WebAssembly)"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fapecloud%2Fmyduckserver","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fapecloud%2Fmyduckserver","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fapecloud%2Fmyduckserver/lists"}