{"id":13478596,"url":"https://github.com/postgresml/pgcat","last_synced_at":"2025-05-13T22:03:15.280Z","repository":{"id":37021238,"uuid":"453485519","full_name":"postgresml/pgcat","owner":"postgresml","description":"PostgreSQL pooler with sharding, load balancing and failover support.","archived":false,"fork":false,"pushed_at":"2025-02-27T21:51:01.000Z","size":1606,"stargazers_count":3394,"open_issues_count":188,"forks_count":220,"subscribers_count":48,"default_branch":"main","last_synced_at":"2025-04-03T11:36:56.820Z","etag":null,"topics":["pooler","pooling","postgresql","rust"],"latest_commit_sha":null,"homepage":"","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/postgresml.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-01-29T18:34:44.000Z","updated_at":"2025-04-03T02:32:06.000Z","dependencies_parsed_at":"2023-12-04T05:26:00.122Z","dependency_job_id":"4c643e6a-6a52-4a0d-8ded-d68c7d8b1b56","html_url":"https://github.com/postgresml/pgcat","commit_stats":{"total_commits":574,"total_committers":53,"mean_commits":"10.830188679245284","dds":0.5487804878048781,"last_synced_commit":"3202f5685b2c9017b59f1374977f37aa3fd3c93a"},"previous_names":["levkk/pgcat"],"tags_count":20,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/postgresml%2Fpgcat","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/postgresml%2Fpgcat/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/postgresml%2Fpgcat/rel
eases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/postgresml%2Fpgcat/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/postgresml","download_url":"https://codeload.github.com/postgresml/pgcat/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248252661,"owners_count":21072699,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["pooler","pooling","postgresql","rust"],"created_at":"2024-07-31T16:01:59.161Z","updated_at":"2025-04-10T16:20:03.865Z","avatar_url":"https://github.com/postgresml.png","language":"Rust","funding_links":[],"categories":["Rust","\u003ca name=\"Rust\"\u003e\u003c/a\u003eRust"],"sub_categories":[],"readme":"## PgCat: Nextgen PostgreSQL Pooler\n\n[![CircleCI](https://circleci.com/gh/postgresml/pgcat/tree/main.svg?style=svg)](https://circleci.com/gh/postgresml/pgcat/tree/main)\n\u003ca href=\"https://discord.gg/DmyJP3qJ7U\" target=\"_blank\"\u003e\n    \u003cimg src=\"https://img.shields.io/discord/1013868243036930099\" alt=\"Join our Discord!\" /\u003e\n\u003c/a\u003e\n\nPostgreSQL pooler and proxy (like PgBouncer) with support for sharding, load balancing, failover and mirroring.\n\n## Features\n\n| **Feature** | **Status** | **Comments** |\n|-------------|------------|--------------|\n| Transaction pooling | **Stable** | Identical to PgBouncer with notable improvements for handling bad clients and abandoned transactions. |\n| Session pooling | **Stable** | Identical to PgBouncer. 
|\n| Multi-threaded runtime | **Stable** | Using Tokio asynchronous runtime, the pooler takes advantage of multicore machines. |\n| Load balancing of read queries | **Stable** | Queries are automatically load balanced between replicas and the primary. |\n| Failover | **Stable** | Queries are automatically rerouted around broken replicas, validated by regular health checks. |\n| Admin database statistics | **Stable** | Pooler statistics and administration via the `pgbouncer` and `pgcat` databases. |\n| Prometheus statistics | **Stable** | Statistics are reported via a HTTP endpoint for Prometheus. |\n| SSL/TLS | **Stable** | Clients can connect to the pooler using TLS. Pooler can connect to Postgres servers using TLS. |\n| Client/Server authentication | **Stable** | Clients can connect using MD5 authentication, supported by `libpq` and all Postgres client drivers. PgCat can connect to Postgres using MD5 and SCRAM-SHA-256. |\n| Live configuration reloading | **Stable** | Identical to PgBouncer; all settings can be reloaded dynamically (except `host` and `port`). |\n| Auth passthrough | **Stable** | MD5 password authentication can be configured to use an `auth_query` so no cleartext passwords are needed in the config file.|\n| Sharding using extended SQL syntax | **Experimental** | Clients can dynamically configure the pooler to route queries to specific shards. |\n| Sharding using comments parsing/Regex | **Experimental** | Clients can include shard information (sharding key, shard ID) in the query comments. |\n| Automatic sharding | **Experimental** | PgCat can parse queries, detect sharding keys automatically, and route queries to the correct shard. |\n| Mirroring | **Experimental** | Mirror queries between multiple databases in order to test servers with realistic production traffic. 
|\n\n\n## Status\n\nPgCat is stable and used in production to serve hundreds of thousands of queries per second.\n\n\u003ctable\u003e\n  \u003ctr\u003e\n    \u003ctd\u003e\n      \u003ca href=\"https://tech.instacart.com/adopting-pgcat-a-nextgen-postgres-proxy-3cf284e68c2f\"\u003e\n        \u003cimg src=\"./images/instacart.webp\" height=\"70\" width=\"auto\"\u003e\n      \u003c/a\u003e\n    \u003c/td\u003e\n    \u003ctd\u003e\n      \u003ca href=\"https://postgresml.org/blog/scaling-postgresml-to-1-million-requests-per-second\"\u003e\n        \u003cimg src=\"./images/postgresml.webp\" height=\"70\" width=\"auto\"\u003e\n      \u003c/a\u003e\n    \u003c/td\u003e\n    \u003ctd\u003e\n      \u003ca href=\"https://onesignal.com\"\u003e\n        \u003cimg src=\"./images/one_signal.webp\" height=\"70\" width=\"auto\"\u003e\n      \u003c/a\u003e\n    \u003c/td\u003e\n  \u003c/tr\u003e\n  \u003ctr\u003e\n    \u003ctd\u003e\n      \u003ca href=\"https://tech.instacart.com/adopting-pgcat-a-nextgen-postgres-proxy-3cf284e68c2f\"\u003e\n        Instacart\n      \u003c/a\u003e\n    \u003c/td\u003e\n    \u003ctd\u003e\n      \u003ca href=\"https://postgresml.org/blog/scaling-postgresml-to-1-million-requests-per-second\"\u003e\n        PostgresML\n      \u003c/a\u003e\n    \u003c/td\u003e\n    \u003ctd\u003e\n      OneSignal\n    \u003c/td\u003e\n  \u003c/tr\u003e\n\u003c/table\u003e\n\nSome features remain experimental and are being actively developed. They are optional and can be enabled through configuration.\n\n## Deployment\n\nSee `Dockerfile` for example deployment using Docker. The pooler is configured to spawn 4 workers so 4 CPUs are recommended for optimal performance. That setting can be adjusted to spawn as many (or as little) workers as needed.\n\nA Docker image is available from `docker pull ghcr.io/postgresml/pgcat:latest`. 
See our [GitHub packages repository](https://github.com/postgresml/pgcat/pkgs/container/pgcat).\n\nFor a quick local example, use the Docker Compose environment provided:\n\n```bash\ndocker-compose up\n\n# In a new terminal:\nPGPASSWORD=postgres psql -h 127.0.0.1 -p 6432 -U postgres -c 'SELECT 1'\n```\n\n### Config\n\nSee **[Configuration](https://github.com/levkk/pgcat/blob/main/CONFIG.md)**.\n\n## Contributing\n\nThe project is being actively developed and looking for additional contributors and production deployments.\n\n### Local development\n\n1. Install Rust (latest stable will work great).\n2. `cargo build --release` (to get better benchmarks).\n3. Change the config in `pgcat.toml` to fit your setup (optional given next step).\n4. Install Postgres and run `psql -f tests/sharding/query_routing_setup.sql` (user/password may be required depending on your setup).\n5. `RUST_LOG=info cargo run --release`. You're ready to go!\n\n### Tests\n\nWhen making substantial modifications to the protocol implementation, make sure to test them with pgbench:\n\n```\npgbench -i -h 127.0.0.1 -p 6432 \u0026\u0026 \\\npgbench -t 1000 -p 6432 -h 127.0.0.1 --protocol simple \u0026\u0026 \\\npgbench -t 1000 -p 6432 -h 127.0.0.1 --protocol extended\n```\n\nSee [sharding README](./tests/sharding/README.md) for sharding logic testing.\n\nAdditionally, all features are tested with Ruby, Python, and Rust unit and integration tests.\n\nRun `cargo test` to run Rust unit tests.\n\nRun the following commands to run Ruby and Python integration tests:\n\n```\ncd tests/docker/\ndocker compose up --exit-code-from main # This will also produce a coverage report under ./cov/\n```\n\n### Docker-based local development\n\nYou can open a Docker development environment where you can debug tests more easily. Run the following command to spin it up:\n\n```\n./dev/script/console\n```\n\nThis will open a terminal in an environment similar to that used in tests. 
In there, you can compile the pooler, run tests, do some debugging with the test environment, etc. Objects compiled inside the container (and bundled gems) will be placed in `dev/cache` so they don't interfere with what you have on your machine.\n\n## Usage\n\n### Session mode\nIn session mode, a client talks to one server for the duration of the connection. Prepared statements, `SET`, and advisory locks are supported. In terms of supported features, there is very little if any difference between session mode and talking directly to the server.\n\nTo use session mode, change `pool_mode = \"session\"`.\n\n### Transaction mode\nIn transaction mode, a client talks to one server for the duration of a single transaction; once it's over, the server is returned to the pool. Prepared statements, `SET`, and advisory locks are not supported; alternatives are to use `SET LOCAL` and `pg_advisory_xact_lock` which are scoped to the transaction.\n\nThis mode is enabled by default.\n\n### Load balancing of read queries\nAll queries are load balanced against the configured servers using either the random or least open connections algorithms. The most straightforward configuration example would be to put this pooler in front of several replicas and let it load balance all queries.\n\nIf the configuration includes a primary and replicas, the queries can be separated with the built-in query parser. The query parser, implemented with the `sqlparser` crate, will interpret the query and route all `SELECT` queries to a replica, while all other queries including explicit transactions will be routed to the primary.\n\n#### Query parser\nThe query parser will do its best to determine where the query should go, but sometimes that's not possible. 
In that case, the client can select which server it wants using this custom SQL syntax:\n\n```sql\n-- To talk to the primary for the duration of the next transaction:\nSET SERVER ROLE TO 'primary';\n\n-- To talk to the replica for the duration of the next transaction:\nSET SERVER ROLE TO 'replica';\n\n-- Let the query parser decide\nSET SERVER ROLE TO 'auto';\n\n-- Pick any server at random\nSET SERVER ROLE TO 'any';\n\n-- Reset to default configured settings\nSET SERVER ROLE TO 'default';\n```\n\nThe setting will persist until it's changed again or the client disconnects.\n\nBy default, all queries are routed to the first available server; the `default_role` setting controls this behavior.\n\n### Failover\nAll servers are checked with a `;` (very fast) query before being given to a client. Additionally, the server health is monitored with every client query that it processes. If the server is not reachable, it will be banned and cannot serve any more transactions for the duration of the ban. The queries are routed to the remaining servers. If all servers become banned, the ban list is cleared: this is a safety precaution against false positives. The primary can never be banned.\n\nThe ban time can be changed with `ban_time`. The default is 60 seconds.\n\n### Sharding\nWe use the `PARTITION BY HASH` hashing function, the same as used by Postgres for declarative partitioning. This allows the database to be sharded using Postgres partitions, with the partitions placed on different servers (shards). Both read and write queries can be routed to the shards using this pooler.\n\n#### Extended syntax\nTo route queries to a particular shard, we use this custom SQL syntax:\n\n```sql\n-- To talk to a shard explicitly\nSET SHARD TO '1';\n\n-- To let the pooler choose based on a value\nSET SHARDING KEY TO '1234';\n```\n\nThe active shard will last until it's changed again or the client disconnects. 
By default, the queries are routed to shard 0.\n\nFor hash function implementation, see `src/sharding.rs` and `tests/sharding/partition_hash_test_setup.sql`.\n\n\n##### ActiveRecord/Rails\n\n```ruby\nclass User \u003c ActiveRecord::Base\nend\n\n# Metadata will be fetched from shard 0\nActiveRecord::Base.establish_connection\n\n# Grab a bunch of users from shard 1\nUser.connection.execute \"SET SHARD TO '1'\"\nUser.take(10)\n\n# Using id as the sharding key\nUser.connection.execute \"SET SHARDING KEY TO '1234'\"\nUser.find_by_id(1234)\n\n# Using geographical sharding\nUser.connection.execute \"SET SERVER ROLE TO 'primary'\"\nUser.connection.execute \"SET SHARDING KEY TO '85'\"\nUser.create(name: \"test user\", email: \"test@example.com\", zone_id: 85)\n\n# Let the query parser figure out where the query should go.\n# We are still on shard = hash(85) % shards.\nUser.connection.execute \"SET SERVER ROLE TO 'auto'\"\nUser.find_by_email(\"test@example.com\")\n```\n\n##### Raw SQL\n\n```sql\n-- Grab a bunch of users from shard 1\nSET SHARD TO '1';\nSELECT * FROM users LIMIT 10;\n\n-- Find by id\nSET SHARDING KEY TO '1234';\nSELECT * FROM users WHERE id = 1234;\n\n-- Writing in a primary/replicas configuration.\nSET SERVER ROLE TO 'primary';\nSET SHARDING KEY TO '85';\nINSERT INTO users (name, email, zone_id) VALUES ('test user', 'test@example.com', 85);\n\nSET SERVER ROLE TO 'auto'; -- let the query router figure out where the query should go\nSELECT * FROM users WHERE email = 'test@example.com'; -- shard setting lasts until set again; we are reading from the primary\n```\n\n#### With comments\nIssuing queries to the pooler can cause additional latency. To reduce its impact, it's possible to include sharding information inside SQL comments sent via the query. 
This is reasonably easy to implement with ORMs like [ActiveRecord](https://api.rubyonrails.org/classes/ActiveRecord/QueryMethods.html#method-i-annotate) and [SQLAlchemy](https://docs.sqlalchemy.org/en/20/core/events.html#sql-execution-and-connection-events).\n\n```\n/* shard_id: 5 */ SELECT * FROM foo WHERE id = 1234;\n\n/* sharding_key: 1234 */ SELECT * FROM foo WHERE id = 1234;\n```\n\n#### Automatic query parsing\nPgCat can use the `sqlparser` crate to parse SQL queries and extract the sharding key. This is configurable with the `automatic_sharding_key` setting. This feature is still experimental, but it's the ideal implementation for sharding, requiring no client modifications.\n\n### Statistics reporting\n\nThe stats are very similar to what PgBouncer reports and the names are kept to be comparable. They are accessible by querying the admin database `pgcat`, and `pgbouncer` for compatibility.\n\n```\npsql -h 127.0.0.1 -p 6432 -d pgbouncer -c 'SHOW DATABASES'\n```\n\nAdditionally, Prometheus statistics are available at `/metrics` via HTTP.\n\nWe also have a [basic Grafana dashboard](https://github.com/postgresml/pgcat/blob/main/grafana_dashboard.json) based on Prometheus metrics that you can import into Grafana and either build on or use as-is for monitoring.\n\n### Live configuration reloading\n\nThe config can be reloaded by sending a `kill -s SIGHUP` to the process or by running `RELOAD` against the admin database. All settings except the `host` and `port` can be reloaded without restarting the pooler, including sharding and replicas configurations.\n\n### Mirroring\n\nMirroring allows queries to be routed to multiple databases at the same time. 
This is useful for prewarming replicas before placing them into the active configuration, or for testing different versions of Postgres with live traffic.\n\n## License\n\nPgCat is free and open source, released under the MIT license.\n\n## Contributors\n\nMany thanks to our amazing contributors!\n\n\u003ca href = \"https://github.com/postgresml/pgcat/graphs/contributors\"\u003e\n  \u003cimg src = \"https://contrib.rocks/image?repo=postgresml/pgcat\"/\u003e\n\u003c/a\u003e\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpostgresml%2Fpgcat","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpostgresml%2Fpgcat","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpostgresml%2Fpgcat/lists"}