{"id":20782058,"url":"https://github.com/pgflo/pg_flo","last_synced_at":"2025-05-15T15:03:41.093Z","repository":{"id":256180261,"uuid":"851229016","full_name":"pgflo/pg_flo","owner":"pgflo","description":"Stream, transform, and route PostgreSQL data in real-time.","archived":false,"fork":false,"pushed_at":"2024-11-16T14:35:15.000Z","size":14582,"stargazers_count":647,"open_issues_count":9,"forks_count":11,"subscribers_count":2,"default_branch":"main","last_synced_at":"2024-11-16T15:22:43.882Z","etag":null,"topics":["data","database","etl","go","golang","logical-replication","postgres","postgresql","stream"],"latest_commit_sha":null,"homepage":"https://pgflo.io","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/pgflo.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-09-02T17:13:01.000Z","updated_at":"2024-11-16T14:34:46.000Z","dependencies_parsed_at":"2024-09-09T12:49:45.850Z","dependency_job_id":"132c54a6-e852-4d2b-9093-8ea89a1a3f2f","html_url":"https://github.com/pgflo/pg_flo","commit_stats":null,"previous_names":["shayonj/pg_flo","pgflo/pg_flo"],"tags_count":11,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pgflo%2Fpg_flo","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pgflo%2Fpg_flo/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pgflo%2Fpg_flo/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pgflo%2Fpg_flo/manifests","owner_url":"https://re
pos.ecosyste.ms/api/v1/hosts/GitHub/owners/pgflo","download_url":"https://codeload.github.com/pgflo/pg_flo/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254364270,"owners_count":22058878,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data","database","etl","go","golang","logical-replication","postgres","postgresql","stream"],"created_at":"2024-11-17T14:00:53.740Z","updated_at":"2025-05-15T15:03:41.042Z","avatar_url":"https://github.com/pgflo.png","language":"Go","readme":"# \u003cimg src=\"internal/pg_flo_logo.png\" alt=\"pg_flo logo\" width=\"40\" align=\"center\"\u003e pg_flo\n\n[![CI](https://github.com/pgflo/pg_flo/actions/workflows/ci.yml/badge.svg?branch=main)](https://github.com/pgflo/pg_flo/actions/workflows/ci.yml)\n[![Integration](https://github.com/pgflo/pg_flo/actions/workflows/integration.yml/badge.svg?branch=main)](https://github.com/pgflo/pg_flo/actions/workflows/integration.yml)\n[![Release](https://img.shields.io/github/v/release/pgflo/pg_flo?style=flat\u0026color=#959DA5\u0026sort=semver)](https://github.com/pgflo/pg_flo/releases/latest)\n[![Docker Image](https://img.shields.io/docker/v/pgflo/pg_flo?style=flat\u0026label=docker\u0026color=#959DA5\u0026label=docker\u0026sort=semver)](https://hub.docker.com/r/pgflo/pg_flo/tags)\n\n\u003e The easiest way to move and transform data between PostgreSQL databases using Logical Replication.\n\nℹ️ `pg_flo` is in active development. The design and architecture are continuously improving. 
PRs/Issues are very much welcome 🙏\n\n## Key Features\n\n- **Real-time Data Streaming** - Capture inserts, updates, deletes, and DDL changes in near real-time\n- **Fast Initial Loads** - Parallel copy of existing data with automatic follow-up continuous replication\n- **Powerful Transformations** - Filter and transform data on-the-fly ([see rules](pkg/rules/README.md))\n- **Flexible Routing** - Route to different tables and remap columns ([see routing](pkg/routing/README.md))\n- **Production Ready** - Supports resumable streaming, DDL tracking, and more\n\n## Common Use Cases\n\n- Real-time data replication between PostgreSQL databases\n- ETL pipelines with data transformation\n- Data re-routing, masking and filtering\n- Database migration with zero downtime\n- Event streaming from PostgreSQL\n\n[View detailed examples →](internal/examples/README.md)\n\n## Quick Start\n\n### Prerequisites\n\n- Docker\n- PostgreSQL database with `wal_level=logical`\n\n### 1. Install\n\n```shell\ndocker pull pgflo/pg_flo:latest\n```\n\n### 2. Configure\n\nChoose one:\n\n- Environment variables\n- YAML configuration file ([example](internal/pg-flo.yaml))\n- CLI flags\n\n### 3. 
Run\n\n```shell\n# Start NATS server\ndocker run -d --name pg_flo_nats \\\n  --network host \\\n  -v /path/to/nats-server.conf:/etc/nats/nats-server.conf \\\n  nats:latest \\\n  -c /etc/nats/nats-server.conf\n\n# Start replicator (using config file)\ndocker run -d --name pg_flo_replicator \\\n  --network host \\\n  -v /path/to/config.yaml:/etc/pg_flo/config.yaml \\\n  pgflo/pg_flo:latest \\\n  replicator --config /etc/pg_flo/config.yaml\n\n# Start worker\ndocker run -d --name pg_flo_worker \\\n  --network host \\\n  -v /path/to/config.yaml:/etc/pg_flo/config.yaml \\\n  pgflo/pg_flo:latest \\\n  worker postgres --config /etc/pg_flo/config.yaml\n```\n\n#### Example Configuration (config.yaml)\n\n```yaml\n# Replicator settings\nhost: \"localhost\"\nport: 5432\ndbname: \"myapp\"\nuser: \"replicator\"\npassword: \"secret\"\ngroup: \"users\"\ntables:\n  - \"users\"\n\n# Worker settings (postgres sink)\ntarget-host: \"dest-db\"\ntarget-dbname: \"myapp\"\ntarget-user: \"writer\"\ntarget-password: \"secret\"\n\n# Common settings\nnats-url: \"nats://localhost:4222\"\n```\n\n[View full configuration options →](internal/pg-flo.yaml)\n\n## Core Concepts\n\n### Architecture\n\npg_flo uses two main components:\n\n- **Replicator**: Captures PostgreSQL changes via logical replication\n- **Worker**: Processes and routes changes through NATS\n\n[Learn how it works →](internal/how-it-works.md)\n\n### Groups\n\nGroups are used to:\n\n- Identify replication processes\n- Isolate replication slots and publications\n- Run multiple instances on the same database\n- Maintain state for resumability\n- Enable parallel processing\n\n```shell\n# Example: Separate groups for different tables\npg_flo replicator --group users_orders --tables users,orders\n\npg_flo replicator --group products --tables products\n```\n\n### Streaming Modes\n\n1. **Stream Only** (default)\n   - Real-time streaming of changes\n\n```shell\npg_flo replicator --stream\n```\n\n2. 
**Copy Only**\n   - One-time parallel copy of existing data\n\n```shell\npg_flo replicator --copy --max-copy-workers-per-table 4\n```\n\n3. **Copy and Stream**\n   - Initial parallel copy followed by continuous streaming\n\n```shell\npg_flo replicator --copy-and-stream --max-copy-workers-per-table 4\n```\n\n### Destinations\n\n- **stdout**: Console output\n- **file**: File writing\n- **postgres**: Database replication\n- **webhook**: HTTP endpoints\n\n[View destination details →](pkg/sinks/README.md)\n\n## Advanced Features\n\n### Message Routing\n\nRouting configuration is defined in a separate YAML file:\n\n```yaml\n# routing.yaml\nusers:\n  source_table: users\n  destination_table: customers\n  column_mappings:\n    - source: id\n      destination: customer_id\n```\n\n```shell\n# Apply routing configuration\npg_flo worker postgres --routing-config /path/to/routing.yaml\n```\n\n[Learn about routing →](pkg/routing/README.md)\n\n### Transformation Rules\n\nRules are defined in a separate YAML file:\n\n```yaml\n# rules.yaml\nusers:\n  - type: exclude_columns\n    columns: [password, ssn]\n  - type: mask_columns\n    columns: [email]\n```\n\n```shell\n# Apply transformation rules\npg_flo worker file --rules-config /path/to/rules.yaml\n```\n\n[View transformation options →](pkg/rules/README.md)\n\n### Combined Example\n\n```shell\npg_flo worker postgres --config /etc/pg_flo/config.yaml --routing-config routing.yaml --rules-config rules.yaml\n```\n\n## Scaling Guide\n\nBest practices:\n\n- Run one worker per group\n- Use groups to replicate different tables independently\n- Scale horizontally using multiple groups\n\nExample scaling setup:\n\n```shell\n# Group: sales\npg_flo replicator --group sales --tables sales\npg_flo worker postgres --group sales\n\n# Group: inventory\npg_flo replicator --group inventory --tables inventory\npg_flo worker postgres --group inventory\n```\n\n## Limits and Considerations\n\n- NATS message size: 8MB (configurable)\n- One worker per 
group recommended\n- PostgreSQL logical replication prerequisites required\n- Tables must have one of the following for replication:\n  - Primary key\n  - Unique constraint with `NOT NULL` columns\n  - `REPLICA IDENTITY FULL` set\n\nExample table configurations:\n\n```sql\n-- Using primary key (recommended)\nCREATE TABLE users (\n  id SERIAL PRIMARY KEY,\n  email TEXT,\n  name TEXT\n);\n\n-- Using unique constraint\nCREATE TABLE orders (\n  order_id TEXT NOT NULL,\n  customer_id TEXT NOT NULL,\n  data JSONB,\n  CONSTRAINT orders_unique UNIQUE (order_id, customer_id)\n);\nALTER TABLE orders REPLICA IDENTITY USING INDEX orders_unique;\n\n-- Using all columns (higher overhead in terms of performance)\nCREATE TABLE audit_logs (\n  id SERIAL,\n  action TEXT,\n  data JSONB\n);\nALTER TABLE audit_logs REPLICA IDENTITY FULL;\n```\n\n## Development\n\n```shell\nmake build\nmake test\nmake lint\n\n# E2E tests\n./internal/scripts/e2e_local.sh\n```\n\n## Contributing\n\nContributions welcome! Please open an issue or submit a pull request.\n\n## License\n\nApache License 2.0. [View license →](LICENSE)\n","funding_links":[],"categories":["Go","\u003ca name=\"Go\"\u003e\u003c/a\u003eGo"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpgflo%2Fpg_flo","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpgflo%2Fpg_flo","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpgflo%2Fpg_flo/lists"}