https://github.com/heptau/pg_atropos
✂️ Go command-line tool that splits PostgreSQL pg_dump -Fc (custom-format) dumps into individual files organized for Git versioning. Instead of wrestling with a single giant SQL file, each database object gets its own file — tables in TABLE/, functions in FUNCTION/, roles in ROLE/, and so on.
- Host: GitHub
- URL: https://github.com/heptau/pg_atropos
- Owner: heptau
- License: mit
- Created: 2026-06-02T22:14:37.000Z (19 days ago)
- Default Branch: main
- Last Pushed: 2026-06-03T14:12:21.000Z (19 days ago)
- Last Synced: 2026-06-03T16:06:29.912Z (19 days ago)
- Topics: git, plpgsql, postresql, sql
- Language: Go
- Homepage: http://pg_atropos.80.cz
- Size: 55.7 KB
- Stars: 1
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- Funding: .github/FUNDING.yml
- License: LICENSE
- Agents: AGENTS.md
README
# pg_atropos
Split PostgreSQL `pg_dump -Fc` (custom-format) dumps into individual files
organized for Git versioning.
## Why?
When a team shares a single giant SQL dump in Git, every change — adding a
column, a new index, tweaking a function — creates a merge conflict across
the entire file. Resolving those conflicts is tedious and error-prone.
`pg_atropos` splits the dump into individual objects (tables, functions,
indexes, roles, …), each in its own file. In Git this means:
- **Minimal conflicts** — two people can change different tables without
stepping on each other
- **Clear code review** — a diff shows exactly the changed object, not
500 lines of dump header
- **Readable history** — `git log -- TABLE/users.sql` shows every change
to that specific table
- **CI/CD friendly** — deploy only the changed object, not the entire dump
## How?
Instead of parsing raw SQL text with brittle regexes, `pg_atropos` pipes the
dump through `pg_restore -f -` and splits the output on the header comment
that precedes each object.
**Requirement:** `pg_restore` (from [PostgreSQL client tools](https://www.postgresql.org/download/))
must be installed. It is used to decompress and interpret the custom-format dump
— `pg_atropos` never reads the binary format directly.
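As a rough illustration of that pipeline, the Go sketch below builds the `pg_restore -f -` invocation and streams its SQL output. It is not taken from the project's source: `restoreCmd` and the use of `os/exec` are assumptions made for illustration. It relies only on documented `pg_restore` behavior: `-f -` writes the script to stdout, and omitting the input file makes it read the archive from stdin.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"os/exec"
	"strings"
)

// restoreCmd (hypothetical helper) builds the pg_restore invocation.
// A dumpPath of "-" means: read the custom-format archive from our own stdin.
func restoreCmd(dumpPath string) *exec.Cmd {
	args := []string{"-f", "-"} // "-f -" sends the SQL script to stdout
	if dumpPath != "-" {
		args = append(args, dumpPath)
	}
	cmd := exec.Command("pg_restore", args...)
	if dumpPath == "-" {
		cmd.Stdin = os.Stdin
	}
	cmd.Stderr = os.Stderr
	return cmd
}

func main() {
	if len(os.Args) != 2 {
		fmt.Println("usage: sketch <dump.pgdump | ->")
		return
	}
	cmd := restoreCmd(os.Args[1])
	out, err := cmd.StdoutPipe()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	if err := cmd.Start(); err != nil {
		fmt.Fprintln(os.Stderr, "cannot start pg_restore:", err)
		os.Exit(1)
	}

	// Count header lines as a stand-in for real per-object processing.
	headers := 0
	sc := bufio.NewScanner(out)
	sc.Buffer(make([]byte, 0, 64*1024), 16*1024*1024) // allow long SQL lines
	for sc.Scan() {
		if strings.HasPrefix(sc.Text(), "-- Name: ") {
			headers++
		}
	}
	if err := sc.Err(); err != nil {
		cmd.Process.Kill() // stop pg_restore so Wait cannot block on a full pipe
		fmt.Fprintln(os.Stderr, "read error:", err)
	}
	if err := cmd.Wait(); err != nil {
		fmt.Fprintln(os.Stderr, "pg_restore failed:", err)
		os.Exit(1)
	}
	fmt.Println("objects found:", headers)
}
```

Streaming the output line by line keeps memory use flat, since the whole dump is never held in memory at once.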
## The Name
In Greek mythology, the three Moirai (Fates) spun the thread of life,
measured it, and — finally — **Atropos** (the Inevitable) cut it with
her shears. `pg_atropos` does the same: it takes a giant `pg_dump`
file and snips it into small, manageable threads (files) ready for Git.
Who wouldn't want shears that cut a dump into pieces?
## Inspiration
This project was inspired by
[michal-bartak/pgdump_splitter](https://github.com/michal-bartak/pgdump_splitter),
which parses plain `pg_dump` text output. That approach is fragile — object
boundaries are hard to detect reliably when function bodies or comments contain
text that looks like dump markers. `pg_atropos` solves this by using the
**custom-format** (`-Fc`) dump + `pg_restore -f -` pipe, which emits clean
`-- Name: ...; Type: ...` headers that the parser can trust.
The result: simpler, faster, and far more robust parsing.
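To make the header-driven approach concrete, here is a minimal Go sketch of splitting on those header lines. It is an illustration under stated assumptions, not the project's actual parser: the `Object` type, `splitObjects`, `trimBanner`, and the regular expression are all hypothetical, and the sketch assumes headers shaped like `-- Name: users; Type: TABLE; Schema: public; Owner: app`, each framed by bare `--` lines.

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"strings"
)

// Object is a hypothetical record for one dumped object.
type Object struct {
	Name, Type, Schema, SQL string
}

// headerRe matches header lines such as:
//   -- Name: users; Type: TABLE; Schema: public; Owner: app
var headerRe = regexp.MustCompile(`^-- Name: (.+); Type: ([A-Z ]+); Schema: ([^;]+)(?:; Owner: .*)?$`)

// trimBanner drops the empty and bare "--" lines that frame each header.
func trimBanner(lines []string) string {
	pad := func(s string) bool { return s == "" || s == "--" }
	for len(lines) > 0 && pad(lines[0]) {
		lines = lines[1:]
	}
	for len(lines) > 0 && pad(lines[len(lines)-1]) {
		lines = lines[:len(lines)-1]
	}
	return strings.Join(lines, "\n")
}

// splitObjects cuts the SQL text at every header line.
// Text before the first header (the dump preamble) is ignored.
func splitObjects(sql string) []Object {
	var objs []Object
	var body []string
	flush := func() {
		if len(objs) > 0 {
			objs[len(objs)-1].SQL = trimBanner(body)
		}
		body = nil
	}
	sc := bufio.NewScanner(strings.NewReader(sql))
	sc.Buffer(make([]byte, 0, 64*1024), 16*1024*1024) // allow long SQL lines
	for sc.Scan() {
		line := sc.Text()
		if m := headerRe.FindStringSubmatch(line); m != nil {
			flush() // close the previous object before starting a new one
			objs = append(objs, Object{Name: m[1], Type: m[2], Schema: m[3]})
			continue
		}
		body = append(body, line)
	}
	flush()
	return objs
}

func main() {
	sample := strings.Join([]string{
		"--",
		"-- Name: users; Type: TABLE; Schema: public; Owner: app",
		"--",
		"",
		"CREATE TABLE public.users (",
		"    id integer NOT NULL",
		");",
		"",
		"--",
		"-- Name: users_pkey; Type: CONSTRAINT; Schema: public; Owner: app",
		"--",
		"",
		"ALTER TABLE ONLY public.users",
		"    ADD CONSTRAINT users_pkey PRIMARY KEY (id);",
	}, "\n")
	for _, o := range splitObjects(sample) {
		fmt.Printf("%s %s.%s\n%s\n\n", o.Type, o.Schema, o.Name, o.SQL)
	}
}
```

The sketch takes a string for simplicity; a real tool would more likely scan the stdout of `pg_restore` directly.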
## Quick Start
```bash
# From a custom-format dump file
pg_atropos -f dump.pgdump -output ./structure
# From a live database (auto-dump via pg_dump)
pg_atropos --db mydb -output ./structure
# Pipe from pg_dump (no temp file needed)
pg_dump -Fc mydb | pg_atropos -f - -o ./structure
# Pipe from a remote database via connection string
pg_dump -Fc postgresql://user@server:port/database | pg_atropos -f - -o ./structure
# Pipe from a remote database via ssh
ssh dbserver 'pg_dump -Fc mydb' | pg_atropos -f - -o ./structure
# Custom mode (lowercase directories for CI)
pg_atropos -f dump.pgdump -output ./structure -mode custom
```
## Installation
### From source
```bash
git clone https://github.com/heptau/pg_atropos.git
cd pg_atropos && make build
```
### Docker
```bash
docker build -t pg_atropos .
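# Mount the dump read-only plus an output directory, so the result survives --rm
docker run --rm -v "$(pwd)/dump.pgdump:/dump.pgdump:ro" -v "$(pwd)/structure:/structure" \
  pg_atropos -f /dump.pgdump -o /structure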
```
## Flags
| Flag | Default | Description |
|------|---------|-------------|
| `--db` | `""` | Database name to dump |
| `--conn` | `""` | PostgreSQL connection string |
| `--file`, `-f` | `""` | Custom-format dump file (`"-"` for stdin) |
| `--output`, `-o` | `./output` | Output directory |
| `--mode`, `-m` | `origin` | Output mode: `origin` \| `custom` |
| `--clean` | `false` | Clean output directory before processing |
| `--no-db-path` | `false` | Don't include database name in output path |
| `--blacklist-db` | `^(template\|postgres)` | Skip databases matching pattern |
| `--whitelist-db` | `""` | Only include databases matching pattern |
| `--exclude-obj` | `""` | Exclude object types matching pattern |
| `--acl-files` | `false` | Save ACLs to separate `.acl.sql` files |
| `--move-roles` | `false` | Move role files under database directory |
| `--dry-run` | `false` | Print what would be extracted without writing |
| `--quiet` | `false` | Suppress informational output |
| `--version` | — | Print version and exit |
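The two database filters are regular expressions. The Go sketch below shows one plausible way they could combine; it is an assumption, not the tool's documented behavior. In particular, `includeDB` is a hypothetical helper, and the order (blacklist first, then whitelist, with an empty pattern meaning "no filter") is a guess for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// includeDB (hypothetical helper) reports whether a database should be processed.
// Assumptions: the blacklist is checked first, and an empty pattern disables
// the corresponding filter.
func includeDB(name, blacklist, whitelist string) (bool, error) {
	if blacklist != "" {
		re, err := regexp.Compile(blacklist)
		if err != nil {
			return false, fmt.Errorf("blacklist pattern: %w", err)
		}
		if re.MatchString(name) {
			return false, nil
		}
	}
	if whitelist != "" {
		re, err := regexp.Compile(whitelist)
		if err != nil {
			return false, fmt.Errorf("whitelist pattern: %w", err)
		}
		return re.MatchString(name), nil
	}
	return true, nil
}

func main() {
	const defaultBlacklist = `^(template|postgres)` // default from the flags table
	for _, db := range []string{"template0", "template1", "postgres", "shop", "shop_test"} {
		ok, err := includeDB(db, defaultBlacklist, "")
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%-10s include=%v\n", db, ok)
	}
}
```

Because the default pattern is anchored only at the start, a database named `postgres_reports` would also match it under the unanchored `MatchString` semantics assumed here.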
## Modes
### origin (default)
Mirrors the dump structure exactly — each object type gets its own
directory (`TABLE/`, `FUNCTION/`, `INDEX/`, …). Good for inspection.
### custom
Lowercase directories (`table/`, `function/`, …). Indexes, constraints, and
triggers are **not** merged into their parent table's file: pg_restore headers
don't carry the parent table name, so merging would require parsing the SQL
content. Better suited for CI / automation workflows.
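The difference between the two modes comes down to how an object's type maps to a directory name. A minimal Go sketch of that mapping follows. `objectPath` is a hypothetical helper, and the `<name>.sql` file naming is extrapolated from the `TABLE/users.sql` example earlier in this README. The database-name path component and names containing path separators are not modeled.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// objectPath (hypothetical helper) maps an object to its relative file path.
func objectPath(mode, objType, name string) (string, error) {
	var dir string
	switch mode {
	case "origin":
		dir = objType // keep the type exactly as the dump header spells it
	case "custom":
		dir = strings.ToLower(objType)
	default:
		return "", fmt.Errorf("unknown mode %q (want origin or custom)", mode)
	}
	return filepath.Join(dir, name+".sql"), nil
}

func main() {
	for _, mode := range []string{"origin", "custom"} {
		p, err := objectPath(mode, "TABLE", "users")
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%-6s -> %s\n", mode, p)
	}
}
```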
## Tests
```bash
make test # unit tests (no database required)
make coverage # with coverage report
```
Tests use the `--test-sql` flag to inject pre-extracted SQL directly,
bypassing `pg_restore`. No PostgreSQL installation needed.
## Limitations
- INDEX / CONSTRAINT / TRIGGER merging into table files is not supported
in custom mode (pg_restore headers lack parent table name).
- Object names with spaces or special characters may not round-trip.
## Performance
Measured on a dump with ~150 objects / 2226 SQL lines:
| Tool | Avg Time | vs pg_atropos |
|---------|----------|---------------|
| **pg_atropos (Go)** | **0.044s** | **1×** |
| pgdump_splitter (Go) | 0.109s | 2.5× slower |
## Future Improvements
- **Merge INDEX/CONSTRAINT/TRIGGER under table/** — would require
parsing the SQL content to identify the parent table (doable, adds
complexity).
- **Memory limits** — configurable scanner buffer for constrained
Docker environments.
- **Git integration** — commit each object separately with structured
commit messages.
## License
MIT — see [LICENSE](LICENSE)