{"id":50410056,"url":"https://github.com/mickamy/seeder","last_synced_at":"2026-05-31T03:03:03.881Z","repository":{"id":358864098,"uuid":"1243269380","full_name":"mickamy/seeder","owner":"mickamy","description":"Populate your database with realistic fake data, one command.","archived":false,"fork":false,"pushed_at":"2026-05-28T06:57:50.000Z","size":490,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-28T08:20:34.700Z","etag":null,"topics":["cli","database","developer-tools","fake-data","go","postgresql","seeder","test-data"],"latest_commit_sha":null,"homepage":null,"language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mickamy.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":".github/FUNDING.yaml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null},"funding":{"github":["mickamy"]}},"created_at":"2026-05-19T07:32:40.000Z","updated_at":"2026-05-28T06:57:52.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/mickamy/seeder","commit_stats":null,"previous_names":["mickamy/seeder"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/mickamy/seeder","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mickamy%2Fseeder","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mickamy%2Fseeder/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mickamy%2Fseeder/releases","manifests_url":"https:/
/repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mickamy%2Fseeder/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mickamy","download_url":"https://codeload.github.com/mickamy/seeder/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mickamy%2Fseeder/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33717419,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-05-31T02:00:06.040Z","response_time":95,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cli","database","developer-tools","fake-data","go","postgresql","seeder","test-data"],"created_at":"2026-05-31T03:03:03.143Z","updated_at":"2026-05-31T03:03:03.875Z","avatar_url":"https://github.com/mickamy.png","language":"Go","funding_links":["https://github.com/sponsors/mickamy"],"categories":[],"sub_categories":[],"readme":"# seeder\n\n\u003e Zero-config database seeder — one command, realistic data, no factory code.\n\n[![CI](https://github.com/mickamy/seeder/actions/workflows/ci.yaml/badge.svg)](https://github.com/mickamy/seeder/actions/workflows/ci.yaml)\n[![Go Report Card](https://goreportcard.com/badge/github.com/mickamy/seeder)](https://goreportcard.com/report/github.com/mickamy/seeder)\n[![License: MIT](https://img.shields.io/badge/license-MIT-blue.svg)](LICENSE)\n[![GitHub 
Sponsors](https://img.shields.io/github/sponsors/mickamy?label=sponsor\u0026logo=github)](https://github.com/sponsors/mickamy)\n\n`seeder` populates your MySQL or Postgres database with realistic fake data\nstraight from the schema. No factory code, no YAML, no AI key. Point it at a\nDSN and it figures out the rest: it introspects your tables, infers what each\ncolumn should look like from its name and type, and bulk-inserts (multi-row\n`INSERT` on MySQL, `COPY` on Postgres) while respecting your foreign-key\nconstraints.\n\n```bash\n$ seeder mysql://root:pass@localhost:3306/mydb --rows 1000 --truncate --seed 42\n$ seeder postgres://user:pass@localhost:5432/mydb --rows 1000 --truncate --seed 42\nseeder: 3 table(s), 3 FK(s)\norder:  users -\u003e orders -\u003e comments\nmode:   truncate + insert\n  users     1000 rows (6.7ms)\n  orders    1000 rows (7.0ms)\n  comments  1000 rows (9.9ms)\ndone:   3000 row(s) in 53ms\n```\n\n## Why\n\nEvery backend project hits the same wall: dev DBs with two rows where prod\nhas a million. Engineers respond by writing factory code, scripting `INSERT`s,\nor maintaining fixtures — all of which rot. `seeder` skips that step:\nzero-code, zero-config, single Go binary.\n\nThe three core promises:\n\n1. **Zero-code.** No factories, no config file. Just a DSN.\n2. **Smart inference.** `email` columns get emails; `created_at` gets\n   timestamps in the past year; `order_status` (enum) gets one of its labels.\n3. **Referential integrity.** FK dependencies are resolved with a\n   topological sort; children always point at real parents.\n\n## Install\n\n### Homebrew (macOS / Linux)\n\n```bash\nbrew install mickamy/tap/seeder\n```\n\n### Windows\n\nGrab the latest `seeder_\u003cversion\u003e_windows_\u003carch\u003e.zip` from the\n[Releases](https://github.com/mickamy/seeder/releases) page, unzip, and move\n`seeder.exe` into a directory already on your `PATH` (e.g.,\n`%USERPROFILE%\\bin`). 
The PowerShell snippet below picks the right archive\nfor the current host arch and unpacks it next to the working directory; you\nstill need the final move step yourself.\n\n```powershell\n$ver  = (Invoke-RestMethod https://api.github.com/repos/mickamy/seeder/releases/latest).tag_name.TrimStart('v')\n$arch = if ($env:PROCESSOR_ARCHITECTURE -eq 'ARM64') { 'arm64' } else { 'amd64' }\nInvoke-WebRequest -OutFile seeder.zip \"https://github.com/mickamy/seeder/releases/latest/download/seeder_${ver}_windows_${arch}.zip\"\nExpand-Archive seeder.zip -DestinationPath .\\seeder -Force\n# Move .\\seeder\\seeder.exe into a directory on $env:PATH (e.g., $HOME\\bin).\n```\n\n### From source\n\n```bash\ngo install github.com/mickamy/seeder@latest\n```\n\nRequires Go 1.26+ to build from source. Pre-built binaries (macOS / Linux /\nWindows × amd64 / arm64) are published on\n[GitHub Releases](https://github.com/mickamy/seeder/releases) on every tag.\n\n## Supported databases\n\n- **PostgreSQL 15+** — versions older than 15 are out of upstream maintenance. CI exercises every active major (15 – 18).\n- **MySQL 8.4+ (LTS)** — MySQL 8.0 reached community-server EOL in April 2026, so seeder targets 8.4 onward. CI exercises\n  the 8.4 LTS line and the current `mysql:9` innovation release. 
`column_type` parsing for `enum` / `set` literals and\n  the `tinyint(1)` boolean convention follow MySQL 8 semantics.\n\nBoth drivers connect through the standard DSN form (`postgres://...` or `mysql://...`) and use only read-only SELECTs\nagainst `information_schema` / `pg_catalog` plus normal INSERT / COPY / TRUNCATE statements — no privileged access\nneeded.\n\n## Usage\n\n```\nseeder \u003cdsn\u003e [flags]\n\nFLAGS:\n  --batch-size int Rows generated per INSERT batch (default: 1000)\n  --cache \u003cfile\u003e   Load/save introspected schema to this file (delete to invalidate)\n  --config \u003cfile\u003e  Path to seeder.yaml (default: auto-detect ./seeder.yaml)\n  --dry-run        Print plan, do not insert\n  --exclude string Comma-separated tables to skip (cannot combine with --tables)\n  --locale string  Locale for name-rule generators (en, ja, sv; default: en)\n  --output string  Alternate output: sql | ndjson (default: insert into DB)\n  --rate int       Rows per second across tables when --stream is set\n  --rows int       Rows per table (default: 1000; overrides yaml when set)\n  --seed N         Deterministic RNG seed (\u003e= 0; default: time-based)\n  --stream         Continuously append rows after the initial seed (CDC mode)\n  --tables string  Comma-separated tables to include (default: all)\n  --truncate       TRUNCATE before insert (default: append)\n  --verbose        Print per-column inference decisions\n  --version, -v    Print seeder version\n  --help, -h       Show this help\n```\n\nBoth `seeder \u003cdsn\u003e [flags]` and `seeder [flags] \u003cdsn\u003e` are accepted.\n\n### Common scenarios\n\n**Frontend / pagination testing.** Two rows is not enough to test page 100\nor the \"...\" truncation in your table UI.\n\n```bash\nseeder $DATABASE_URL --rows 10000\n```\n\n**Backend / N+1 hunting.** `EXPLAIN ANALYZE` against 50 rows tells you\nnothing. 
Load a realistic volume and the slow path shows up.\n\n```bash\nseeder $DATABASE_URL --tables orders,order_items --rows 1000000\n```\n\n**Onboarding.** Replace half a page of seed-script setup with one line:\n\n```markdown\n1. make db-up\n2. make migrate\n3. seeder $DATABASE_URL\n```\n\n**Dev DB reset after a migration experiment.**\n\n```bash\ndropdb mydb \u0026\u0026 createdb mydb \u0026\u0026 goose up \u0026\u0026 seeder $DATABASE_URL --rows 5000\n```\n\n**Reproducible CI test data.**\n\n```yaml\n- run: |\n    make migrate\n    seeder $TEST_DB --rows 1000 --seed 42\n    go test -tags=integration ./...\n```\n\n**Japanese-locale data.** Swap inferred names, addresses, prefectures, phone numbers, and postal codes for plausible\nJapanese values.\n\n```bash\nseeder $DATABASE_URL --locale ja\n```\n\n**Large schemas / repeated runs.** Cache the introspected schema so subsequent runs skip the `information_schema`\nround-trip. Delete the file after a migration to invalidate.\n\n```bash\nseeder $DATABASE_URL --cache /tmp/seeder-schema.gob --rows 5000\n```\n\n**SQL dump for migration repos.** Emit INSERT statements instead of writing to the DB — useful when you want a\nreproducible `seed.sql` checked in alongside migrations. Dialect is chosen from the DSN scheme (`mysql://` or\n`postgres://`); no connection is opened beyond the initial introspection. On Postgres, INSERTs for tables with IDENTITY\ncolumns include `OVERRIDING SYSTEM VALUE` so dumps load into both `BY DEFAULT` and `ALWAYS` identity schemas. 
After\nloading a dump that supplies explicit values for `serial` / IDENTITY columns, run `setval(...)` on the backing sequences\nso subsequent inserts don't collide with the seeded ids (out of scope for the dump itself).\n\n```bash\nseeder $DATABASE_URL --output sql --rows 100 \u003e seed.sql\n```\n\n**ETL / streaming pipelines.** `--output ndjson` writes one JSON object per row, prefixed with `_table`, so downstream\nconsumers can route rows by destination.\n\n```bash\nseeder $DATABASE_URL --output ndjson --rows 100 | kafkacat -P -t seed-stream\n```\n\n**Continuous load / replication-lag testing.** `--stream --rate N` seeds the schema once, then keeps appending rows at\nroughly N total per second across all tables until Ctrl-C.\n\n```bash\nseeder $DATABASE_URL --stream --rate 1000\n```\n\n## Configuration\n\n`seeder` runs zero-config out of the box. When you want to pin row counts or skip specific tables without retyping flags\nevery time, drop a `seeder.yaml` next to where you run the command — it is auto-detected. Use\n`--config path/to/seeder.yaml` to point at one explicitly.\n\n```yaml\nrows: 1000\nseed: 42\nlocale: en\ntruncate: false\ntables:\n  users:\n    rows: 5000\n    columns:\n      email:\n        generator: Email\n      bio:\n        value: dogfood seed row\n  orders:\n    rows: 10000\n  comments:\n    exclude: true\n```\n\n`seeder.yaml` is validated on load: any unknown key (e.g., a typo like `truncates:` instead of `truncate:`) is rejected\nwith the line number and field name. 
A full annotated example lives at\n[`seeder.example.yaml`](./seeder.example.yaml).\n\nAny yaml setting with a CLI equivalent follows the same rule: **CLI flag \u003e seeder.yaml \u003e built-in default**.\n\n| yaml field                    | CLI flag     | Default          | Notes                                                                              |\n|-------------------------------|--------------|------------------|------------------------------------------------------------------------------------|\n| `rows`                        | `--rows`     | `1000`           | When `--rows` is set, it replaces yaml row counts for **every** table.             |\n| `seed`                        | `--seed`     | time-based       | uint64; pin for reproducibility.                                                   |\n| `locale`                      | `--locale`   | `en`             | `en`, `ja`, `sv`.                                                                  |\n| `truncate`                    | `--truncate` | `false`          | TRUNCATE before insert.                                                            |\n| `tables.\u003cname\u003e.rows`          | —            | top-level `rows` | Per-table override; ignored when `--rows` is set.                                  |\n| `tables.\u003cname\u003e.exclude`       | `--exclude`  | `false`          | Skip the table. `--exclude a,b` on the CLI is equivalent for the listed tables.    |\n| `tables.\u003cname\u003e.columns.\u003ccol\u003e` | —            | —                | Per-column override (see below).                                                   |\n| `tables.\u003cname\u003e.polymorphic[]` | —            | —                | Declare Rails-style polymorphic associations (see below).                          |\n\nPer-column overrides under `tables.\u003cname\u003e.columns.\u003ccol\u003e` bypass inference for a single column. 
Set exactly one of:\n\n- `generator: \u003cName\u003e` — force a built-in generator (e.g., `Email`, `UUID`, `Phone`, `PastDate`). Supplying an unknown\n  name surfaces the full known list as part of the preflight error. Built-ins resolve via gofakeit defaults and do not\n  switch on `--locale`; use `value:` for a fixed string when you need a specific locale.\n- `value: \u003cliteral\u003e` — pin the column to a fixed yaml value. Only scalars (string, number, bool) are accepted; arrays\n  and maps are rejected with a `value must be a scalar` error.\n- `exclude: \u003cboolean\u003e` — skip populating the column with a generated value. Not allowed on primary-key or foreign-key columns.\n\nForeign-key columns are not overridable: yaml entries for them are ignored and the FK pool is used instead, so children\nstill point at real parents.\n\nPolymorphic associations (Rails-style `*_type` + `*_id` pairs) cannot be detected from `information_schema` alone, so\ndeclare them under `tables.\u003cname\u003e.polymorphic`. Each entry picks one target table uniformly per row, then takes its id\nfrom the FK pool:\n\n```yaml\ntables:\n  comments:\n    polymorphic:\n      - type_col: commentable_type\n        id_col: commentable_id\n        targets:\n          - { table: posts,    type: Post }\n          - { table: articles, type: Article }\n```\n\n`id_col` on a target defaults to that table's first primary-key column; set `id_col: \u003ccol\u003e` on the target to point at a\ndifferent column (which must be in the FK pool, e.g., a `UNIQUE` non-PK column). `seeder` also adds the target tables as\nplan dependencies, so parents are seeded before the polymorphic owner. 
Each column may appear in at most one `type_col`\nor `id_col` slot per table; declaring the same column under two polymorphic entries fails preflight.\n\nPass `--verbose` to see which inference rule each column matched, e.g., when you are debugging why `bio` ended up with a\nlong paragraph instead of the short string you expected:\n\n```\n$ seeder $DATABASE_URL --rows 5 --verbose\nseeder: 3 table(s), 2 FK(s)\norder:  users -\u003e orders -\u003e comments\nmode:   append\n  users\n    id            skip: identity\n    email         name match: Email\n    name          name match: Name\n    created_at    name match: PastDate\n  users  5 rows (1.2ms)\n  ...\n```\n\n## How it works\n\n### Schema introspection\n\n`seeder` queries `information_schema` (plus `pg_catalog` / `pg_constraint` on\nPostgres for enum types and composite-FK column ordering) for tables, columns,\nprimary keys, foreign keys, and enum labels. The current database is scoped\nvia `DATABASE()` on MySQL; the `public` schema is scoped on Postgres. No\nschema changes, no privileged access — just standard SELECTs.\n\n### Smart inference\n\nEach column is matched against a small set of name patterns first, then\nfalls back to its SQL type:\n\n| Pattern                                                                   | Generator                 |\n|---------------------------------------------------------------------------|---------------------------|\n| `email`, `*_email`                                                        | realistic email address   |\n| `name`, `first_name`, `last_name`, `display_name`, ...                    
| person name               |\n| `phone`, `tel`, `mobile`                                                  | phone number              |\n| `*_url`, `link`, `homepage`, `website`                                    | URL                       |\n| `avatar`, `image`, `photo`, `picture`, `thumbnail` (also `_url`-suffixed) | image placeholder URL     |\n| `address`, `city`, `country`, `zip`                                       | postal address parts      |\n| `description`, `bio`, `note`, `body`, `content`                           | paragraph                 |\n| `title`, `subject`, `headline`                                            | sentence                  |\n| `created_at`, `updated_at`, `*_at`                                        | timestamp in past year    |\n| `birthday`, `dob`                                                         | past date                 |\n| `age`                                                                     | 0–100                     |\n| `price`, `amount`, `cost`, `*_yen`                                        | int in money range        |\n| `count`, `quantity`, `qty`, `num_*`                                       | int                       |\n| `is_*`, `has_*`, `*_flag`, `enabled`                                      | boolean                   |\n| Postgres enum (`USER-DEFINED`)                                            | random label              |\n| anything else                                                             | fallback by inferred Kind |\n\nName patterns above that produce text (names, addresses, prefectures, cities, phone numbers, postal codes) switch\ndictionaries when `--locale ja` is set; locale-neutral patterns like `email` and `*_url` keep their English forms.\n\n`seeder` lets the database fill a column in exactly two cases:\n\n- The column is `IDENTITY` — Postgres `GENERATED ALWAYS AS IDENTITY` or\n  MySQL `AUTO_INCREMENT`.\n- The column's default is a Postgres sequence call (`DEFAULT\n  
nextval('...')`). This is the form `serial` / `bigserial` expand into;\n  `seeder` leaves it to the DB so the sequence stays authoritative.\n\nEvery other column is generated by `seeder` — **even if it has a\n`DEFAULT`**. Plain defaults like `score int NOT NULL DEFAULT 0`,\n`created_at timestamptz DEFAULT now()`, or `status text DEFAULT 'active'`\nare all overridden with seeder-generated values so test data is varied\ninstead of every row sharing the same literal. If you need the DB default\nfor one of these columns, exclude the table.\n\nColumns covered by a single-column `UNIQUE` constraint are detected during\nintrospection and routed to a collision-avoiding generator: `email`\nbecomes `\u003cuuid\u003e@example.com`, other string columns get a UUID suffix on\ntop of the matched name rule, and integer columns are widened to a much\nlarger range. Composite `UNIQUE` constraints are not flagged because no\nsingle per-column generator can guarantee combined uniqueness.\n\n\u003e **Note on `json` / `jsonb` columns**: by default these emit a small\n\u003e fixed-shape object — `{\"id\": \u003cuuid\u003e, \"label\": \u003cword\u003e, \"count\": \u003cint\u003e,\n\u003e \"active\": \u003cbool\u003e}` — sized at ~85 bytes per row. This keeps `jsonb` seeding\n\u003e cheap at millions of rows while still producing realistic-looking payloads.\n\u003e Pin a different shape via `tables.\u003cname\u003e.columns.\u003ccol\u003e.value: '{...}'` in\n\u003e `seeder.yaml` when you need column-specific fields.\n\n\u003e **Note on large `--rows`**: `seeder` generates rows in chunks of\n\u003e `--batch-size` (default 1000) and flushes each chunk via `COPY` / multi-row\n\u003e `INSERT` before building the next, so memory stays bounded even at\n\u003e millions of rows. The parent-PK pool used for FK resolution is also capped\n\u003e at 100k values per (table, column); when refilled it keeps the tail of\n\u003e what the driver returned (i.e., only the last 100k entries are kept). 
The\n\u003e in-batch self-FK buffer used for forward-references shares the same 100k\n\u003e cap, so self-referential tables with seeder-generated PKs do not grow\n\u003e unbounded either.\n\n### Foreign keys\n\nTables are inserted in dependency order: parents first, then children pick\na random parent PK for each FK column.\n\n- **Composite FK** (multiple local columns referencing the same parent\n  tuple): `seeder` picks the parent row once and spreads its referenced\n  columns across the local FK group, so the inserted row always matches a\n  real parent tuple instead of stitching together columns from different\n  parents.\n- **Polymorphic FK** (Rails-style `*_type` + `*_id`): declared in\n  `seeder.yaml` (see [Configuration](#configuration)). `seeder` picks one\n  target table uniformly per row, then picks the id from that table's pool.\n- **Self-FK** (e.g., `employees.manager_id REFERENCES employees(id)`):\n  `seeder` resolves these forward-references inside the batch. Each row's\n  self-FK can point at a PK already generated earlier in the same batch;\n  the very first row picks its own PK (self-loop) when the constraint is\n  `NOT NULL`. This only works when the referenced PK is seeder-generated\n  (e.g., `uuid` PK on Postgres). DB-generated PKs such as `serial` or\n  `AUTO_INCREMENT` are unknown until after the insert, so a `NOT NULL`\n  self-FK against them still errors out; a nullable self-FK against them\n  silently sets `NULL` on every row.\n- **Real cycle** between distinct tables (`A → B → A`): there is no order\n  that satisfies both directions, so `seeder` reports the cycle as an error\n  rather than silently dropping one of the edges.\n\n## Current scope\n\nComposite FKs and polymorphic associations (yaml-declared) are supported,\nalong with `--output sql` / `--output ndjson` and `--stream --rate` for\nCDC-style continuous load. English, Japanese, and Swedish locales are available. JSON / JSONB\ncolumns still emit fixed-shape placeholder values. 
Out of scope for\nnow: more locales, LLM-assisted text, existing-DB statistics sampling.\n\n## Develop\n\n```bash\nmake build              # ./bin/seeder\nmake test               # go test ./... -race\nmake lint               # golangci-lint run\n\n# Bring up local Postgres + MySQL via docker compose\ndocker compose up -d\n\n# Integration tests (set either or both; unset drivers skip their tests)\nSEEDER_TEST_DSN_MYSQL=mysql://root:pass@localhost:3306/dev?parseTime=true \\\nSEEDER_TEST_DSN_POSTGRES=postgres://postgres:pass@localhost:5432/dev?sslmode=disable \\\n  make test-integration\n\n# Benchmarks (Postgres only; see bench/README.md). The bench DROPs and recreates\n# the public schema, so it uses its own env var instead of SEEDER_TEST_DSN_POSTGRES.\nSEEDER_BENCH_DSN=postgres://postgres:pass@localhost:5432/dev?sslmode=disable \\\n  go test -tags=integration -bench=. -benchmem -benchtime=5x -run=^$ \\\n    ./internal/insert/...\n```\n\n## License\n\n[MIT](./LICENSE)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmickamy%2Fseeder","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmickamy%2Fseeder","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmickamy%2Fseeder/lists"}