https://github.com/silkhorizonstudios/seedforge
One command to fill your database with realistic test data. Reads the schema, resolves foreign keys, no code needed.
cli database developer-tools faker mysql postgresql python seed-data sqlite synthetic-data test-data
- Host: GitHub
- URL: https://github.com/silkhorizonstudios/seedforge
- Owner: silkhorizonstudios
- Created: 2026-05-25T06:43:04.000Z (about 1 month ago)
- Default Branch: main
- Last Pushed: 2026-05-25T06:48:23.000Z (about 1 month ago)
- Last Synced: 2026-05-25T08:09:45.822Z (about 1 month ago)
- Topics: cli, database, developer-tools, faker, mysql, postgresql, python, seed-data, sqlite, synthetic-data, test-data
- Language: Python
- Homepage: https://pypi.org/project/seedforge/
- Size: 35.2 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# SeedForge
Database seeding tool. Connects to your DB, reads the schema, generates realistic data with valid foreign keys.
```bash
pip install seedforge
seedforge connect postgresql://user:pass@localhost/mydb
seedforge generate --rows 1000
```
## Why
I got tired of writing seed scripts by hand every time I start a new project. You know the drill — empty dashboards, nothing to test against, and if you copy prod data you're asking for GDPR trouble.
SeedForge reads your schema and figures out the rest. It knows that `orders.user_id` points to `users.id`, so it fills `users` first. It knows that a column called `email` should look like an email, not random gibberish.
## Install
```bash
pip install seedforge            # PostgreSQL + SQLite
pip install "seedforge[mysql]"   # + MySQL/MariaDB (quotes keep zsh from globbing the brackets)
pip install "seedforge[ai]"      # + AI providers
pip install "seedforge[all]"     # everything
```
## Usage
```bash
# save connection (writes .seedforge.yaml)
seedforge connect postgresql://user:pass@localhost:5432/mydb
seedforge connect mysql://user:pass@localhost:3306/mydb
seedforge connect sqlite:///path/to/db.sqlite
# see what's in the database
seedforge inspect
# generate and insert
seedforge generate --rows 100
# just preview, don't touch the DB
seedforge generate --rows 10 --dry-run
# export instead of inserting
seedforge generate --rows 1000 --export sql
seedforge generate --rows 1000 --export json
# same data every time
seedforge generate --rows 100 --seed 42
# specific tables only (pulls in FK parents automatically)
seedforge generate --tables orders,payments --rows 50
# wipe tables first
seedforge generate --rows 100 --clean
```
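To illustrate what "pulls in FK parents automatically" could mean for `--tables`, here is a minimal sketch in Python. It is not SeedForge's actual code: the function name `with_fk_parents` and the example schema are made up for illustration. The idea is to walk the foreign key graph upward from the selected tables until no new parents turn up.

```python
from collections import deque


def with_fk_parents(selected, parents_of):
    """Return the selected tables plus every table they depend on via FKs."""
    needed = set(selected)
    queue = deque(selected)
    while queue:
        table = queue.popleft()
        for parent in parents_of.get(table, ()):
            if parent not in needed:
                needed.add(parent)
                queue.append(parent)
    return needed


# Hypothetical schema: orders -> users, payments -> orders,
# reviews -> users and products.
parents_of = {
    "orders": {"users"},
    "payments": {"orders"},
    "reviews": {"users", "products"},
}

print(sorted(with_fk_parents(["orders", "payments"], parents_of)))
# ['orders', 'payments', 'users']
```

Asking for `orders` and `payments` also brings in `users`, while `reviews` and `products` stay out because nothing selected depends on them.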
## AI mode
If you want smarter data (realistic product names, proper bios, etc.), you can plug in an API key. SeedForge auto-detects the provider:
```bash
seedforge ai-generate --api-key sk-ant-... # Anthropic
seedforge ai-generate --api-key sk-... # OpenAI
seedforge ai-generate --api-key AIza... # Google Gemini
seedforge ai-generate --api-key gsk_... # Groq
```
Or set an env var (`ANTHROPIC_API_KEY`, `OPENAI_API_KEY`, `GEMINI_API_KEY`, `GROQ_API_KEY`) and just run `seedforge ai-generate`.
Only schema metadata goes to the API — table and column names, never your actual data.
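Provider auto-detection can be pictured as a prefix lookup. The sketch below is an assumption about how this might work, not SeedForge's implementation: the function `detect_provider`, the provider labels, and the order in which env vars are checked are all hypothetical. The key prefixes and env var names come from the examples above.

```python
import os

# More specific prefixes come first, because "sk-ant-" also starts with "sk-".
KEY_PREFIXES = [
    ("sk-ant-", "anthropic"),
    ("gsk_", "groq"),
    ("AIza", "gemini"),
    ("sk-", "openai"),
]

# Checked in this order when no key is passed (the order is an assumption).
ENV_VARS = {
    "ANTHROPIC_API_KEY": "anthropic",
    "OPENAI_API_KEY": "openai",
    "GEMINI_API_KEY": "gemini",
    "GROQ_API_KEY": "groq",
}


def detect_provider(api_key=None, environ=None):
    """Guess the provider from an explicit key, else from the environment."""
    environ = os.environ if environ is None else environ
    if api_key:
        for prefix, provider in KEY_PREFIXES:
            if api_key.startswith(prefix):
                return provider
        return None  # unknown key format
    for var, provider in ENV_VARS.items():
        if environ.get(var):
            return provider
    return None


print(detect_provider("sk-ant-example"))                      # anthropic
print(detect_provider("sk-example"))                          # openai
print(detect_provider(environ={"GROQ_API_KEY": "gsk_demo"}))  # groq
```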
## How it works
1. Connects to your database and reads `information_schema` (PostgreSQL/MySQL) or `PRAGMA` (SQLite)
2. Builds a dependency graph from foreign keys; a topological sort of that graph gives the insertion order
3. For each column, picks a generator based on the name — `email` gets a realistic email, `price` gets a decimal, `created_at` gets a recent timestamp, and so on (80+ patterns)
4. Foreign key columns get real IDs from already-generated parent rows
5. Batch inserts everything in a single transaction
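The steps above can be sketched with nothing but the Python standard library. This is a simplified illustration against an in-memory SQLite database, not SeedForge's actual code: the two-table schema, the row counts, and the generated values are made up. It reads foreign keys with `PRAGMA foreign_key_list`, orders tables with `graphlib.TopologicalSorter`, and inserts inside one transaction.

```python
import random
import sqlite3
from graphlib import TopologicalSorter

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        user_id INTEGER REFERENCES users(id)
    );
""")

# Step 1: list the tables in the database.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]

# Step 2: map each table to the parent tables it references.
# In PRAGMA foreign_key_list output, column index 2 is the parent table.
graph = {}
for table in tables:
    fk_rows = conn.execute(f'PRAGMA foreign_key_list("{table}")').fetchall()
    graph[table] = {fk[2] for fk in fk_rows}

# static_order() yields parents before the tables that depend on them.
insert_order = list(TopologicalSorter(graph).static_order())
print(insert_order)  # ['users', 'orders']

# Steps 4 and 5: child rows pick IDs from parent rows that already exist,
# and everything runs in one transaction (commit on success, rollback on error).
rng = random.Random(42)
with conn:
    conn.executemany(
        "INSERT INTO users (id, email) VALUES (?, ?)",
        [(i, f"user{i}@example.com") for i in range(1, 4)],
    )
    user_ids = [row[0] for row in conn.execute("SELECT id FROM users")]
    conn.executemany(
        "INSERT INTO orders (id, user_id) VALUES (?, ?)",
        [(i, rng.choice(user_ids)) for i in range(1, 6)],
    )

orphans = conn.execute(
    "SELECT COUNT(*) FROM orders WHERE user_id NOT IN (SELECT id FROM users)"
).fetchone()[0]
print(orphans)  # 0
```

Note that `TopologicalSorter` raises `CycleError` on circular references (including a table that references itself), so a real tool needs some strategy for those; this sketch does not cover that case.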
## What it recognizes
Some examples — there are 80+ patterns total:
| Column | Data |
|---|---|
| `email` | `john.smith@example.com` |
| `phone` | `+1-555-0123` |
| `first_name` / `last_name` | `John` / `Smith` |
| `price`, `amount` | `49.99` |
| `role` | `admin`, `user`, `moderator` |
| `status` | `active`, `pending`, `completed` |
| `created_at` | recent datetime |
| `is_active` | `true` more often than `false` |
| `password` | SHA-256 hash |
| `avatar_url` | `https://picsum.photos/...` |
| `uuid` | valid v4 UUID |
It's also context-aware: a `name` column gives person names in a `users` table, company names in `organizations`, and product names in `products`.
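Name-based matching with table context might look roughly like the sketch below. Everything in it is hypothetical (the word lists, the four rules, the 80% bias, the function names); the real tool has 80+ patterns and its own generators. It only shows the shape of the idea: an ordered list of patterns, where the first match picks the generator, and a seeded RNG so reruns give the same values.

```python
import random
import re

# Tiny hypothetical word lists; a real tool would draw from a much larger pool.
FIRST_NAMES = ["John", "Maria", "Wei"]
LAST_NAMES = ["Smith", "Garcia", "Chen"]
COMPANIES = ["Acme Corp", "Globex", "Initech"]
PRODUCTS = ["Desk Lamp", "Travel Mug", "Backpack"]


def gen_email(rng, table):
    return f"{rng.choice(FIRST_NAMES)}.{rng.choice(LAST_NAMES)}@example.com".lower()


def gen_price(rng, table):
    return round(rng.uniform(1, 500), 2)


def gen_flag(rng, table):
    # Biased boolean: true roughly 80% of the time (the ratio is an assumption).
    return rng.random() < 0.8


def gen_name(rng, table):
    # Context-aware: the table name decides what kind of name to produce.
    if table == "organizations":
        return rng.choice(COMPANIES)
    if table == "products":
        return rng.choice(PRODUCTS)
    # Any other table (such as users) gets a person name in this sketch.
    return f"{rng.choice(FIRST_NAMES)} {rng.choice(LAST_NAMES)}"


# (pattern, generator) pairs checked in order; the first match wins.
RULES = [
    (re.compile(r"email"), gen_email),
    (re.compile(r"^(price|amount)$"), gen_price),
    (re.compile(r"^is_"), gen_flag),
    (re.compile(r"^name$"), gen_name),
]


def generate_value(table, column, rng):
    for pattern, generator in RULES:
        if pattern.search(column):
            return generator(rng, table)
    return None  # no rule matched; falling back on the column type is left out


rng = random.Random(42)  # fixed seed, so reruns produce the same values
print(generate_value("users", "email", rng))
print(generate_value("organizations", "name", rng))
print(generate_value("products", "price", rng))
```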
## Config
`seedforge connect` creates a `.seedforge.yaml`:
```yaml
db_url: postgresql://user:pass@localhost:5432/mydb
default_rows: 100
default_schema: public
seed: 42
exclude_tables:
- _prisma_migrations
- django_migrations
```
Migration tables (`_prisma_migrations`, `django_migrations`, `alembic_version`, etc.) are excluded automatically.
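One way to picture how the built-in exclusions and `exclude_tables` could combine is a simple set union. This is a guess at the behavior, not taken from the SeedForge source: `DEFAULT_EXCLUDED` lists only the three names mentioned above, `tables_to_seed` and `audit_log` are hypothetical, and the `config` dict stands in for the parsed `.seedforge.yaml`.

```python
# Only the three names mentioned above; the tool's real list is longer.
DEFAULT_EXCLUDED = {"_prisma_migrations", "django_migrations", "alembic_version"}


def tables_to_seed(all_tables, config):
    """Drop built-in and user-configured exclusions, keeping the input order."""
    excluded = DEFAULT_EXCLUDED | set(config.get("exclude_tables") or [])
    return [t for t in all_tables if t not in excluded]


# Stand-in for the parsed YAML config; "audit_log" is a made-up extra table.
config = {"default_rows": 100, "seed": 42, "exclude_tables": ["audit_log"]}

print(tables_to_seed(["users", "alembic_version", "audit_log", "orders"], config))
# ['users', 'orders']
```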
## Databases
- PostgreSQL
- MySQL / MariaDB
- SQLite
## Contributing
PRs welcome. See [CONTRIBUTING.md](CONTRIBUTING.md).
## License
MIT