{"id":51091220,"url":"https://github.com/dj258255/minidb","last_synced_at":"2026-06-24T02:02:58.775Z","repository":{"id":366815080,"uuid":"1277973042","full_name":"dj258255/minidb","owner":"dj258255","description":"A relational database built from scratch in C to dissect how PostgreSQL/MySQL work — pages, buffer pool, heap, B+Tree, SQL parser/executor, WAL, transactions.","archived":false,"fork":false,"pushed_at":"2026-06-23T12:43:44.000Z","size":172,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"master","last_synced_at":"2026-06-23T13:28:14.633Z","etag":null,"topics":["b-tree","c","database","database-internals","dbms","from-scratch","learning-project","sql","storage-engine","wal"],"latest_commit_sha":null,"homepage":null,"language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/dj258255.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-06-23T10:53:58.000Z","updated_at":"2026-06-23T12:44:03.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/dj258255/minidb","commit_stats":null,"previous_names":["dj258255/minidb"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/dj258255/minidb","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dj258255%2Fminidb","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dj258255%2Fminidb/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dj258255%2Fminidb/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dj258255%2Fminidb/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/dj258255","download_url":"https://codeload.github.com/dj258255/minidb/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dj258255%2Fminidb/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34713791,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-24T02:00:07.484Z","response_time":106,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["b-tree","c","database","database-internals","dbms","from-scratch","learning-project","sql","storage-engine","wal"],"created_at":"2026-06-24T02:02:56.721Z","updated_at":"2026-06-24T02:02:58.752Z","avatar_url":"https://github.com/dj258255.png","language":"C","funding_links":[],"categories":[],"sub_categories":[],"readme":"# minidb\n\nA small relational database written from scratch in C, built to dissect how\nPostgreSQL and MySQL actually work inside. It goes from raw fixed-size pages all\nthe way up to running SQL: page storage, a buffer pool, a heap, a B+Tree index,\na hand-written SQL parser and executor, a write-ahead log, and transactions.\n\nThis is a learning project. The goal isn't to invent something new; it's to\nreproduce the real structure accurately and understand it. Every layer is\ncovered by tests (182 checks across 13 suites).\n\n![minidb REPL demo](docs/demo.svg)\n\n## Quick start\n\n```sh\nmake test            # build and run the test suite\nmake repl            # build the REPL\n./build/minidb my.db # open (or create) a database and type SQL\n```\n\nA session:\n\n```\nminidb\u003e CREATE TABLE users (id INT, name TEXT);\n테이블 'users' 생성됨 (컬럼 2개)\n  (인덱스: id 컬럼)\nminidb\u003e INSERT INTO users VALUES (1, 'kim');\nminidb\u003e INSERT INTO users VALUES (2, 'lee');\nminidb\u003e SELECT * FROM users WHERE id = 2;\nid | name\n2 | lee\n(1행, 인덱스 사용)\nminidb\u003e BEGIN;\nminidb\u003e DELETE FROM users WHERE id = 1;\nminidb\u003e ROLLBACK;\nminidb\u003e SELECT * FROM users;\nid | name\n1 | kim\n2 | lee\n(2행)\n```\n\nEach table is stored as its own pair of files and survives a restart (the schema\nis persisted too, so no need to re-run `CREATE TABLE`) -- see the storage layout\nbelow.\n\n## What's inside\n\nBuilt bottom-up; each layer sits on the one below it.\n\n| Layer | What it does | Mirrors |\n|---|---|---|\n| `pager.c` | fixed-size 4KB pages \u003c-\u003e a single file (`page_id * PAGE_SIZE`) | SQLite pager, PG smgr |\n| `page.c` | slotted page: pack variable-length rows into a page | PG/InnoDB page layout |\n| `bufpool.c` | page cache with pin counts, dirty flags, LRU eviction | InnoDB buffer pool |\n| `heap.c` | table = a collection of pages; rows addressed by `RID = (page, slot)` | PG heap |\n| `sql.c` | hand-written lexer + recursive-descent parser (SQL -\u003e AST) | every DB frontend |\n| `db.c` | executor: tuple codec, multi-table catalog, joins (NLJ/index/hash), aggregates | pg_catalog, executor |\n| `btree.c` | on-disk B+Tree index for O(log n) lookups, with node splits | InnoDB clustered index |\n| `wal.c` | write-ahead log: durability and atomicity, with crash recovery | PG WAL / redo log |\n\n### Storage layout\n\nLike PostgreSQL (each relation is its own file, `relfilenode`), every table lives\nin its own files, and a catalog file lists which tables exist:\n\n```\nmydb              catalog -- table names + schemas (like pg_class)\nmydb.users.tbl    users rows  (heap)\nmydb.users.idx    users PK index (B+Tree)\nmydb.orders.tbl   orders rows\nmydb.orders.idx   orders PK index\n```\n\n## SQL supported\n\n```\nCREATE TABLE \u003ct\u003e (\u003ccol\u003e INT|TEXT, ...)\nINSERT INTO \u003ct\u003e VALUES (\u003cint|'text'\u003e, ...)\nSELECT \u003c* | item, ...\u003e FROM \u003ct\u003e [\u003calias\u003e] [JOIN \u003ct2\u003e [\u003calias\u003e] ON \u003ccolref\u003e = \u003ccolref\u003e]...\n                  [WHERE \u003ccond\u003e [AND \u003ccond\u003e] [OR ...]]\n                  [GROUP BY \u003ccol\u003e] [ORDER BY \u003ccolref\u003e [ASC|DESC]] [LIMIT \u003cn\u003e]\nUPDATE \u003ct\u003e SET \u003ccol\u003e = \u003cvalue\u003e [WHERE ...]\nDELETE FROM \u003ct\u003e [WHERE ...]\nBEGIN | COMMIT | ROLLBACK\n\n\u003citem\u003e   is  \u003ccol\u003e | COUNT(*) | COUNT|SUM|MIN|MAX|AVG(\u003ccol\u003e)\n\u003ccond\u003e   is  \u003ccolref\u003e \u003cop\u003e \u003cvalue\u003e,  \u003cop\u003e is one of  =  !=  \u003c  \u003e  \u003c=  \u003e=\n\u003ccolref\u003e is  [\u003ctable\u003e.]\u003ccol\u003e\n```\n\nAn `=`, `\u003c`, `\u003e`, `\u003c=`, or `\u003e=` on the first column (an `INT` primary key) uses\nthe B+Tree index -- `=` is an O(log n) point lookup, the others walk the linked\nleaf chain as a range scan. `!=`, conditions on other columns, and compound\n`AND` conditions fall back to a full scan -- the kind of choice a query planner\nmakes. `ORDER BY`/`LIMIT` and `GROUP BY`/aggregates take a materialize path\n(collect, then sort / sort-group). `JOIN` is a recursive N-way join that picks a\nmethod per level like an optimizer: index nested-loop (`btree_search`) when the\ninner's primary key is the `ON` key, hash join (build a hash on the inner's join\ncolumn, then O(1) probe) for any other equi-join, else a plain nested-loop scan.\nTransactions use a no-steal + force-at-commit policy across every table and roll\nback both the heap and the index.\n\nSee `DESIGN.md` for the full layer map and build order.\n\n## Scope (honest limitations)\n\nKept simple on purpose: the first column of each table is treated as a unique\ninteger primary key; `WHERE` is in disjunctive normal form (AND-groups joined by\nOR, no parentheses); joins are INNER only, each `ON` is a single `=`, chained up\nto 4 tables (aliases supported, so self-joins work); projection/aggregation and\n`GROUP BY` are single-table and don't combine with `ORDER BY`; and there is no\nisolation/concurrency (one transaction at a time). B+Tree deletion isn't\nimplemented (deleted rows are tombstoned in the heap, so a stale index entry is\nharmless). These are noted in the code where they matter.\n\n## License\n\nMIT\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdj258255%2Fminidb","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdj258255%2Fminidb","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdj258255%2Fminidb/lists"}