{"id":47328403,"url":"https://github.com/timsehn/doltlite","last_synced_at":"2026-04-12T01:14:48.431Z","repository":{"id":344889951,"uuid":"1181240525","full_name":"timsehn/doltlite","owner":"timsehn","description":"A fork of SQLite that has Dolt storage and features","archived":false,"fork":false,"pushed_at":"2026-03-30T14:14:16.000Z","size":279862,"stargazers_count":39,"open_issues_count":11,"forks_count":3,"subscribers_count":1,"default_branch":"master","last_synced_at":"2026-04-03T01:12:35.449Z","etag":null,"topics":["database-version-control","dolt","sqlite","version-controlled-database"],"latest_commit_sha":null,"homepage":"","language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/timsehn.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-03-13T22:43:20.000Z","updated_at":"2026-04-02T00:35:31.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/timsehn/doltlite","commit_stats":null,"previous_names":["timsehn/doltlite"],"tags_count":13,"template":false,"template_full_name":null,"purl":"pkg:github/timsehn/doltlite","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/timsehn%2Fdoltlite","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/timsehn%2Fdoltlite/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/timsehn%2Fdoltlite/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositor
ies/timsehn%2Fdoltlite/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/timsehn","download_url":"https://codeload.github.com/timsehn/doltlite/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/timsehn%2Fdoltlite/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31580507,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-08T14:31:17.711Z","status":"ssl_error","status_checked_at":"2026-04-08T14:31:17.202Z","response_time":54,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["database-version-control","dolt","sqlite","version-controlled-database"],"created_at":"2026-03-17T19:47:34.710Z","updated_at":"2026-04-09T01:01:42.427Z","avatar_url":"https://github.com/timsehn.png","language":"C","readme":"\u003cp align=\"center\"\u003e\n  \u003cimg src=\"doltlite_logo.webp\" alt=\"Doltlite\" width=\"600\"\u003e\n\u003c/p\u003e\n\n# Doltlite\n\nA SQLite fork that replaces the B-tree storage engine with a content-addressed\nprolly tree, enabling Git-like version control on a SQL database. 
Everything\nabove SQLite's `btree.h` interface (VDBE, query planner, parser) is untouched.\nEverything below it -- the pager and on-disk format -- is replaced with a\nprolly tree engine backed by a single-file content-addressed chunk store.\n\n## Building\n\n### macOS / Linux\n\n```\ncd build\n../configure\nmake\n./doltlite :memory:\n```\n\n### Windows (MSYS2 / MINGW64)\n\n```\npacman -S mingw-w64-x86_64-gcc mingw-w64-x86_64-zlib make tcl\nmkdir -p build \u0026\u0026 cd build\n../configure\nmake doltlite.exe\n./doltlite.exe :memory:\n```\n\nTo verify the engine:\n\n```sql\nSELECT doltlite_engine();\n-- prolly\n```\n\nTo build stock SQLite instead (for comparison):\n\n```\nmake DOLTLITE_PROLLY=0 sqlite3\n```\n\n## Using as a C Library\n\nDoltlite is designed as a drop-in replacement for SQLite. It uses the same\n`sqlite3.h` header and `sqlite3_*` API, so existing C programs work without\ncode changes — just link against `libdoltlite` instead of `libsqlite3` to get\nversion control. The build produces `libdoltlite.a` (static) and\n`libdoltlite.dylib`/`.so` (shared) with the full prolly tree engine and all\nDolt functions included.\n\n```bash\ncd build\n../configure\nmake doltlite-lib   # builds libdoltlite.a and libdoltlite.dylib/.so\n```\n\nCompile and link your program:\n\n```bash\n# Static link (recommended — single binary, no runtime deps)\ngcc -o myapp myapp.c -I/path/to/build libdoltlite.a -lpthread -lz\n\n# Dynamic link\ngcc -o myapp myapp.c -I/path/to/build -L/path/to/build -ldoltlite -lpthread -lz\n```\n\nThe API is the standard [SQLite C API](https://sqlite.org/cintro.html) —\n`sqlite3_open`, `sqlite3_exec`, `sqlite3_prepare_v2`, etc. Dolt features are\ncalled as SQL functions (`dolt_commit`, `dolt_branch`, `dolt_merge`, ...) 
and\nvirtual tables (`dolt_log`, `dolt_diff_\u003ctable\u003e`, `dolt_history_\u003ctable\u003e`, ...).\n\n### Quickstart Examples\n\nComplete working examples that demonstrate commits, branches, merges,\npoint-in-time queries, diffs, and tags. Each example does the same thing\nin a different language.\n\n**C** ([`examples/quickstart.c`](examples/quickstart.c)) — based on the\n[SQLite quickstart](https://sqlite.org/quickstart.html):\n\n```bash\ncd build\ngcc -o quickstart ../examples/quickstart.c -I. libdoltlite.a -lpthread -lz\n./quickstart\n```\n\n**Python** ([`examples/quickstart.py`](examples/quickstart.py)) — uses the\nstandard `sqlite3` module, zero code changes:\n\n```bash\ncd build\nLD_PRELOAD=./libdoltlite.so python3 ../examples/quickstart.py\n```\n\n**Go** ([`examples/go/main.go`](examples/go/main.go)) — uses\n[mattn/go-sqlite3](https://github.com/mattn/go-sqlite3) with the `libsqlite3`\nbuild tag:\n\n```bash\ncd examples/go\nCGO_CFLAGS=\"-I../../build\" CGO_LDFLAGS=\"../../build/libdoltlite.a -lz -lpthread\" \\\n    go build -tags libsqlite3 -o quickstart .\n./quickstart\n```\n\n## Dolt Features\n\nVersion control operations are exposed as SQL functions and virtual tables.\n\n### Staging and Committing\n\n```sql\n-- Stage specific tables or all changes\nSELECT dolt_add('users');\nSELECT dolt_add('-A');\n\n-- Commit staged changes\nSELECT dolt_commit('-m', 'Add users table');\n\n-- Stage and commit in one step\nSELECT dolt_commit('-A', '-m', 'Initial commit');\n\n-- Shorthand (compound flags, like git commit -am)\nSELECT dolt_commit('-am', 'Initial commit');\n\n-- Commit with author\nSELECT dolt_commit('-m', 'Fix data', '--author', 'Alice \u003calice@example.com\u003e');\n```\n\n### Configuration\n\n```sql\n-- Set committer name and email (per-session)\nSELECT dolt_config('user.name', 'Tim Sehn');\nSELECT dolt_config('user.email', 'tim@dolthub.com');\n\n-- Read current config\nSELECT dolt_config('user.name');\n-- Tim Sehn\n```\n\nAll commit-creating 
operations (`dolt_commit`, `dolt_merge`, `dolt_cherry_pick`,\n`dolt_revert`) use these values. The `--author` flag on `dolt_commit` overrides\nthe session config for a single commit. Config is per-connection and not\npersisted — set it at the start of each session.\n\n### Status and History\n\n```sql\n-- Working/staged changes\nSELECT * FROM dolt_status;\n-- table_name | staged | status\n-- users      | 1      | modified\n-- orders     | 0      | new table\n\n-- Commit history\nSELECT * FROM dolt_log;\n-- commit_hash | committer | email | date | message\n```\n\n### History (dolt_history_\u0026lt;table\u0026gt;)\n\nTime-travel query showing every version of every row across all commits:\n\n```sql\nSELECT * FROM dolt_history_users;\n-- rowid_val | value | commit_hash | committer | commit_date\n\n-- How many times was row 42 changed?\nSELECT count(*) FROM dolt_history_users WHERE rowid_val = 42;\n\n-- What did the table look like at a specific commit?\nSELECT * FROM dolt_history_users WHERE commit_hash = 'abc123...';\n```\n\n### Point-in-Time Queries (AS OF)\n\nRead a table as it existed at any commit, branch, or tag.\nReturns the real table columns (not generic blobs):\n\n```sql\nSELECT * FROM dolt_at_users('abc123...');\n-- id | name | email (same columns as the actual table)\n\nSELECT * FROM dolt_at_users('feature');\nSELECT * FROM dolt_at_users('v1.0');\n\n-- Compare current vs historical\nSELECT count(*) FROM users;                     -- 100\nSELECT count(*) FROM dolt_at_users('v1.0');    -- 42\n```\n\n### Diff\n\nRow-level diff between any two commits, or working state vs HEAD:\n\n```sql\nSELECT * FROM dolt_diff('users');\nSELECT * FROM dolt_diff('users', 'abc123...', 'def456...');\n-- diff_type | rowid_val | from_value | to_value\n```\n\n### Schema Diff\n\nCompare schemas between any two commits, branches, or tags:\n\n```sql\nSELECT * FROM dolt_schema_diff('v1.0', 'v2.0');\n-- table_name | from_create_stmt | to_create_stmt | diff_type\n\n-- Shows tables added, 
dropped, or modified (schema changed)\n-- Also detects new indexes and views\n```\n\n### Audit Log (dolt_diff_\u0026lt;table\u0026gt;)\n\nFull history of every change to every row, across all commits:\n\n```sql\nSELECT * FROM dolt_diff_users;\n-- diff_type | rowid_val | from_value | to_value |\n--   from_commit | to_commit | from_commit_date | to_commit_date\n\n-- Every INSERT, UPDATE, DELETE that was ever committed is here\nSELECT diff_type, rowid_val, to_commit FROM dolt_diff_users\n  WHERE rowid_val = 42;\n```\n\nOne `dolt_diff_\u003ctable\u003e` virtual table is automatically created for each\nuser table. The table walks the full commit history and diffs each\nconsecutive pair of commits.\n\n### Reset\n\n```sql\nSELECT dolt_reset('--soft');   -- unstage all, keep working changes\nSELECT dolt_reset('--hard');   -- discard all uncommitted changes\n```\n\n### Branching (Per-Session)\n\nEach connection tracks its own active branch. Branch state (active branch\nname, HEAD commit, staged catalog hash) lives in the `Btree` struct\n(per-connection). Each connection gets its own `BtShared` and chunk store.\n\n```sql\n-- Create a branch at current HEAD\nSELECT dolt_branch('feature');\n\n-- Switch to it (fails if uncommitted changes exist)\nSELECT dolt_checkout('feature');\n\n-- See current branch\nSELECT active_branch();\n\n-- List all branches\nSELECT * FROM dolt_branches;\n-- name | hash | is_current\n\n-- Delete a branch\nSELECT dolt_branch('-d', 'feature');\n```\n\n### Tags\n\nImmutable named pointers to commits:\n\n```sql\nSELECT dolt_tag('v1.0');                  -- tag HEAD\nSELECT dolt_tag('v1.0', 'abc123...');     -- tag specific commit\nSELECT dolt_tag('-d', 'v1.0');            -- delete tag\nSELECT * FROM dolt_tags;                  -- list tags\n```\n\n### Merge\n\nThree-way merge of another branch into the current branch. Merges at the\n**row level** — non-conflicting changes to different rows of the same table\nare auto-merged. 
Conflicts (same row modified on both branches) are detected\nand stored for resolution.\n\n```sql\nSELECT dolt_merge('feature');\n-- Returns commit hash (clean merge), or \"Merge completed with N conflict(s)\"\n```\n\n### Conflicts\n\nView and resolve merge conflicts:\n\n```sql\n-- View which tables have conflicts (summary)\nSELECT * FROM dolt_conflicts;\n-- table_name | num_conflicts\n-- users      | 2\n\n-- View individual conflict rows for a table\nSELECT * FROM dolt_conflicts_users;\n-- base_rowid | base_value | our_rowid | our_value | their_rowid | their_value\n\n-- Resolve individual conflicts by deleting them (keeps current working value)\nDELETE FROM dolt_conflicts_users WHERE base_rowid = 5;\n\n-- Or resolve all conflicts for a table at once\nSELECT dolt_conflicts_resolve('--ours', 'users');   -- keep our values\nSELECT dolt_conflicts_resolve('--theirs', 'users'); -- take their values\n\n-- Commit is blocked while conflicts exist\nSELECT dolt_commit('-A', '-m', 'msg');\n-- Error: \"cannot commit: unresolved merge conflicts\"\n```\n\n### Cherry-Pick\n\nApply the changes from a specific commit onto the current branch:\n\n```sql\nSELECT dolt_cherry_pick('abc123...');\n-- Returns new commit hash, or \"Cherry-pick completed with N conflict(s)\"\n```\n\nCherry-pick works by computing the diff between the target commit and its\nparent, then applying that diff to the current HEAD as a three-way merge.\nConflicts are handled the same way as `dolt_merge`.\n\n### Revert\n\nCreate a new commit that undoes the changes from a specific commit:\n\n```sql\nSELECT dolt_revert('abc123...');\n-- Returns new commit hash, or \"Revert completed with N conflict(s)\"\n```\n\nRevert computes the inverse of the target commit's changes and applies\nthem to the current HEAD. The new commit message is\n`Revert '\u003coriginal message\u003e'`. 
Cannot revert the initial commit.\n\n### Garbage Collection\n\nRemove unreachable chunks from the store to reclaim space:\n\n```sql\nSELECT dolt_gc();\n-- \"12 chunks removed, 45 chunks kept\"\n```\n\nStop-the-world mark-and-sweep: walks all branches, tags, commit\nhistory, catalogs, and prolly tree nodes to find reachable chunks,\nthen rewrites the file with only live data. Safe and idempotent.\n\n### Merge Base\n\nFind the common ancestor of two commits:\n\n```sql\nSELECT dolt_merge_base('abc123...', 'def456...');\n```\n\n### Remotes\n\nDoltlite supports Git-like remotes for pushing, fetching, pulling, and cloning\nbetween databases.\n\n#### Filesystem Remotes\n\n```sql\n-- Add a remote\nSELECT dolt_remote('add', 'origin', 'file:///path/to/remote.doltlite');\n\n-- Push a branch\nSELECT dolt_push('origin', 'main');\n\n-- Clone a remote database\nSELECT dolt_clone('file:///path/to/source.doltlite');\n\n-- Fetch updates\nSELECT dolt_fetch('origin', 'main');\n\n-- Pull (fetch + fast-forward)\nSELECT dolt_pull('origin', 'main');\n\n-- List remotes\nSELECT * FROM dolt_remotes;\n```\n\n#### HTTP Remotes\n\n```sql\n-- Add an HTTP remote (URL includes database name)\nSELECT dolt_remote('add', 'origin', 'http://myserver:8080/mydb.db');\n\n-- All operations work identically to file:// remotes\nSELECT dolt_push('origin', 'main');\nSELECT dolt_clone('http://myserver:8080/mydb.db');\nSELECT dolt_fetch('origin', 'main');\nSELECT dolt_pull('origin', 'main');\n```\n\n#### Remote Server (`doltlite-remotesrv`)\n\nDoltlite includes a standalone HTTP server for serving databases over the\nnetwork. Build it alongside doltlite:\n\n```\ncd build\nmake doltlite-remotesrv\n```\n\nStart serving a directory of databases:\n\n```\n./doltlite-remotesrv -p 8080 /path/to/databases/\n```\n\nEvery `.db` file in that directory becomes accessible at\n`http://host:8080/filename.db`. 
The server supports push, fetch, pull, and\nclone — multiple clients can collaborate on the same databases.\n\nThe server is also embeddable as a library (`doltliteServeAsync` in\n`doltlite_remotesrv.h`) for applications that want to host remotes in-process.\n\n#### How It Works\n\nContent-addressed chunk transfer — only sends chunks the remote doesn't already\nhave. BFS traversal of the DAG with batch `HasMany` pruning.\n\n## Using Existing SQLite Databases\n\nDoltlite can ATTACH standard SQLite databases alongside its own prolly-tree\nstorage. This lets you keep versioned tables in doltlite and high-write\noperational tables in standard SQLite, queried through a single connection.\n\nDoltlite detects the file format automatically from the header — no\nconfiguration needed. Standard SQLite files route to SQLite's original B-tree\nengine; everything else uses the prolly tree.\n\n### Basic ATTACH\n\n```sql\n-- Attach a standard SQLite database\nATTACH DATABASE '/path/to/events.sqlite' AS ops;\n\n-- Query it (prefix table names with the alias)\nSELECT * FROM ops.events WHERE type='click';\n\n-- Main db tables need no prefix\nSELECT * FROM threads;\n\n-- Detach when done\nDETACH DATABASE ops;\n```\n\n### Cross-Database JOINs\n\n```sql\n-- Join doltlite (versioned) tables with SQLite (attached) tables\nSELECT t.title, e.type\nFROM threads t\nJOIN ops.events e ON t.id = e.thread_id;\n```\n\n### Migrating Data Between Formats\n\n```sql\n-- Copy from SQLite into doltlite (now versioned)\nINSERT INTO threads SELECT * FROM ops.threads;\n\n-- Copy from doltlite into SQLite (for export)\nINSERT INTO ops.archive SELECT * FROM threads WHERE archived=1;\n\n-- One-step copy with CREATE TABLE...AS\nCREATE TABLE local_events AS SELECT * FROM ops.events;\n```\n\n### Hybrid Storage Pattern\n\nUse doltlite for tables that benefit from version control, and standard SQLite\nfor high-throughput tables that don't need history:\n\n```sql\n-- Main DB: doltlite (versioned)\nCREATE TABLE 
config(key TEXT PRIMARY KEY, val TEXT);\nSELECT dolt_commit('-am', 'Add config table');\n\n-- Attached: standard SQLite (high-write, no versioning overhead)\nATTACH DATABASE 'telemetry.sqlite' AS tel;\nCREATE TABLE tel.events(seq INTEGER PRIMARY KEY, kind TEXT, payload TEXT);\n\n-- Hot write path goes to standard SQLite\nINSERT INTO tel.events VALUES(1, 'pageview', '{\"url\":\"/home\"}');\n\n-- Analytics spans both databases\nSELECT c.val, count(e.seq)\nFROM config c\nJOIN tel.events e ON e.kind = c.key\nGROUP BY c.key;\n\n-- Version control only applies to main db\nSELECT * FROM dolt_diff('config');\n```\n\n## Per-Session Branching Architecture\n\nEach connection gets its own `Btree` and `BtShared` (not shared across\nconnections). Doltlite stores the session's branch name, HEAD commit hash,\nand staged catalog hash in the `Btree` struct.\n\n- Each connection can be on a different branch. Cross-branch concurrent\n  access is safe — each branch's working catalog is stored independently\n  in a per-branch working state chunk, so one branch's autocommit never\n  corrupts another branch's reads.\n- `dolt_checkout` reloads the table registry from the target branch's catalog.\n- Write transactions (DML) are serialized via an exclusive file-level lock,\n  matching SQLite's standard behavior. Under that lock, the connection\n  refreshes from disk before writing. 
Multiple connections can read\n  concurrently; writes from one connection are immediately visible to\n  readers on the same branch.\n- All commit graph mutations (`dolt_commit`, `dolt_merge`, `dolt_reset`,\n  `dolt_branch`, `dolt_tag`, push, pull) are also serialized via the\n  file-level lock, preventing silent data loss from concurrent commits.\n\n## Performance\n\n### Sysbench OLTP Benchmarks: Doltlite vs SQLite\n\nDoltlite is a drop-in replacement for SQLite, so the natural question is: what\ndoes version control cost?\n\nEvery PR runs a [sysbench-style benchmark](test/sysbench_compare.sh) comparing\ndoltlite against stock SQLite on 23 OLTP workloads. Results are posted as a PR\ncomment.\n\n#### Reads\n\n| Test | SQLite (ms) | Doltlite (ms) | Multiplier |\n|------|-------------|---------------|------------|\n| oltp_point_select | 145 | 89 | 0.61 |\n| oltp_range_select | 38 | 36 | 0.95 |\n| oltp_sum_range | 21 | 18 | 0.86 |\n| oltp_order_range | 8 | 8 | 1.00 |\n| oltp_distinct_range | 9 | 7 | 0.78 |\n| oltp_index_scan | 15 | 11 | 0.73 |\n| select_random_points | 39 | 48 | 1.23 |\n| select_random_ranges | 17 | 10 | 0.59 |\n| covering_index_scan | 20 | 28 | 1.40 |\n| groupby_scan | 54 | 54 | 1.00 |\n| index_join | 9 | 11 | 1.22 |\n| index_join_scan | 3 | 6 | 2.00 |\n| types_table_scan | 13 | 13 | 1.00 |\n| table_scan | 1 | 2 | 2.00 |\n| oltp_read_only | 340 | 259 | 0.76 |\n\n#### Writes\n\n| Test | SQLite (ms) | Doltlite (ms) | Multiplier |\n|------|-------------|---------------|------------|\n| oltp_bulk_insert | 32 | 39 | 1.22 |\n| oltp_insert | 21 | 35 | 1.67 |\n| oltp_update_index | 48 | 128 | 2.67 |\n| oltp_update_non_index | 37 | 52 | 1.41 |\n| oltp_delete_insert | 44 | 76 | 1.73 |\n| oltp_write_only | 21 | 35 | 1.67 |\n| types_delete_insert | 28 | 32 | 1.14 |\n| oltp_read_write | 128 | 488 | 3.81 |\n\n_10K rows, file-backed, Linux x64 (GitHub Actions). 
Run `test/sysbench_compare.sh` to reproduce._\n\n### Algorithmic Complexity\n\nAll numbers below have automated assertions in CI (`test/doltlite_perf.sh` and `test/doltlite_structural.sh`).\n\n- **O(log n) Point Operations** -- SELECT, UPDATE, and DELETE by primary key are O(log n), which is close to constant time in practice from 1K to 1M rows. Tested and asserted at 1K, 100K, and 1M rows.\n- **O(n log n) Bulk Insert** -- Bulk INSERT inside BEGIN/COMMIT scales as O(n log n). 1M rows insert in ~2 seconds. CTE-based inserts scale comparably (5M rows in 11s).\n- **O(changes) Diff** -- `dolt_diff` between two commits is proportional to the number of changed rows, not the table size. A single-row diff on a 1M-row table takes the same time as on a 1K-row table (~30ms).\n- **Structural Sharing** -- The prolly tree provides structural sharing between versions. Changing 1 row in a 10K-row table adds only 1.9% to the file size (5.2KB on 273KB). Branch creation with 1 new row adds ~10% overhead.\n- **Garbage Collection** -- `dolt_gc()` reclaims orphaned chunks. Deleting a branch with 1000 unique rows and running GC reclaims 53% of file size. 
GC is idempotent and preserves all reachable data.\n\n## Running Tests\n\n### SQLite Tcl Test Suite\n\n87,000+ SQLite test cases pass with 0 correctness failures.\n\n```bash\n# Install Tcl (macOS)\nbrew install tcl-tk\n\n# Configure with Tcl support\ncd build\n../configure --with-tcl=$(brew --prefix tcl-tk)/lib\n\n# Build testfixture\nmake testfixture OPTS=\"-L$(brew --prefix)/lib\"\n\n# Run a single test file\n./testfixture ../test/select1.test\n\n# Run with timeout\nperl -e 'alarm(60); exec @ARGV' ./testfixture ../test/select1.test\n\n# Count passes\n./testfixture ../test/func.test 2\u003e\u00261 | grep -c \"Ok$\"\n```\n\nStock SQLite testfixture for comparison:\n\n```\nmake testfixture DOLTLITE_PROLLY=0 USE_AMALGAMATION=1\n```\n\n### Doltlite Shell Tests\n\n31 test suites covering all features:\n\n```bash\n# Run all suites\ncd build\nbash ../test/run_doltlite_tests.sh\n\n# Run individual suites\nbash ../test/doltlite_parity.sh          # SQLite compatibility (110 tests)\nbash ../test/doltlite_commit.sh          # Commits and log\nbash ../test/doltlite_staging.sh         # Add, status, staging\nbash ../test/doltlite_branch.sh          # Branching and checkout\nbash ../test/doltlite_merge.sh           # Three-way merge\nbash ../test/doltlite_attach_sqlite.sh   # ATTACH standard SQLite databases\n```\n\n### SQL Logic Test Suite\n\nDoltlite passes 100% of the\n[sqllogictest](https://www.sqlite.org/sqllogictest/) suite — the same\n5.7 million-statement correctness corpus that SQLite itself uses. Every PR\nruns the full suite in CI, comparing Doltlite's results against stock SQLite\nas a reference. Zero failures, zero errors.\n\nThe test works by building the official\n[sqllogictest C runner](https://www.sqlite.org/sqllogictest/) twice — once\nlinked against stock SQLite, once against Doltlite's amalgamation — and\nrunning every `.test` file through both in `--verify` mode. 
Any result\ndivergence from stock SQLite is a failure.\n\n```bash\n# Build both runners and run the full suite (requires Fossil)\nfossil clone https://www.sqlite.org/sqllogictest/ /tmp/sqllogictest.fossil\nmkdir -p /tmp/sqllogictest \u0026\u0026 cd /tmp/sqllogictest \u0026\u0026 fossil open /tmp/sqllogictest.fossil\n\n# Build stock runner (reference)\ncd src\ngcc -O2 -DSQLITE_NO_SYNC=1 -DSQLITE_THREADSAFE=0 \\\n    -DSQLITE_OMIT_LOAD_EXTENSION -c md5.c sqlite3.c\ngcc -O2 -o sqllogictest-stock sqllogictest.c md5.o sqlite3.o -lpthread -lm\n\n# Build doltlite runner (replace amalgamation)\ncp /path/to/doltlite/build/sqlite3.c sqlite3.c\ncp /path/to/doltlite/build/sqlite3.h sqlite3.h\ngcc -O2 -DSQLITE_NO_SYNC=1 -DSQLITE_THREADSAFE=0 \\\n    -DSQLITE_OMIT_LOAD_EXTENSION -c sqlite3.c\ngcc -O2 -o sqllogictest-doltlite sqllogictest.c md5.o sqlite3.o -lpthread -lm -lz\n\n# Run the suite\nbash test/run_sqllogictest.sh \\\n    sqllogictest-doltlite sqllogictest-stock /tmp/sqllogictest/test\n```\n\n### Concurrent Branch Tests\n\nC tests that verify cross-branch isolation — two connections on different\nbranches both write and read without corrupting each other:\n\n```bash\ncd build\ngcc -o cross_branch_test ../test/cross_branch_test.c \\\n    -I. 
-I../src libdoltlite.a -lz -lpthread\n./cross_branch_test\n```\n\n## Architecture\n\n### Prolly Tree Engine\n\n| File | Purpose |\n|------|---------|\n| `prolly_hash.c/h` | xxHash32 content addressing |\n| `prolly_node.c/h` | Binary node format (serialization, field access) |\n| `prolly_cache.c/h` | LRU node cache |\n| `prolly_cursor.c/h` | Tree cursor (seek, next, prev) |\n| `prolly_mutmap.c/h` | Skip list write buffer for pending edits |\n| `prolly_chunker.c/h` | Rolling hash tree builder |\n| `prolly_mutate.c/h` | Merge-flush edits into tree |\n| `prolly_diff.c/h` | Tree-level diff (drives `dolt_diff`) |\n| `prolly_arena.c/h` | Arena allocator for tree operations |\n| `prolly_btree.c` | `btree.h` API implementation (main integration point) |\n| `sortkey.c/h` | Sort key encoding for memcmp-sortable index keys |\n| `chunk_store.c` | Single-file content-addressed chunk storage |\n| `pager_shim.c` | Pager facade (satisfies pager API without page-based I/O) |\n| `btree_orig_*.c` | Original SQLite btree compiled with renamed symbols (for ATTACH) |\n| `btree_orig_api.c/h` | Bridge API between prolly dispatch and original btree |\n\n### Doltlite Feature Files\n\n| File | Purpose |\n|------|---------|\n| `doltlite.c` | `dolt_add`, `dolt_commit`, `dolt_reset`, `dolt_merge`, registration |\n| `doltlite_status.c` | `dolt_status` virtual table |\n| `doltlite_log.c` | `dolt_log` virtual table |\n| `doltlite_diff.c` | `dolt_diff` table-valued function |\n| `doltlite_branch.c` | `dolt_branch`, `dolt_checkout`, `active_branch`, `dolt_branches` |\n| `doltlite_tag.c` | `dolt_tag`, `dolt_tags` |\n| `doltlite_merge.c` | Three-way catalog and row-level merge |\n| `doltlite_conflicts.c` | `dolt_conflicts`, `dolt_conflicts_resolve` |\n| `doltlite_ancestor.c` | Common ancestor search, `dolt_merge_base` |\n| `doltlite_commit.h` | Commit object serialization/deserialization |\n| `doltlite_ancestor.h` | Ancestor-finding API |\n| `doltlite_history.c` | `dolt_history_\u003ctable\u003e` 
virtual table |\n| `doltlite_at.c` | `dolt_at_\u003ctable\u003e` point-in-time query |\n| `doltlite_schema_diff.c` | `dolt_schema_diff` virtual table |\n| `doltlite_gc.c` | `dolt_gc` garbage collection |\n| `doltlite_remote.c` | Remote management (`dolt_remote`, `dolt_push`, `dolt_fetch`, `dolt_clone`) |\n| `doltlite_http_remote.c` | HTTP remote client (BSD sockets) |\n| `doltlite_remotesrv.c` | Standalone HTTP server for remotes |\n\n## Dolt vs Doltlite: Storage Engine Comparison\n\nDoltlite implements the same prolly tree architecture as\n[Dolt](https://github.com/dolthub/dolt), but adapted for SQLite's constraints\nand C implementation. The core idea is identical — content-addressed immutable\nnodes with rolling-hash-determined boundaries — but the details differ\nsignificantly.\n\n### Prolly Tree\n\nBoth use prolly trees (probabilistic B-trees) where node boundaries are\ndetermined by a rolling hash over key bytes rather than fixed fan-out. This gives\ncontent-defined chunking: identical subtrees produce identical hashes regardless\nof where they appear, enabling structural sharing between versions.\n\n| | Dolt | Doltlite |\n|--|------|----------|\n| **Language** | Go | C (inside SQLite) |\n| **Node format** | FlatBuffers | Custom binary (header + offset arrays + data regions) |\n| **Hash function** | xxhash, 20 bytes | xxHash32 with 5 seeds packed into 20 bytes |\n| **Chunk target** | ~4KB | 4KB (512B min, 16KB max) |\n| **Boundary detection** | Rolling hash, `(hash \u0026 pattern) == pattern` | Same algorithm |\n\n### Key Encoding\n\n**Dolt** uses a purpose-built tuple encoding: fields are serialized as contiguous\nbytes with a trailing offset array and field count. Keys sort lexicographically,\nso comparison is a single `memcmp`.\n\n**Doltlite** uses sort key materialization for index (BLOBKEY) entries. 
Each\nSQLite record is converted to a memcmp-sortable byte string at insert time:\nintegers and floats are encoded as IEEE 754 doubles with sign normalization,\ntext and blobs use NUL-byte escaping with double-NUL terminators. The sort key\nis stored as the prolly tree key; the original SQLite record is stored as the\nvalue (for reads). This enables `memcmp` comparison in the tree at the cost of\n~2x index entry size. For INTKEY tables (rowid tables), keys are 8-byte\nlittle-endian integers — comparison is trivial.\n\n### Tree Mutation\n\n**Dolt** uses a chunker with `advanceTo` boundary synchronization. Two cursors\ntrack the old tree and new tree simultaneously. When the chunker fires a boundary\nthat aligns with an old tree node boundary, it skips the entire unchanged\nsubtree. This handles splits, merges, and boundary drift naturally within a\nsingle bottom-up pass.\n\n**Doltlite** uses a cursor-path-stack approach. For each edit, it seeks from root\nto leaf, clones the leaf into a node builder, applies edits, serializes the new\nleaf (with rolling-hash re-chunking for overflow/underflow), and rewrites\nancestors by walking up the path stack. Unchanged subtrees are never loaded. A\nhybrid strategy falls back to a full O(N+M) merge-walk when the edit count is\nlarge relative to tree size.\n\nBoth achieve O(M log N) for sparse edits. Dolt's approach is more elegant for\nboundary maintenance; doltlite's is simpler to implement in C and integrates\nnaturally with SQLite's cursor-based API.\n\n### Chunk Store\n\n**Dolt** uses the Noms Block Store (NBS) format with multiple table files\norganized into generations (oldgen/newgen). Writers append new table files;\nreaders see consistent snapshots. 
This enables MVCC-like concurrency with\noptimistic locking at the manifest level.\n\n**Doltlite** uses a single file with three regions: a 168-byte manifest header\nat offset 0, a compacted chunk data region with sorted index (written by GC),\nand a WAL region at the end of the file (append-only journal of new chunks).\nNormal commits append to the WAL region at EOF. GC rewrites the entire file\nwith all chunks compacted (empty WAL region). Concurrency uses file-level\nlocking for serialization.\n\n### Commits and Metadata\n\n**Dolt** stores commits as FlatBuffer-serialized objects forming a DAG (directed\nacyclic graph) with multiple parents for merge commits. Commits include a parent\nclosure for O(1) ancestor queries and a height field for efficient traversal.\n\n**Doltlite** stores commits as custom binary objects forming a DAG with\nmulti-parent support (merge commits record both parents). Each branch has an\nassociated WorkingSet chunk that stores staged catalog and merge state\nindependently, plus a per-branch working catalog tracked in a separate working\nstate chunk (referenced by the manifest). This allows connections on different\nbranches to each find their own catalog on refresh without reading a stale\ncatalog from another branch. The catalog hash is purely data-derived (no\nruntime metadata), enabling O(1) dirty checks via hash comparison. Branches\nand tags are stored in a serialized refs chunk referenced by the manifest.\n\n### Garbage Collection\n\nBoth use mark-and-sweep: walk all reachable chunks from branches, tags, and\ncommit history, then remove everything else. Dolt rewrites live data into new\ntable files and deletes old ones. 
Doltlite compacts in place by rewriting the\nsingle database file with only live chunks.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftimsehn%2Fdoltlite","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftimsehn%2Fdoltlite","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftimsehn%2Fdoltlite/lists"}