{"id":13477700,"url":"https://github.com/skyzh/mini-lsm","last_synced_at":"2025-04-08T23:18:05.881Z","repository":{"id":65064467,"uuid":"581630327","full_name":"skyzh/mini-lsm","owner":"skyzh","description":"A tutorial of building an LSM-Tree storage engine in a week.","archived":false,"fork":false,"pushed_at":"2024-10-01T01:16:40.000Z","size":840,"stargazers_count":2849,"open_issues_count":10,"forks_count":393,"subscribers_count":32,"default_branch":"main","last_synced_at":"2024-10-29T15:37:19.798Z","etag":null,"topics":["database","key-value-store","kv-store","lsm","lsm-tree","rust","storage","tutorial"],"latest_commit_sha":null,"homepage":"https://skyzh.github.io/mini-lsm/","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/skyzh.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-12-23T19:16:00.000Z","updated_at":"2024-10-29T03:11:45.000Z","dependencies_parsed_at":"2024-02-27T01:26:54.113Z","dependency_job_id":"86ee0dc4-191c-47b7-9af7-60f230f005cc","html_url":"https://github.com/skyzh/mini-lsm","commit_stats":{"total_commits":237,"total_committers":30,"mean_commits":7.9,"dds":0.189873417721519,"last_synced_commit":"b84dd3838f54e017f7c365b3cad34e9c3d6a0efc"},"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/skyzh%2Fmini-lsm","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/skyzh%2Fmini-lsm/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/skyzh%2Fmini
-lsm/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/skyzh%2Fmini-lsm/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/skyzh","download_url":"https://codeload.github.com/skyzh/mini-lsm/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247941705,"owners_count":21022039,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["database","key-value-store","kv-store","lsm","lsm-tree","rust","storage","tutorial"],"created_at":"2024-07-31T16:01:46.289Z","updated_at":"2025-04-08T23:18:05.865Z","avatar_url":"https://github.com/skyzh.png","language":"Rust","funding_links":[],"categories":["Rust","tutorial"],"sub_categories":[],"readme":"![banner](./mini-lsm-book/src/mini-lsm-logo.png)\n\n# LSM in a Week\n\n[![CI (main)](https://github.com/skyzh/mini-lsm/actions/workflows/main.yml/badge.svg)](https://github.com/skyzh/mini-lsm/actions/workflows/main.yml)\n\nBuild a simple key-value storage engine in a week! And extend your LSM engine on the second + third week.\n\n## [Book](https://skyzh.github.io/mini-lsm)\n\nThe Mini-LSM book is available at [https://skyzh.github.io/mini-lsm](https://skyzh.github.io/mini-lsm). You may follow this guide and implement the Mini-LSM storage engine. 
The course has 3 weeks (parts), each consisting of 7 days (chapters).\n\n## Community\n\nYou may join skyzh's Discord server and study with the mini-lsm community.\n\n[![Join skyzh's Discord Server](mini-lsm-book/src/discord-badge.svg)](https://skyzh.dev/join/discord)\n\n**Add Your Solution**\n\nIf you have finished at least one full week of this course, you can add your solution to the community solution list at [SOLUTIONS.md](./SOLUTIONS.md). You can submit a pull request and we might do a quick review of your code in return for your hard work.\n\n## Development\n\n**For Students**\n\nYou should modify the code in the `mini-lsm-starter` directory.\n\n```\ncargo x install-tools\ncargo x copy-test --week 1 --day 1\ncargo x scheck\ncargo run --bin mini-lsm-cli\ncargo run --bin compaction-simulator\n```\n\n**For Course Developers**\n\nYou should modify `mini-lsm` and `mini-lsm-mvcc`.\n\n```\ncargo x install-tools\ncargo x check\ncargo x book\n```\n\nIf you changed the public API in the reference solution, you might also need to synchronize it to the starter crate.\nTo do this, use `cargo x sync`.\n\n## Code Structure\n\n* mini-lsm: the final solution code for \u003c= week 2\n* mini-lsm-mvcc: the final solution code for week 3 MVCC\n* mini-lsm-starter: the starter code\n* mini-lsm-book: the course\n\nWe have another repo mini-lsm-solution-checkpoint at [https://github.com/skyzh/mini-lsm-solution-checkpoint](https://github.com/skyzh/mini-lsm-solution-checkpoint). In this repo, each commit corresponds to a chapter in the course. 
We will not update the solution checkpoint very often.\n\n## Demo\n\nYou can run the reference solution by yourself to gain an overview of the system before you start.\n\n```\ncargo run --bin mini-lsm-cli-ref\ncargo run --bin mini-lsm-cli-mvcc-ref\n```\n\nAnd we have a compaction simulator to experiment with your compaction algorithm implementation,\n\n```\ncargo run --bin compaction-simulator-ref\ncargo run --bin compaction-simulator-mvcc-ref\n```\n\n## Course Structure\n\nWe have 3 weeks + 1 extra week (in progress) for this course.\n\n* Week 1: Storage Format + Engine Skeleton\n* Week 2: Compaction and Persistence\n* Week 3: Multi-Version Concurrency Control\n* The Extra Week / Rest of Your Life: Optimizations (unlikely to be available in 2025...)\n\n![Course Roadmap](./mini-lsm-book/src/lsm-tutorial/00-full-overview.svg)\n\n| Week + Chapter | Topic                                                       |\n| -------------- | ----------------------------------------------------------- |\n| 1.1            | Memtable                                                    |\n| 1.2            | Merge Iterator                                              |\n| 1.3            | Block                                                       |\n| 1.4            | Sorted String Table (SST)                                   |\n| 1.5            | Read Path                                                   |\n| 1.6            | Write Path                                                  |\n| 1.7            | SST Optimizations: Prefix Key Encoding + Bloom Filters      |\n| 2.1            | Compaction Implementation                                   |\n| 2.2            | Simple Compaction Strategy (Traditional Leveled Compaction) |\n| 2.3            | Tiered Compaction Strategy (RocksDB Universal Compaction)   |\n| 2.4            | Leveled Compaction Strategy (RocksDB Leveled Compaction)    |\n| 2.5            | Manifest                                                    |\n| 2.6       
     | Write-Ahead Log (WAL)                                       |\n| 2.7            | Batch Write and Checksums                                   |\n| 3.1            | Timestamp Key Encoding                                      |\n| 3.2            | Snapshot Read - Memtables and Timestamps                    |\n| 3.3            | Snapshot Read - Transaction API                             |\n| 3.4            | Watermark and Garbage Collection                            |\n| 3.5            | Transactions and Optimistic Concurrency Control             |\n| 3.6            | Serializable Snapshot Isolation                             |\n| 3.7            | Compaction Filters                                          |\n\n## Related Projects\n\nmini-lsm inspired several projects used in production.\n\n* [SlateDB](https://slatedb.io/docs/architecture/) is an LSM engine over the object storage system.\n* [Tonbo](https://tonbo.io/about) stores parquet files directly on the object storage and organizes them in an LSM tree structure.\n\n## License\n\nThe Mini-LSM starter code and solution are under [Apache 2.0 license](LICENSE). The author reserves the full copyright of the course materials (markdown files and figures).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fskyzh%2Fmini-lsm","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fskyzh%2Fmini-lsm","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fskyzh%2Fmini-lsm/lists"}