{"id":27882235,"url":"https://github.com/wenzhang-dev/bitcaskdb","last_synced_at":"2025-07-23T22:36:57.658Z","repository":{"id":287392288,"uuid":"963098267","full_name":"wenzhang-dev/bitcaskDB","owner":"wenzhang-dev","description":"Light-weight, fast, fixed capacity key/value storage engine base on bitcask storage model","archived":false,"fork":false,"pushed_at":"2025-04-30T09:40:00.000Z","size":247,"stargazers_count":26,"open_issues_count":0,"forks_count":7,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-05-27T07:56:48.828Z","etag":null,"topics":["bitcask","embedded","golang","key-value","kvstore","lru-cache","small-object"],"latest_commit_sha":null,"homepage":"","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/wenzhang-dev.png","metadata":{"files":{"readme":"README-CN.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-04-09T06:44:28.000Z","updated_at":"2025-05-27T05:51:22.000Z","dependencies_parsed_at":"2025-05-05T05:09:17.460Z","dependency_job_id":null,"html_url":"https://github.com/wenzhang-dev/bitcaskDB","commit_stats":null,"previous_names":["wenzhang-dev/bitcaskdb"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/wenzhang-dev/bitcaskDB","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wenzhang-dev%2FbitcaskDB","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wenzhang-dev%2FbitcaskDB/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wenzhang-dev%2FbitcaskDB/releases","manifests_url":"https://repos.ecosyste.ms/api/v1
/hosts/GitHub/repositories/wenzhang-dev%2FbitcaskDB/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/wenzhang-dev","download_url":"https://codeload.github.com/wenzhang-dev/bitcaskDB/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wenzhang-dev%2FbitcaskDB/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":266761410,"owners_count":23980296,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-07-23T02:00:09.312Z","response_time":66,"last_error":null,"robots_txt_status":null,"robots_txt_updated_at":null,"robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bitcask","embedded","golang","key-value","kvstore","lru-cache","small-object"],"created_at":"2025-05-05T05:09:06.254Z","updated_at":"2025-07-23T22:36:57.631Z","avatar_url":"https://github.com/wenzhang-dev.png","language":"Go","funding_links":[],"categories":[],"sub_categories":[],"readme":"# What is bitcaskDB?\n\nbitcaskDB is a lightweight, fast, fixed-capacity key-value storage engine based on the Bitcask storage model.\n\nIts defining feature is that it keeps the index of every key-value pair in memory, so each lookup needs only a single disk seek. For small objects with 100-byte keys and 4KB values, caching 10 million objects takes roughly 1GB of memory and 40GB of disk space. By contrast, a fully in-memory cache such as redis or memcached needs far more memory for the same data set.\n\n# Motivation\n\n- Constrained hardware, such as 4 CPU cores, 8GB of memory, and a 100GB disk\n- Caching tens of millions of small objects\n\n\n# Features\n\n- Append-only writes\n- Fixed-length namespace\n- Fixed disk capacity and memory usage\n- Fine-grained merging\n- Approximate LRU eviction policy\n- Custom record metadata\n- Custom merge policy\n- Custom selection policy\n- Batch writes\n- Support for expiration times and data fingerprints (Etag)\n- Fast recovery based on hint files\n- Soft deletion\n\n# Comparison\n\n## LSM\n- Append-only writes\n- A read may require multiple random seeks\n- Write amplification\n  - Cascading merges\n- Range queries\n- Ordered keys\n- Slow disk space reclamation\n  - Multiple data versions\n\n\n## B+Tree\n- 
In-place updates\n- Ordered keys\n- Range queries\n- Disk space is hard to reclaim\n\n\n## Bitcask\n- Append-only writes\n- Predictable lookup and insert performance\n- A lookup needs only a single seek\n- Fast disk space reclamation\n  - Memory keeps only the latest version of each record\n- The in-memory index can use different data structures, such as a btree or a hashtable\n  - A hashtable is more compact, but it is unordered and does not support range queries\n  - A btree supports range queries and ordered iteration, but uses more memory\n\n\n# Quick start\n\n\n```golang\nimport \"github.com/wenzhang-dev/bitcaskDB\"\n\nconst data = `\n\u003c!DOCTYPE html\u003e\n\u003chtml\u003e\n\u003chead\u003e\n    \u003ctitle\u003eHello Page\u003c/title\u003e\n\u003c/head\u003e\n\u003cbody\u003e\n    \u003ch1\u003eHello, BitcaskDB!\u003c/h1\u003e\n\u003c/body\u003e\n\u003c/html\u003e\n`\n\nfunc main() {\n    opts := \u0026bitcask.Options{\n        Dir:                       \"./bitcaskDB\",\n        WalMaxSize:                1024 * 1024 * 1024, // 1GB\n        ManifestMaxSize:           1024 * 1024, // 1MB\n        IndexCapacity:             10000000, // 10 million\n        IndexLimited:              8000000,\n        IndexEvictionPoolCapacity: 32,\n        IndexSampleKeys:           5,\n        DiskUsageLimited:          1024 * 1024 * 1024 * 100, // 100GB\n        NsSize:                    DefaultNsSize,\n        EtagSize:                  DefaultEtagSize,\n    }\n\n    db, err := bitcask.NewDB(opts)\n    if err != nil {\n        panic(err)\n    }\n    defer func() {\n        _ = db.Close()\n    }()\n\n    ns := GenSha1NS(\"ns\") // fixed-size ns\n    key := []byte(\"testKey\")\n    value := []byte(data)\n    now := uint64(db.WallTime().Unix())\n\n    // customized metadata\n    appMeta := make(map[string]string)\n    appMeta[\"type\"] = \"html\"\n    meta := NewMeta(appMeta).SetExpire(now+60).SetEtag(GenSha1Etag(value))\n\n    // set a key\n    err = db.Put(ns, key, value, meta, \u0026WriteOptions{})\n    if err != nil {\n        panic(err)\n    }\n\n    // get a key\n    readVal, readMeta, err := db.Get(ns, key, \u0026ReadOptions{})\n    if err != nil {\n        panic(err)\n    }\n\n    println(readVal)\n    println(readMeta)\n\n    // delete a key\n    err = db.Delete(ns, key, \u0026WriteOptions{})\n    if 
err != nil {\n        panic(err)\n    }\n}\n```\n\nIf you just want a simple database CRUD HTTP server, consider this [repository](https://github.com/wenzhang-dev/bitcaskDB-server).\n\nThe HTTP server runs as a docker container. Incidentally, the cost of reading from and writing to bitcaskDB is negligible compared with the cost of network communication.\n\n\n# Benchmarks\n\nThe benchmark report for 4KB reads and writes is as follows:\n\n```\ngo test -bench=PutGet -benchtime=60s -count=3 -timeout=50m\ngoos: linux\ngoarch: amd64\npkg: github.com/wenzhang-dev/bitcaskDB/bench\ncpu: Intel(R) Xeon(R) Gold 5318N CPU @ 2.10GHz\nBenchmarkPutGet/put4K-8                  5331782   25259 ns/op   11795 B/op   21 allocs/op\nBenchmarkPutGet/put4K-8                  5130870   25417 ns/op   11767 B/op   21 allocs/op\nBenchmarkPutGet/put4K-8                  4898403   26676 ns/op   11742 B/op   21 allocs/op\nBenchmarkPutGet/batchPut4K-8            10548615   15340 ns/op    1695 B/op   11 allocs/op\nBenchmarkPutGet/batchPut4K-8             9220388   14278 ns/op    1694 B/op   11 allocs/op\nBenchmarkPutGet/batchPut4K-8            10363459   15019 ns/op    1686 B/op   11 allocs/op\nBenchmarkPutGet/get4K-8                  8812342    8076 ns/op   10119 B/op   10 allocs/op\nBenchmarkPutGet/get4K-8                  7963098    7952 ns/op   10119 B/op   10 allocs/op\nBenchmarkPutGet/get4K-8                  8480240    7997 ns/op   10119 B/op   10 allocs/op\nBenchmarkPutGet/concurrentGet4K-8       17233309    4427 ns/op   10044 B/op    7 allocs/op\nBenchmarkPutGet/concurrentGet4K-8       26745726    3681 ns/op   10044 B/op    7 allocs/op\nBenchmarkPutGet/concurrentGet4K-8       29305041    3654 ns/op   10044 B/op    7 allocs/op\nBenchmarkPutGet/concurrentPut4K-8        4558645   19829 ns/op    8340 B/op   18 allocs/op\nBenchmarkPutGet/concurrentPut4K-8        4433334   18664 ns/op   10031 B/op   18 allocs/op\nBenchmarkPutGet/concurrentPut4K-8        4366149   17031 ns/op    8175 B/op   17 allocs/op\nBenchmarkPutGet/concurrentBatchPut4K-8   9443377   12520 ns/op    1527 B/op    9 allocs/op\nBenchmarkPutGet/concurrentBatchPut4K-8  11338162   12429 ns/op    1517 B/op    9 
allocs/op\nBenchmarkPutGet/concurrentBatchPut4K-8  11394081   12101 ns/op    1510 B/op    9 allocs/op\nPASS\nok   github.com/wenzhang-dev/bitcaskDB/bench 2310.401s\n```\n\nSeveral mainstream KV storage engines were also benchmarked with 4KB reads and writes, and their RSS usage during the tests was recorded.\nThe benchmark repository is: [codebase](https://github.com/wenzhang-dev/bitcaskDB-benchmark)\n\n```shell\ngo test -bench=Read -benchtime=60s -timeout=30m -count=3\ngoos: linux\ngoarch: amd64\npkg: github.com/wenzhang-dev/bitcaskDB-benchmark\ncpu: Intel(R) Xeon(R) Gold 5318N CPU @ 2.10GHz\nBenchmarkReadWithBitcaskDB/read4K-8  11459024   6313 ns/op  1.217 AvgRSS(GB)  1.275 PeakRSS(GB)  10120 B/op  10 allocs/op\nBenchmarkReadWithBitcaskDB/read4K-8  12512324   6522 ns/op  1.220 AvgRSS(GB)  1.234 PeakRSS(GB)  10120 B/op  10 allocs/op\nBenchmarkReadWithBitcaskDB/read4K-8  12414660   6468 ns/op  1.206 AvgRSS(GB)  1.231 PeakRSS(GB)  10120 B/op  10 allocs/op\nBenchmarkReadWithBadger/read4K-8      4575487  13526 ns/op  2.716 AvgRSS(GB)  4.350 PeakRSS(GB)  19416 B/op  43 allocs/op\nBenchmarkReadWithBadger/read4K-8      4960239  13741 ns/op  1.629 AvgRSS(GB)  1.681 PeakRSS(GB)  19406 B/op  43 allocs/op\nBenchmarkReadWithBadger/read4K-8      4851144  14429 ns/op  1.591 AvgRSS(GB)  1.650 PeakRSS(GB)  19422 B/op  44 allocs/op\nBenchmarkReadWithLevelDB/read4K-8     1569663  50710 ns/op  0.111 AvgRSS(GB)  0.134 PeakRSS(GB)  55021 B/op  35 allocs/op\nBenchmarkReadWithLevelDB/read4K-8     1000000  63066 ns/op  0.113 AvgRSS(GB)  0.129 PeakRSS(GB)  54264 B/op  35 allocs/op\nBenchmarkReadWithLevelDB/read4K-8     1236408  57268 ns/op  0.114 AvgRSS(GB)  0.138 PeakRSS(GB)  54624 B/op  35 allocs/op\nBenchmarkReadWithBoltDB/read4K-8     12587562   5269 ns/op  5.832 AvgRSS(GB)  5.838 PeakRSS(GB)    832 B/op  13 allocs/op\nBenchmarkReadWithBoltDB/read4K-8     16920481   4482 ns/op  5.832 AvgRSS(GB)  5.833 PeakRSS(GB)    832 B/op  13 allocs/op\nBenchmarkReadWithBoltDB/read4K-8     19141418   5276 ns/op  5.832 AvgRSS(GB)  5.835 PeakRSS(GB)    832 B/op  13 allocs/op\nPASS\nok   
github.com/wenzhang-dev/bitcaskDB-benchmark 1475.172s\n```\n\n\n```shell\ngo test -bench=Write -benchtime=60s -timeout=30m -count=3\ngoos: linux\ngoarch: amd64\npkg: github.com/wenzhang-dev/bitcaskDB-benchmark\ncpu: Intel(R) Xeon(R) Gold 5318N CPU @ 2.10GHz\nBenchmarkWriteWithBitcaskDB/write4K-8  8334304  13217 ns/op  0.7905 AvgRSS(GB)   0.934 PeakRSS(GB)    1666 B/op   11 allocs/op\nBenchmarkWriteWithBitcaskDB/write4K-8  5323338  14976 ns/op  0.9732 AvgRSS(GB)   1.058 PeakRSS(GB)    1727 B/op   12 allocs/op\nBenchmarkWriteWithBitcaskDB/write4K-8  5435398  13929 ns/op  0.9639 AvgRSS(GB)   1.122 PeakRSS(GB)    1756 B/op   12 allocs/op\nBenchmarkWriteWithLevelDB/write4K-8    1047753  68691 ns/op  0.0615 AvgRSS(GB)  0.0636 PeakRSS(GB)    2946 B/op   16 allocs/op\nBenchmarkWriteWithLevelDB/write4K-8    1179555  71497 ns/op  0.0617 AvgRSS(GB)  0.0634 PeakRSS(GB)    3250 B/op   18 allocs/op\nBenchmarkWriteWithLevelDB/write4K-8     992488  74130 ns/op  0.0613 AvgRSS(GB)  0.0625 PeakRSS(GB)    3444 B/op   19 allocs/op\nBenchmarkWriteWithBadger/write4K-8     3776720  20036 ns/op   6.409 AvgRSS(GB)   7.534 PeakRSS(GB)   30062 B/op   68 allocs/op\nBenchmarkWriteWithBadger/write4K-8     4106070  50959 ns/op   10.77 AvgRSS(GB)   13.63 PeakRSS(GB)  115442 B/op  152 allocs/op\nBenchmarkWriteWithBadger/write4K-8     1491906  49955 ns/op   11.45 AvgRSS(GB)   13.72 PeakRSS(GB)   88941 B/op  130 allocs/op\nBenchmarkWriteWithBoltDB/write4K-8     2808206  23131 ns/op   0.626 AvgRSS(GB)   0.999 PeakRSS(GB)    7579 B/op   11 allocs/op\nBenchmarkWriteWithBoltDB/write4K-8     4303538  22836 ns/op   1.713 AvgRSS(GB)   2.971 PeakRSS(GB)    7765 B/op   11 allocs/op\nBenchmarkWriteWithBoltDB/write4K-8     3755002  19385 ns/op   2.481 AvgRSS(GB)   2.872 PeakRSS(GB)    7896 B/op   12 allocs/op\nPASS\nok   github.com/wenzhang-dev/bitcaskDB-benchmark 1541.068s\n```\n\nBenchmark report with a fixed disk capacity: 
[benchmark2](https://github.com/wenzhang-dev/bitcaskDB/blob/main/bench/benchmark2)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwenzhang-dev%2Fbitcaskdb","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fwenzhang-dev%2Fbitcaskdb","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwenzhang-dev%2Fbitcaskdb/lists"}