{"id":13569208,"url":"https://github.com/bytedance/terarkdb","last_synced_at":"2025-04-10T12:17:28.715Z","repository":{"id":37781926,"uuid":"320459354","full_name":"bytedance/terarkdb","owner":"bytedance","description":"A RocksDB compatible KV storage engine with better performance","archived":false,"fork":false,"pushed_at":"2025-03-10T06:49:46.000Z","size":69554,"stargazers_count":2077,"open_issues_count":54,"forks_count":206,"subscribers_count":56,"default_branch":"dev.1.4","last_synced_at":"2025-04-03T07:57:16.637Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/bytedance.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE.Apache","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":"AUTHORS","dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-12-11T03:42:58.000Z","updated_at":"2025-03-29T16:44:18.000Z","dependencies_parsed_at":"2024-01-14T03:48:03.496Z","dependency_job_id":"b307c3e2-079a-45ba-93ec-2c467ab42605","html_url":"https://github.com/bytedance/terarkdb","commit_stats":null,"previous_names":[],"tags_count":7,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bytedance%2Fterarkdb","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bytedance%2Fterarkdb/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bytedance%2Fterarkdb/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bytedance%2Fterarkdb/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners
/bytedance","download_url":"https://codeload.github.com/bytedance/terarkdb/tar.gz/refs/heads/dev.1.4","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248217166,"owners_count":21066633,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-01T14:00:37.047Z","updated_at":"2025-04-10T12:17:28.698Z","avatar_url":"https://github.com/bytedance.png","language":"C++","readme":"# About TerarkDB\nTerarkDB is a RocksDB replacement with optimized tail latency, throughput, and compression. In most cases you can migrate an existing RocksDB instance to TerarkDB without any drawbacks.\n\n- [All-in-one Docs](https://bytedance.feishu.cn/docs/doccnZmYFqHBm06BbvYgjsHHcKc#)\n- [Slack Channel](https://join.slack.com/t/terarkdb/shared_invite/zt-zxo71hwl-j_K4OIQ~p5_SsT4RrFesxg)\n\n**NOTES**\n- TerarkDB has only been tested on Linux and is production-ready only on that platform.\n- Language bindings other than C/C++ are not fully tested yet.\n- Existing data can be migrated from RocksDB directly to TerarkDB, but cannot be migrated back to RocksDB.\n- TerarkDB was forked from RocksDB v5.18.3.\n\n\n## Performance Overview\n- RocksDB v6.12\n- Server\n  - Intel(R) Xeon(R) Gold 5218 CPU @ 2.30GHz (2 sockets, 32 cores, 64 threads)\n  - 376 GB DRAM\n  - NVMe TLC SSD (3.5 TB)\n- Bench Tools \u0026 Workloads\n  - use `db_bench`\n  - 10 client threads, 20 GB of requests per thread\n  - key = 24 bytes, value = 2000 bytes\n  - `heavy_write` means 90% write operations\n  - `heavy_read` means 90% read operations\n\n\n![](docs/images/compare_rocksdb.png)\n\n\n# 1. 
Use TerarkDB\n\n## Prerequisite\nIf you enable TerarkZipTable support (`-DWITH_TERARK_ZIP=ON`), install `libaio` before compiling TerarkDB:\n\n`sudo apt-get install libaio-dev`\n\nIf this is your first time using TerarkDB, we recommend building without TerarkZipTable by setting `-DWITH_TERARK_ZIP` to `OFF` in `build.sh`.\n\n## Method 1: Use CMake subdirectory (Recommended)\n\n1) Clone\n\n```\ncd {YOUR_PROJECT_DIR}\ngit submodule add https://github.com/bytedance/terarkdb.git\n\ncd terarkdb \u0026\u0026 git submodule update --init --recursive\n```\n\n2) Edit your top-level project's CMakeLists.txt\n\n```\nadd_subdirectory(terarkdb)\ntarget_link_libraries({YOUR_TARGET} terarkdb)\n```\n\n3) Important Default Options\n\n- CMAKE_BUILD_TYPE: RelWithDebInfo\n- WITH_JEMALLOC: ON\n  - Whether to use jemalloc (set to OFF if you are using a different malloc library)\n- WITH_TESTS: OFF\n  - Build test cases\n- WITH_TOOLS: OFF\n  - Build the TerarkDB tools (e.g. db_bench, ldb)\n- WITH_TERARK_ZIP: OFF\n  - Build with TerarkZipTable\n- WITH_ZNS: OFF\n  - Build with ZNS device support\n\n\n### Notes\n- TerarkDB is built with zstd, lz4, snappy, zlib, gtest, and boost by default. If your application also needs these libraries, you can remove them from your higher-level application.\n\n\n## Method 2: Link as static library\n\n1) Clone \u0026 build\n\n```\ngit clone https://github.com/bytedance/terarkdb.git\n\ncd terarkdb \u0026\u0026 git submodule update --init --recursive\n\nWITH_TESTS=OFF WITH_ZNS=OFF ./build.sh\n```\n\n2) Link\n\nDirectory:\n\n```\n  terarkdb/\n        \\___ output/\n                \\_____ include/\n                \\_____ lib/\n                         \\___ libterarkdb.a\n                         \\___ libzstd.a\n                         \\___ ...\n```\n\nThe static libraries are not yet combined into a single archive, so you have to link all of them into your target:\n\n```\n-Wl,-Bstatic \\\n-lterarkdb -lbz2 -ljemalloc -llz4 -lsnappy -lz -lzstd \\\n-Wl,-Bdynamic -pthread -lgomp 
-lrt -ldl -laio\n```\n\n\n# 2. Usage\n## 2.1. BlockBasedTable\n```c++\n#include \u003ccassert\u003e\n#include \"rocksdb/db.h\"\n\nrocksdb::DB* db;\nrocksdb::Options options;\n\n// Your options here\noptions.create_if_missing = true;\noptions.wal_bytes_per_sync = 32768;\noptions.bytes_per_sync = 32768;\n\n// Open DB\nauto status = rocksdb::DB::Open(options, \"/tmp/testdb\", \u0026db);\n\n// Operations\nstd::string value;\nauto s = db-\u003ePut(rocksdb::WriteOptions(), \"key1\", \"value1\");\ns = db-\u003eGet(rocksdb::ReadOptions(), \"key1\", \u0026value);\nassert(s.ok());\nassert(\"value1\" == value);\n\ns = db-\u003eDelete(rocksdb::WriteOptions(), \"key1\");\nassert(s.ok());\n```\n\nOr manually set table format and table options:\n\n```c++\n#include \u003ccassert\u003e\n#include \"rocksdb/db.h\"\n#include \"rocksdb/options.h\"\n#include \"rocksdb/table.h\"\n\nrocksdb::DB* db;\nrocksdb::Options options;\n\n// Your db options here\noptions.create_if_missing = true;\noptions.wal_bytes_per_sync = 32768;\noptions.bytes_per_sync = 32768;\n\n// Manually specify target table and table options\nrocksdb::BlockBasedTableOptions table_options;\ntable_options.block_cache =\n    rocksdb::NewLRUCache(32ULL \u003c\u003c 30, 8, false);\ntable_options.block_size = 8ULL \u003c\u003c 10;\noptions.table_factory = std::shared_ptr\u003crocksdb::TableFactory\u003e\n                          (NewBlockBasedTableFactory(table_options));\n\n// Open DB\nauto status = rocksdb::DB::Open(options, \"/tmp/testdb2\", \u0026db);\n\n// Operations\nstd::string value;\nauto s = db-\u003ePut(rocksdb::WriteOptions(), \"key1\", \"value1\");\ns = db-\u003eGet(rocksdb::ReadOptions(), \"key1\", \u0026value);\nassert(s.ok());\nassert(\"value1\" == value);\n\ns = db-\u003eDelete(rocksdb::WriteOptions(), \"key1\");\nassert(s.ok());\n```\n\n## 2.2. 
TerarkZipTable\n```c++\n#include \u003ccassert\u003e\n#include \"rocksdb/db.h\"\n#include \"rocksdb/options.h\"\n#include \"rocksdb/table.h\"\n#include \"table/terark_zip_table.h\"\n\nrocksdb::DB* db;\nrocksdb::Options options;\n\n// Your db options here\noptions.create_if_missing = true;\noptions.wal_bytes_per_sync = 32768;\noptions.bytes_per_sync = 32768;\n\n// TerarkZipTable needs a fallback table factory because you can choose the LSM level at which TerarkZipTable starts being used.\n// For example, by setting tzt_options.terarkZipMinLevel = 2, TerarkDB will use your fallback table on levels 0 and 1.\nstd::shared_ptr\u003crocksdb::TableFactory\u003e table_factory;\nrocksdb::BlockBasedTableOptions blockbased_options;\nblockbased_options.block_size = 8ULL \u003c\u003c 10;\ntable_factory.reset(NewBlockBasedTableFactory(blockbased_options));\n\nrocksdb::TerarkZipTableOptions tzt_options;\n// TerarkZipTable requires a temp directory separate from the data directory; a slow device is acceptable\ntzt_options.localTempDir = \"/tmp\";\ntzt_options.indexNestLevel = 3;\ntzt_options.sampleRatio = 0.01;\ntzt_options.terarkZipMinLevel = 2; // Start using TerarkZipTable from level 2\n\ntable_factory.reset(\n    rocksdb::NewTerarkZipTableFactory(tzt_options, table_factory));\n\noptions.table_factory = table_factory;\n\n// Open DB\nauto status = rocksdb::DB::Open(options, \"/tmp/testdb2\", \u0026db);\n\n// Operations\nstd::string value;\nauto s = db-\u003ePut(rocksdb::WriteOptions(), \"key1\", \"value1\");\ns = db-\u003eGet(rocksdb::ReadOptions(), \"key1\", \u0026value);\nassert(s.ok());\nassert(\"value1\" == value);\n\ns = db-\u003eDelete(rocksdb::WriteOptions(), \"key1\");\nassert(s.ok());\n```\n\n\n# 3. 
Real-world Performance Improvement\nTerarkDB has been deployed in many applications at ByteDance. In most cases it helps reduce latency spikes and improve throughput.\n\n### Disk Write\n![](docs/images/disk_write.png)\n\n### Get Latency (us)\n![](docs/images/get_latency.png)\n\n\n# 4. Contributing\n- TerarkDB uses GitHub issues and pull requests to manage features and bug fixes.\n- All PRs are welcome, including code formatting and refactoring.\n\n\n# 5. License\n- Apache 2.0\n\n# 6. Users\n\nIf you are using TerarkDB, please let us know by joining our Slack channel. Thanks!\n\n- ByteDance (core online services)\n","funding_links":[],"categories":["C++"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbytedance%2Fterarkdb","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbytedance%2Fterarkdb","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbytedance%2Fterarkdb/lists"}