{"id":24884241,"url":"https://github.com/siennathesane/cesiumdb","last_synced_at":"2026-04-17T00:02:08.535Z","repository":{"id":273721289,"uuid":"727680872","full_name":"siennathesane/cesiumdb","owner":"siennathesane","description":"Low-level LSM-tree key value store.","archived":false,"fork":false,"pushed_at":"2026-02-10T20:55:43.000Z","size":643,"stargazers_count":2,"open_issues_count":0,"forks_count":1,"subscribers_count":0,"default_branch":"mainline","last_synced_at":"2026-03-08T00:55:00.849Z","etag":null,"topics":["database","lsm-tree","rust"],"latest_commit_sha":null,"homepage":"","language":"Rust","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/siennathesane.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"COPYING","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2023-12-05T11:04:55.000Z","updated_at":"2026-02-10T20:55:47.000Z","dependencies_parsed_at":"2025-01-22T15:45:20.818Z","dependency_job_id":"2fc3ecf0-09db-45c5-acad-c12abe196258","html_url":"https://github.com/siennathesane/cesiumdb","commit_stats":null,"previous_names":["siennathesane/cesiumdb"],"tags_count":4,"template":false,"template_full_name":null,"purl":"pkg:github/siennathesane/cesiumdb","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/siennathesane%2Fcesiumdb","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/siennathesane%2Fcesiumdb/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/siennathesane%2Fcesiumdb/releases","manifests_url":"http
s://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/siennathesane%2Fcesiumdb/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/siennathesane","download_url":"https://codeload.github.com/siennathesane/cesiumdb/tar.gz/refs/heads/mainline","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/siennathesane%2Fcesiumdb/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31909235,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-16T18:22:33.417Z","status":"ssl_error","status_checked_at":"2026-04-16T18:21:47.142Z","response_time":69,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["database","lsm-tree","rust"],"created_at":"2025-02-01T14:26:35.825Z","updated_at":"2026-04-17T00:02:08.508Z","avatar_url":"https://github.com/siennathesane.png","language":"Rust","funding_links":[],"categories":[],"sub_categories":[],"readme":"[![builds.sr.ht status](https://builds.sr.ht/~siennathesane/cesiumdb/commits/feat/builds/amd64.yml.svg)](https://builds.sr.ht/~siennathesane/cesiumdb/commits/feat/builds/amd64.yml?)\n[![codecov](https://codecov.io/gh/siennathesane/cesiumdb/graph/badge.svg?token=D7RBD3OX2U)](https://codecov.io/gh/siennathesane/cesiumdb)\n\n# CesiumDB\n\nA key-value store focused on performance.\n\n# Work In Progress\n\nThis project is an active 
work-in-progress.\n\nIt will likely compile, and most tests will likely pass, but it is not feature-complete yet. The current state of work\nis stabilizing the embedded filesystem implementation so the front-end memtables can rely on the backend embedded\nfilesystem. Once that work is done, then it's just implementing levels (relatively easy) and compaction (easy enough).\n\n## Inspiration\n\nThis project was heavily inspired and influenced by (in no particular order):\n\n* Long compile times for Facebook's `rocksdb`\n* Howard Chu's `lmdb`\n* CockroachDB's `pebble`\n* Ben Johnson's `boltdb`\n* Google's `leveldb`\n* Giorgos Xanthakis et al.'s `parallax`\n* A burning desire to have a Rust-native LSM-tree that has column family/namespace support\n\n## Interesting Features\n\nIt's :sparkles: __FAST__ :sparkles: and has a few interesting features:\n\n* A blazingly fast hybrid logical clock (HLC) for ordering operations instead of MVCC semantics\n* A high-performance, lock-free, thread-safe, portable filesystem that works with block devices\n* An insanely fast bloom filter for fast lookups\n\n### How _Fast_ is Fast?\n\nI'm glad you asked! Here are some benchmarks:\n\n* Internal bloom filter lookups: ~860 _picoseconds_\n* Merge operator: ~115ms for a full table scan of 800,000 keys across 8 memtables\n\n## Usage\n\nAdd this to your `Cargo.toml`:\n\n```toml\n[dependencies]\ncesiumdb = \"1.0\"\n```\n\nAnd use:\n\n```rust\nuse cesiumdb::CesiumDB;\n\n// use a temp file, most useful for testing\nlet db = CesiumDB::default();\n\n// no namespace\ndb.put(b\"key\", b\"value\");\ndb.get(b\"key\");\n\n// with a namespace\ndb.put(1, b\"key\", b\"value\");\ndb.get(1, b\"key\");\n```\n\nSee the [API documentation](https://docs.rs/cesiumdb) for more information.\n\n## Namespaces are not Column Families\n\nCesiumDB uses a construct I call \"namespacing\". It's a way for data of a similar type to be grouped together, but it is\nnot stored separately from other namespaced data. 
Namespaces are ultimately glorified range markers to ensure fast data\nlookups across a large set of internal data, and a bit of a way to make it easy for users to manage their data. I would\nargue namespaces are closer to tables than column families.\n\n## Hybrid Logical Clocks\n\nCesiumDB lets you bring your own hybrid logical clock implementation for key versioning. This is useful if you have\na specific HLC implementation you want to use, or if you want to use a different clock entirely. This is done by\nimplementing the `HLC` trait and passing it to the `CesiumDB` constructor. However, if you can provide a more precise\nclock than the provided one, please submit an issue or PR so we can all benefit from it.\n\n## Unsafety: Or... How To Do Dangerous Things Safely\n\nThere is a non-trivial amount of `unsafe` code. Most of it is related to the internal implementation with `mmap` (which\ncannot be made safe) and its entry points (the handlers and such). I also make use of pointer arithmetic on\nmemory-mapped file locations. This is one of the areas where safety comes at the cost of performance. However, if you\ncan find a way to make it safe, please submit an issue or PR. I would love to see it!\n\nThere is :sparkles: __EXTENSIVE__ :sparkles: testing around the `unsafe` code, and I am confident in its correctness. My\ngoal is to keep this project at a high degree of code coverage with tests to help continue to ensure said confidence.\nHowever, if you find a bug, please submit an issue or PR.\n\n## Contributing\n\nContributions are welcome! Please submit a PR with your changes. 
If you're unsure about the changes, please submit an\nissue first.\n\n## To Do's\n\nAn alphabetical list of things I'd like to actually do for the long-term safety and stability of the project.\n\n- [ ] Add `loom` integration tests.\n- [ ] Add `miri` integration tests.\n- [ ] Add more granular `madvise` commands to the filesystem to give the kernel some hints.\n- [ ] Add some kind of `fsck` and block checksums since journaling is already present. There are basic unit tests for\n  this but no supported tool for it.\n- [ ] Bloom filter size is currently hardcoded. I'd like to make it configurable.\n- [ ] Determine how to expose the untrustworthiness of the bloom filter.\n- [ ] Figure out how hard it would be to support `no_std` for the embedded workloads. I suspect it would be... difficult\n  lol\n- [ ] Investigate the point at which we can no longer `mmap` a physical device. Theoretically, even without swap space,\n  I can `mmap` a 1TiB physical device to the filesystem implementation. But I feel like shit gets real weird. Idk, it's\n  a Linux-ism I want to investigate.\n- [ ] Remove the question mark operator.\n- [ ] Revisit the merge iterator. The benchmarks have it at ~115ms for a full scan of 8 memtables with 100,000 keys\n  each. I have no idea if this is a mismatch of my expectations or a gross inability of mine to optimize it further.\n  Every optimization I've tried is 5-20% slower (including my own cache-optimized min heap) than this.\n- [ ] Write some kind of auto-configuration for the generalized configs.\n\n## License\n\nCesiumDB is licensed under GPL v3.0 with the Class Path Exception. This means you can safely link to CesiumDB in your\nproject. 
So it's safe for corporate consumption, just not closed-source modification :simple_smile:\n\nIf you would like a non-GPL license, please reach out :simple_smile:\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsiennathesane%2Fcesiumdb","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsiennathesane%2Fcesiumdb","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsiennathesane%2Fcesiumdb/lists"}