https://github.com/jdockerty/squirrel
A bitcask inspired, replicated key-value store.
- Host: GitHub
- URL: https://github.com/jdockerty/squirrel
- Owner: jdockerty
- Created: 2023-12-21T12:33:24.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2024-07-13T14:33:36.000Z (over 1 year ago)
- Last Synced: 2025-01-13T13:19:52.741Z (about 1 year ago)
- Topics: bitcask-like, key-value, replicated-cache, rust, rust-learning
- Language: Rust
- Homepage:
- Size: 72.3 KB
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 9
Metadata Files:
- Readme: README.md
README
# squirrel
_As in "[to squirrel away][squirrel-away]"._
A (replicated) persistent key-value store which uses a simple implementation of [bitcask](https://github.com/basho/bitcask/blob/develop/doc/bitcask-intro.pdf) as the underlying storage mechanism.
## How it works
This follows a simple structure and exposes a very common API surface: `set(k,v)`, `get(k)`, and `remove(k)`.
### Get
A `get(k)` operation reads from the running store via its `keydir`. The key
**must** exist here for its value to be returned[^1], as the `keydir` contains a
mapping of all known keys to their respective offsets in the active or compacted
log files.
[^1]: When a call to `open` is made, the directory is scanned for any log files,
which means that in the event of a crash or restart the `keydir` is always rebuilt
to its prior state.
If a key exists, its containing log file is opened, the reader seeks to the recorded
offset, and the entry is deserialised for return.
A `None` result signifies either a tombstone value (from a prior removal)
or a key which has never existed. In either case, the value does not exist in the
`keydir`, so no value is returned.
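The lookup path described above can be sketched roughly as follows. This is a minimal illustration rather than the project's actual code: the `KeydirEntry` fields, the `get` signature, and the length-prefixed record layout are all assumptions made for the example, and any `Read + Seek` source stands in for a log file.

```rust
use std::collections::HashMap;
use std::io::{self, Read, Seek, SeekFrom};

/// Hypothetical keydir entry: where the latest record for a key lives.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct KeydirEntry {
    pub file_id: u64, // which log file holds the record
    pub offset: u64,  // byte offset of the record within that file
}

/// Read one value stored as a u32 little-endian length followed by UTF-8 bytes.
/// This layout is an assumption for the sketch, not the project's real format.
fn read_value<R: Read + Seek>(log: &mut R, offset: u64) -> io::Result<String> {
    log.seek(SeekFrom::Start(offset))?;
    let mut len = [0u8; 4];
    log.read_exact(&mut len)?;
    let mut buf = vec![0u8; u32::from_le_bytes(len) as usize];
    log.read_exact(&mut buf)?;
    String::from_utf8(buf).map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))
}

/// Consult the keydir first; a log file is only touched on a hit.
pub fn get<R: Read + Seek>(
    keydir: &HashMap<String, KeydirEntry>,
    logs: &mut HashMap<u64, R>,
    key: &str,
) -> io::Result<Option<String>> {
    let Some(entry) = keydir.get(key) else {
        // Tombstoned or never written: nothing to read.
        return Ok(None);
    };
    let log = logs
        .get_mut(&entry.file_id)
        .ok_or_else(|| io::Error::new(io::ErrorKind::NotFound, "missing log file"))?;
    read_value(log, entry.offset).map(Some)
}
```

Because the keydir is checked before any I/O, a miss costs only a hash lookup, while a hit costs one seek and one read.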
### Set
```mermaid
sequenceDiagram
participant client
participant squirrel
participant WAL as active log file
participant keydir
client->>squirrel: set(k, v)
squirrel->>WAL: Append 'LogEntry'
note right of WAL: LogEntry consists of<br/>key (k), value (v)<br/>and metadata
WAL-->>squirrel: Acknowledged
squirrel->>keydir: Upsert KeydirEntry
note right of keydir: Maps key to<br/>log file and offset
keydir-->>squirrel: Acknowledged
```
> [!NOTE]
> For now, both keys and values are restricted to `String` types.
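The two steps in the diagram (append to the active log, then upsert the keydir) might look something like the sketch below. The `Store` and `KeydirEntry` types and the length-prefixed encoding are hypothetical stand-ins chosen for illustration, and the metadata portion of `LogEntry` is omitted for brevity.

```rust
use std::collections::HashMap;
use std::io::{self, Seek, SeekFrom, Write};

/// Hypothetical keydir entry: log file id plus byte offset of the record.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct KeydirEntry {
    pub file_id: u64,
    pub offset: u64,
}

/// Hypothetical store holding one active log and the in-memory keydir.
pub struct Store<W: Write + Seek> {
    pub active_id: u64,
    pub active_log: W,
    pub keydir: HashMap<String, KeydirEntry>,
}

impl<W: Write + Seek> Store<W> {
    pub fn new(active_id: u64, active_log: W) -> Self {
        Store { active_id, active_log, keydir: HashMap::new() }
    }

    pub fn set(&mut self, key: String, value: String) -> io::Result<()> {
        // Step 1: append the record to the end of the active log,
        // remembering the offset at which it starts.
        let offset = self.active_log.seek(SeekFrom::End(0))?;
        let mut record = Vec::new();
        for field in [key.as_bytes(), value.as_bytes()] {
            // Assumed encoding: u32 little-endian length, then the bytes.
            record.extend_from_slice(&(field.len() as u32).to_le_bytes());
            record.extend_from_slice(field);
        }
        self.active_log.write_all(&record)?;
        self.active_log.flush()?;

        // Step 2: only once the append has succeeded, point the keydir at it.
        // `insert` overwrites any older entry, which is the upsert.
        self.keydir.insert(key, KeydirEntry { file_id: self.active_id, offset });
        Ok(())
    }
}
```

Writing to the log before touching the keydir means a failed append leaves the keydir pointing at the previous, still valid, record.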
### Remove
A call to `remove(k)` is similar to `get(k)`, except a tombstone value, represented
by `None`, is appended to the active log file and updated in the `keydir`.
The tombstone value signifies that the entry should be dropped on the next compaction
cycle[^2]. This means that the value will no longer be present afterwards.
[^2]: Compaction is not a background job; it is a simple check against `MAX_LOG_FILE_SIZE`
after either a `set` or `remove` operation, as these cause an append to the active log file.
Attempting to remove a key which does not exist will result in an error.
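As a rough illustration of the removal and compaction behaviour described above, the sketch below models the active log as an in-memory `Vec`. Everything here is an assumption for the example: the `Store` and `StoreError` types, the value chosen for `MAX_LOG_FILE_SIZE`, dropping the key from the keydir on removal, and the compaction policy of keeping only records the keydir still points at.

```rust
use std::collections::HashMap;

/// Assumed threshold for this sketch; the project's real value may differ.
const MAX_LOG_FILE_SIZE: usize = 64;

#[derive(Debug, PartialEq)]
pub enum StoreError {
    KeyNotFound(String),
}

/// In-memory stand-in for the store: each log record is (key, Some(value)),
/// or (key, None) for a tombstone.
#[derive(Default)]
pub struct Store {
    pub log: Vec<(String, Option<String>)>,
    pub log_bytes: usize,
    pub keydir: HashMap<String, usize>, // key -> index of its live record
}

impl Store {
    fn append(&mut self, key: String, value: Option<String>) -> usize {
        self.log_bytes += key.len() + value.as_ref().map_or(0, |v| v.len());
        self.log.push((key, value));
        self.log.len() - 1
    }

    pub fn set(&mut self, key: String, value: String) {
        let idx = self.append(key.clone(), Some(value));
        self.keydir.insert(key, idx);
        self.maybe_compact(); // checked after every append
    }

    pub fn remove(&mut self, key: &str) -> Result<(), StoreError> {
        if !self.keydir.contains_key(key) {
            return Err(StoreError::KeyNotFound(key.to_string()));
        }
        self.append(key.to_string(), None); // tombstone record
        self.keydir.remove(key);
        self.maybe_compact();
        Ok(())
    }

    /// Not a background job: just a size check after an append.
    fn maybe_compact(&mut self) {
        if self.log_bytes <= MAX_LOG_FILE_SIZE {
            return;
        }
        let old_log = std::mem::take(&mut self.log);
        let old_keydir = std::mem::take(&mut self.keydir);
        self.log_bytes = 0;
        for (idx, (key, value)) in old_log.into_iter().enumerate() {
            // Superseded values and tombstones are not referenced by the
            // keydir, so they are dropped here.
            if old_keydir.get(&key) == Some(&idx) {
                let new_idx = self.append(key.clone(), value);
                self.keydir.insert(key, new_idx);
            }
        }
    }
}
```

With this shape, a tombstone occupies log space only until the next size check trips, after which neither it nor the value it shadowed survives.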
### Client/Server
TL;DR it uses gRPC with `tonic`.
See [`client.rs`](./src/client.rs)/[`server.rs`](./src/server.rs)/[`proto`](./proto/) definitions.
### Replication
A simplistic replication strategy is provided by the [`replication`](./src/replication/mod.rs) module.
## Notes
This was initially built through my implementation of the PingCAP talent plan course for building a key-value store in Rust:
- [Course](https://github.com/pingcap/talent-plan/tree/master/courses/rust#the-goal-of-this-course)
- [Lesson plan](https://github.com/pingcap/talent-plan/blob/master/courses/rust/docs/lesson-plan.md#pna-rust-lesson-plan)
It has since grown into my own toy project.
[squirrel-away]: https://dictionary.cambridge.org/dictionary/english/squirrel-away