https://github.com/healeycodes/bitcask-lite
📚 A log-structured hash table database. Speedy K/V store for datasets larger than memory.
- Host: GitHub
- URL: https://github.com/healeycodes/bitcask-lite
- Owner: healeycodes
- Created: 2022-08-09T20:52:49.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2022-08-14T21:50:48.000Z (over 3 years ago)
- Last Synced: 2025-05-05T14:52:14.581Z (11 months ago)
- Topics: bitcask, database, log-structured, riak
- Language: Go
- Size: 56.6 KB
- Stars: 25
- Watchers: 1
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# bitcask-lite
> My blog post: [Implementing Bitcask, a Log-Structured Hash Table](https://healeycodes.com/implementing-bitcask-a-log-structured-hash-table/)
A key/value database and server. Partial implementation of the Bitcask paper: https://riak.com/assets/bitcask-intro.pdf
- Low latency per item read or written
- Handles datasets larger than RAM
- Human-readable data format
- Small specification
- Just uses the Go standard library
## Spec
Keys are kept in memory and point to values in log files. Log files are append-only and contain any number of adjacent items with the schema: `expire, keySize, valueSize, key, value,`.
An item with a key of `a` and a value of `b` that expires on 1 Oct 2025 (the `expire` field here is a Unix timestamp in milliseconds) looks like this in a log file:
```text
1759300313415,1,1,a,b,
```
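The item layout above can be sketched in Go as follows. This is a minimal illustration, not the repository's actual code: the `Item` type and the `encodeItem` and `decodeItem` names are hypothetical. It assumes `keySize` and `valueSize` are byte counts, which would let keys and values contain commas themselves.

```go
package main

import (
	"bytes"
	"errors"
	"fmt"
	"strconv"
)

// Item is a hypothetical in-memory form of one log record.
type Item struct {
	Expire int64 // assumed to be a Unix timestamp in milliseconds
	Key    string
	Value  []byte
}

// encodeItem renders an item as `expire,keySize,valueSize,key,value,`.
func encodeItem(it Item) []byte {
	var b bytes.Buffer
	fmt.Fprintf(&b, "%d,%d,%d,", it.Expire, len(it.Key), len(it.Value))
	b.WriteString(it.Key)
	b.WriteByte(',')
	b.Write(it.Value)
	b.WriteByte(',')
	return b.Bytes()
}

// decodeItem parses one item from the front of buf and returns it together
// with the number of bytes consumed, so adjacent items can be read in a loop.
func decodeItem(buf []byte) (Item, int, error) {
	var header [3]int64 // expire, keySize, valueSize
	pos := 0
	for i := range header {
		end := bytes.IndexByte(buf[pos:], ',')
		if end < 0 {
			return Item{}, 0, errors.New("truncated header")
		}
		n, err := strconv.ParseInt(string(buf[pos:pos+end]), 10, 64)
		if err != nil {
			return Item{}, 0, fmt.Errorf("bad header field %d: %w", i, err)
		}
		header[i] = n
		pos += end + 1
	}
	rest := int64(len(buf) - pos)
	keySize, valueSize := header[1], header[2]
	if keySize < 0 || valueSize < 0 || keySize > rest || valueSize > rest ||
		keySize+valueSize+2 > rest {
		return Item{}, 0, errors.New("truncated item")
	}
	keyEnd := pos + int(keySize)
	valueEnd := keyEnd + 1 + int(valueSize)
	if buf[keyEnd] != ',' || buf[valueEnd] != ',' {
		return Item{}, 0, errors.New("missing separator")
	}
	it := Item{
		Expire: header[0],
		Key:    string(buf[pos:keyEnd]),
		Value:  append([]byte(nil), buf[keyEnd+1:valueEnd]...),
	}
	return it, valueEnd + 1, nil
}

func main() {
	raw := encodeItem(Item{Expire: 1759300313415, Key: "a", Value: []byte("b")})
	fmt.Printf("%s\n", raw)

	it, n, err := decodeItem(raw)
	fmt.Println(it.Key, string(it.Value), n, err)
}
```

Because the sizes come before the payload, a reader can skip over the key and value without scanning for delimiters.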
Not yet implemented: checksums, log file merging, hintfiles.
### HTTP API
- GET: `/get?key=a`
- POST: `/set?key=b&expire=1759300313415`
  - HTTP body is read as the value
  - `expire` is optional (default is infinite)
- DELETE: `/delete?key=c`
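As a rough illustration of the endpoints listed above, the following Go sketch builds the three requests with the standard library. The helper names and the base address `http://localhost:8000` are assumptions for the example, and treating `expire <= 0` as "omit the parameter" is this sketch's own convention, not something the server defines.

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
	"net/url"
	"strconv"
)

// endpoint joins the base address, a path, and an encoded query string.
func endpoint(base, path string, q url.Values) string {
	return base + path + "?" + q.Encode()
}

// newGetRequest builds GET /get?key=...
func newGetRequest(base, key string) (*http.Request, error) {
	return http.NewRequest(http.MethodGet, endpoint(base, "/get", url.Values{"key": {key}}), nil)
}

// newSetRequest builds POST /set?key=...[&expire=...] with the value as the body.
// An expire of zero or less leaves the parameter out (a convention of this sketch).
func newSetRequest(base, key string, value []byte, expire int64) (*http.Request, error) {
	q := url.Values{"key": {key}}
	if expire > 0 {
		q.Set("expire", strconv.FormatInt(expire, 10))
	}
	return http.NewRequest(http.MethodPost, endpoint(base, "/set", q), bytes.NewReader(value))
}

// newDeleteRequest builds DELETE /delete?key=...
func newDeleteRequest(base, key string) (*http.Request, error) {
	return http.NewRequest(http.MethodDelete, endpoint(base, "/delete", url.Values{"key": {key}}), nil)
}

func main() {
	base := "http://localhost:8000" // assumed address of a running server

	set, err := newSetRequest(base, "b", []byte("hello"), 1759300313415)
	if err != nil {
		panic(err)
	}
	get, err := newGetRequest(base, "a")
	if err != nil {
		panic(err)
	}
	del, err := newDeleteRequest(base, "c")
	if err != nil {
		panic(err)
	}
	for _, req := range []*http.Request{get, set, del} {
		fmt.Println(req.Method, req.URL)
	}
	// To actually send one: resp, err := http.DefaultClient.Do(set)
}
```

`url.Values.Encode` sorts parameters by name, so `expire` appears before `key` in the generated URL; the query is equivalent to the one shown in the list.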
## Performance
The key store is a concurrent map with locking map shards.
Reading a value requires a single disk seek.
Only one goroutine may write to the active log file at a time, so read-heavy workloads are ideal.
## Tests
Tests perform real I/O to disk and generate new files every run.
```bash
pip install -r requirements.txt # (it just uses the requests library)
python e2e.py # run e2e tests covering the main function
go test ./... # unit tests
```
## Deployment
This is a fairly standard Go application: set `PORT` and `DATABASE_DIR`, then run it.
It deploys to `railway.app` with zero configuration (and presumably to most other platforms-as-a-service as well).
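For illustration, reading the two environment variables might look like the sketch below. The `config` type and `loadConfig` function are hypothetical, and returning an error when a variable is unset is a choice made for this example; how the real server handles missing variables is not stated here.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// config holds the two deployment settings named in this section.
type config struct {
	Addr        string // listen address derived from PORT
	DatabaseDir string // directory for log files, from DATABASE_DIR
}

// loadConfig reads settings through getenv so it can be tested without
// touching the real environment. Requiring both variables is an assumption.
func loadConfig(getenv func(string) string) (config, error) {
	port := getenv("PORT")
	dir := getenv("DATABASE_DIR")
	if port == "" || dir == "" {
		return config{}, errors.New("PORT and DATABASE_DIR must both be set")
	}
	return config{Addr: ":" + port, DatabaseDir: dir}, nil
}

func main() {
	cfg, err := loadConfig(os.Getenv)
	if err != nil {
		fmt.Println("config error:", err)
		return
	}
	fmt.Printf("would listen on %s and store logs in %s\n", cfg.Addr, cfg.DatabaseDir)
}
```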