An open API service indexing awesome lists of open source software.

https://github.com/healeycodes/bitcask-lite

📚 A log-structured hash table database. Speedy K/V store for datasets larger than memory.
https://github.com/healeycodes/bitcask-lite

bitcask database log-structured riak

Last synced: 7 months ago
JSON representation

📚 A log-structured hash table database. Speedy K/V store for datasets larger than memory.

Awesome Lists containing this project

README

          

# bitcask-lite
> My blog post: [Implementing Bitcask, a Log-Structured Hash Table](https://healeycodes.com/implementing-bitcask-a-log-structured-hash-table/)

A key/value database and server. Partial implementation of the Bitcask paper: https://riak.com/assets/bitcask-intro.pdf

- Low latency per item read or written
- Handles datasets larger than RAM
- Human readable data format
- Small specification
- Human-readable data format
- Just uses the Go standard library

## Spec

Keys are kept in-memory and point to values in log files. Log files are append-only and contain any number of adjacent items with the schema: `expire, keySize, valueSize, key, value,`.

An item with a key of `a` and a value of `b` that expires on 10 Aug 2022 looks like this in a log file:

```text
1759300313415,1,1,a,b,
```

Not yet implemented: checksums, log file merging, hintfiles.

### HTTP API

- GET: `/get?key=a`
- POST: `/set?key=b&expire=1759300313415`
- HTTP body is read as the value
- `expire` is optional (default is infinite)
- DELETE: `/delete?key=c`

## Performance

The key store is a concurrent map with locking map shards.

Reading a value requires a single disk seek.

Only one goroutine may write to the the active log file at a time so read-heavy workloads are ideal.

## Tests

Tests perform real I/O to disk and generate new files every run.

```bash
pip install -r requirements.txt # (it just uses the requests library)
python e2e.py # run e2e tests covering the main function
go test ./... # unit tests
```

## Deployment

As this is fairly standard Go application: set `PORT`, `DATABASE_DIR`, and run.

It deploys to `railway.app` with zero configuration (presumably most platforms-as-a-service as well).