Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/portasynthinca3/sidx
Simple key-value store with subindex support for the BEAM implemented in pure Elixir. Mainly intended to be used by the `markov` library
https://github.com/portasynthinca3/sidx
database elixir key-value
Last synced: about 2 months ago
JSON representation
Simple key-value store with subindex support for the BEAM implemented in pure Elixir. Mainly intended to be used by the `markov` library
- Host: GitHub
- URL: https://github.com/portasynthinca3/sidx
- Owner: portasynthinca3
- License: wtfpl
- Created: 2022-12-27T21:39:04.000Z (almost 2 years ago)
- Default Branch: master
- Last Pushed: 2023-10-19T16:30:34.000Z (about 1 year ago)
- Last Synced: 2024-10-28T17:22:45.534Z (about 2 months ago)
- Topics: database, elixir, key-value
- Language: Elixir
- Homepage:
- Size: 183 KB
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE.txt
Awesome Lists containing this project
README
# Subindex
Simple key-value store with subindex support for the BEAM implemented in pure
Elixir. Mainly intended to be used by the
[markov](https://github.com/portasynthinca3/markov) library.## Goals
This database was created for a very specific purpose - to support the
aforementioned markov library, which in turn powers the
[Deuterium](https://github.com/portasynthinca3/deutexrium) Discord bot. It
strives to achieve the following goals:
- Integrated into the BEAM VM as its own OTP app.
- Write speed at the expense of durability; this is because Deuterium receives
about 10 write requests for every read request.
- Support for "subindexing", or partial keys.
- No need for entire tables to be loaded into memory; this requirement arose
after trying `ets` and `mnesia`.## Installation
```elixir
defp deps do
[
{:sidx, "~> 0.1"}
]
end
```## Usage
Example workflow:
```elixir
# loads table at "./path" or creates one if it doesn't exist
table = Sidx.open!("./path", keys: 3):ok = Sidx.insert(table, [:a, :b, :c], :d) # the 3 keys and a value
:ok = Sidx.insert(table, [:e, :f, :g], :h)
:ok = Sidx.insert(table, [:e, :f, :i], :j)
:ok = Sidx.insert(table, [:e, :k, :l], :m)Sidx.select(table, [:a, :b, :c]) # {:ok, [{[], :d}]}
Sidx.select(table, [:e, :f, :g]) # {:ok, [{[], :h}]}
Sidx.select(table, [:e, :f, :i]) # {:ok, [{[], :j}]}
Sidx.select(table, [:e, :f]) # {:ok, [{[:g], :h}, {[:i], :j}]}
Sidx.select(table, [:e, :k]) # {:ok, [{[:l], :m}]}
Sidx.select(table, [:e]) # {:ok, [
# {[:g, :f], :h} # subkeys in reverse order
# {[:i, :f], :j},
# {[:l, :k], :m}]
# }# inserts with keys to an existing row overwrite the value
:ok = Sidx.insert(table, [:a, :b, :c], 0)
Sidx.select(table, [:a, :b, :c]) # {:ok, [{[], 0}]}# update is atomic and more efficient than select+insert
Sidx.update(table, [:a, :b, :c], fn _path, value -> value + 1 end)
Sidx.select(table, [:a, :b, :c]) # {:ok, [{[], 1}]}Sidx.close!(table)
```## Inner workings
- The database is partitioned by the first key in the key list
- Every partition consists of a **key tree** and a **value store**
- Partitions are **serialized** into a binary
- A serialized partition is **fragmented** into equally-sized **slots**
- Multiple serialized partitions' slots are intertwined and put into one file
in a process called **unification**
- When a partition needs to be read from disk, the unifier will first
**collect** the slots corresponding to that partition which will then be
**concatenated** and finally **deserialized**![Visual representation of the above operations](./inner_workings.png)
## Results
This database has achieved the required performance target, however it fails to
fail gracefully. A sudden stop of the VM or an interruption of the writing
process leads to a corrupted database. Ideally, this should not happen, however
it is by design _okay_ to lose writes in such an event.