https://github.com/pyrustic/jinbase
Multi-model transactional embedded database
- Host: GitHub
- URL: https://github.com/pyrustic/jinbase
- Owner: pyrustic
- License: mit
- Created: 2023-09-27T12:10:13.000Z (over 1 year ago)
- Default Branch: master
- Last Pushed: 2024-12-10T20:58:01.000Z (5 months ago)
- Last Synced: 2025-03-30T16:12:12.782Z (about 2 months ago)
- Topics: blob, database, embedded-data, kv-store, multi-model, persistence, pyrustic, python, queue, relational-database, sql, sqlite, stack
- Language: Python
- Homepage:
- Size: 69.3 KB
- Stars: 68
- Watchers: 2
- Forks: 1
- Open Issues: 0
[License: MIT](https://opensource.org/licenses/MIT)
[PyPI package](https://pypi.org/project/jinbase)
[Downloads](https://pepy.tech/project/jinbase)

# Jinbase
Multi-model transactional embedded database

## Table of contents
- [Overview](#overview)
    - [Multiple data models coexisting in a single embedded database](#multiple-data-models-coexisting-in-a-single-embedded-database)
    - [Support for transactions and complex data of arbitrary size](#support-for-transactions-and-complex-data-of-arbitrary-size)
    - [Bulk and partial access to records from byte-level to field-level](#bulk-and-partial-access-to-records-from-byte-level-to-field-level)
    - [Highly configurable database and timestamped records](#highly-configurable-database-and-timestamped-records)
- [Why use Jinbase ?](#why-use-jinbase-)
- [Data models and their corresponding storage interfaces](#data-models-and-their-corresponding-storage-interfaces)
    - [Kv](#kv)
    - [Depot](#depot)
    - [Queue](#queue)
    - [Stack](#stack)
    - [Relational](#relational)
- [The unified BLOB interface](#the-unified-blob-interface)
- [Command line interface](#command-line-interface)
- [Related projects](#related-projects)
- [Testing and contributing](#testing-and-contributing)
- [Installation](#installation)

# Overview
**Jinbase** (pronounced as **/ˈdʒɪnˌbeɪs/**) is a multi-model [transactional](https://en.wikipedia.org/wiki/Database_transaction) [embedded database](https://en.wikipedia.org/wiki/Embedded_database) that uses [SQLite](https://www.sqlite.org/) as storage engine. Its reference implementation is an eponymous lightweight [Python](https://www.python.org/) library available on [PyPI](#installation).

## Multiple data models coexisting in a single embedded database
A single Jinbase database supports **key-value**, **depot**, **queue**, **stack**, and **relational** data models. While a Jinbase file can be populated with multi-model data, depending on the needs, it is quite possible to dedicate a database file to a given model.

For each of the first four data models, there is a programmatic interface accessible via an eponymous property of a Jinbase instance.
## Support for transactions and complex data of arbitrary size
Having SQLite as the storage engine allows Jinbase to benefit from [transactions](https://www.sqlite.org/lang_transaction.html). Jinbase ensures that at the top level, reads and writes on key-value, depot, queue and stack stores are transactional. For user convenience, context managers are exposed to create transactions of different modes.

When the user submits data for a write operation, whether it is a dictionary, a string, or an integer, Jinbase serializes it (except for binary data), chunks it, and stores it iteratively with the [Paradict](https://github.com/pyrustic/paradict) compact binary data format. This allows for the smooth storage of complex data of [arbitrary size](https://www.sqlite.org/limits.html) with Jinbase.
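To make the "serialize, chunk, store in a transaction" idea more concrete, here is a minimal sketch built only on Python's standard `sqlite3` module. It is not Jinbase's implementation: the `demo_chunk` table, the `store_record` and `load_record` helpers, and the tiny `CHUNK_SIZE` are assumptions made for illustration, and JSON stands in for the Paradict binary format.

```python
import json
import sqlite3

CHUNK_SIZE = 4  # hypothetical, tiny on purpose so the record spans several chunks


def store_record(conn, record_id, payload, chunk_size=CHUNK_SIZE):
    """Split 'payload' (bytes) into chunks and store them atomically."""
    chunks = [payload[i:i + chunk_size]
              for i in range(0, len(payload), chunk_size)]
    # 'with conn' commits on success and rolls back if an exception is raised
    with conn:
        conn.executemany(
            "INSERT INTO demo_chunk (record_id, position, data) VALUES (?, ?, ?)",
            [(record_id, pos, chunk) for pos, chunk in enumerate(chunks)])
    return len(chunks)


def load_record(conn, record_id):
    """Reassemble the payload by concatenating its chunks in order."""
    rows = conn.execute(
        "SELECT data FROM demo_chunk WHERE record_id = ? ORDER BY position",
        (record_id,))
    return b"".join(row[0] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE demo_chunk (record_id INTEGER, position INTEGER, data BLOB)")

# JSON stands in for the Paradict binary format here
payload = json.dumps({"id": 42, "name": "alex"}).encode("utf-8")
n_chunks = store_record(conn, 1, payload)
restored = json.loads(load_record(conn, 1).decode("utf-8"))
```

Because all chunks of a record are written inside one transaction, a reader never observes a partially written record in this sketch.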
## Bulk and partial access to records from byte-level to field-level
Jinbase not only offers bulk access to records, but also two levels of partial access granularity.

SQLite has an impressive capability which is [incremental I/O](https://sqlite.org/c3ref/blob_open.html) for [BLOBs](https://www.sqlite.org/datatype3.html). While this capability is designed to target an individual BLOB column in a row, Jinbase extends this so that for each record, incremental reads cover all chunks as if they were a single [unified BLOB](#the-unified-blob-interface).
For [dictionary](https://en.wikipedia.org/wiki/Associative_array) records only, Jinbase automatically creates and maintains a lightweight index consisting of pointers to root fields, which then allows extracting from an arbitrary record the contents of a field automatically deserialized before being returned.
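The field-level access described above can be pictured with a small standalone sketch. It is only an illustration of the general technique (an index of byte ranges pointing at root fields), not Jinbase's actual index layout: `pack_record` and `read_field` are hypothetical helpers, and JSON is used in place of Paradict.

```python
import json


def pack_record(record):
    """Serialize each root field separately and index its byte range."""
    blob = bytearray()
    index = {}
    for field, value in record.items():
        encoded = json.dumps(value).encode("utf-8")
        index[field] = (len(blob), len(encoded))  # (offset, length)
        blob += encoded
    return bytes(blob), index


def read_field(blob, index, field):
    """Deserialize a single root field without decoding the rest."""
    offset, length = index[field]
    return json.loads(blob[offset:offset + length].decode("utf-8"))


record = {"id": 42, "name": "alex",
          "books": {"sci-fi": ["book 1", "book 2"]}}
blob, index = pack_record(record)
books = read_field(blob, index, "books")
```

Only the bytes belonging to the requested field are decoded, which is what makes partial access cheap for large records.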
## Highly configurable database and timestamped records
Jinbase exposes a database connection object to the underlying SQLite storage engine, allowing for sophisticated configuration. The [Paradict](https://github.com/pyrustic/paradict) binary data format used for serializing records also allows for customizing data types via a `paradict.TypeRef` object.

Each record stored in a key-value, depot, queue, or stack store is automatically timestamped. This allows the user to provide a `time_range` tuple when querying records. The precision of the timestamp, which defaults to milliseconds, can also be configured.
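As a rough illustration of how a `time_range` tuple can narrow a query over timestamped records, the sketch below uses plain `sqlite3`. The `demo_record` table, the `to_millis` and `keys_in_range` helpers, and the inclusive bounds are assumptions for the example; they do not describe Jinbase's internal schema or exact semantics.

```python
import sqlite3
from datetime import datetime, timezone


def to_millis(text):
    """Convert 'YYYY-MM-DD HH:MM:SSZ' (UTC) to milliseconds since the epoch."""
    dt = datetime.strptime(text, "%Y-%m-%d %H:%M:%SZ")
    return int(dt.replace(tzinfo=timezone.utc).timestamp() * 1000)


def keys_in_range(conn, time_range):
    """Return the keys whose timestamp falls inside 'time_range' (inclusive)."""
    start, end = (to_millis(t) for t in time_range)
    rows = conn.execute(
        "SELECT key FROM demo_record WHERE timestamp BETWEEN ? AND ? "
        "ORDER BY timestamp", (start, end))
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo_record (key TEXT, timestamp INTEGER)")
conn.executemany(
    "INSERT INTO demo_record VALUES (?, ?)",
    [("old", to_millis("2020-01-01 00:00:00Z")),
     ("recent", to_millis("2024-12-01 12:00:00Z"))])

time_range = ("2024-11-20 10:00:00Z", "2035-11-20 10:00:00Z")
matching = keys_in_range(conn, time_range)
```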
# Why use Jinbase ?
Jinbase implements persistence for familiar data models whose stores coexist in a single file with an intuitive programmatic interface. Supported [data types](https://github.com/pyrustic/paradict) range from simple to complex and of [arbitrary size](https://www.sqlite.org/limits.html).

For convenience, all Jinbase-related tables are prefixed with `jinbase_`, allowing the user to define their own tables and interact with them as they would with a regular SQLite database.
Thanks to its multi-model coexistence capability, Jinbase can be used to open legacy SQLite databases to add four useful data models (key-value, depot, queue, and stack).
All this makes Jinbase relevant from prototype to production stages of software development of various sizes and scopes.
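The coexistence of `jinbase_`-prefixed tables with user-defined tables can be sketched with the standard `sqlite3` module alone. The table `jinbase_demo` below is a made-up stand-in for a Jinbase-managed table (the real table names are not listed here), and `customer` plays the role of an application table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# 'jinbase_demo' is a made-up name standing in for a Jinbase-managed table
conn.execute("CREATE TABLE jinbase_demo (id INTEGER PRIMARY KEY, data BLOB)")
# a table owned by the application
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO customer (name) VALUES (?)", ("alex",))

# '_' is a single-character wildcard in LIKE, hence the ESCAPE clause
query = ("SELECT name FROM sqlite_master "
         "WHERE type='table' AND name NOT LIKE 'jinbase\\_%' ESCAPE '\\' "
         "AND name NOT LIKE 'sqlite\\_%' ESCAPE '\\' ORDER BY name")
user_tables = [row[0] for row in conn.execute(query)]
names = [row[0] for row in conn.execute("SELECT name FROM customer")]
```

Filtering on the prefix is enough to tell application tables apart from the ones managed by the library, so both can live in the same file.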
The following are a few of the most obvious use cases:
- Storing user preferences
- Persisting session data before exit
- Order-based processing of data streams
- Exposing data for other processes
- Upgrading legacy SQLite files with new data models
- Bespoke data persistence solution

# Data models and their corresponding storage interfaces
The following subsections discuss data models and their corresponding storage interfaces.

## Kv
The [key-value](https://en.wikipedia.org/wiki/Key%E2%80%93value_database) data model associates a string or integer key with a value that is serializable with the [Paradict](https://github.com/pyrustic/paradict) binary data format.

String keys can be searched with a [glob](https://en.wikipedia.org/wiki/Glob_(programming)) pattern and integer keys can be searched within a range of numbers. Since records are automatically timestamped, the user can also provide a `time_range` tuple to restrict a search to records whose timestamps fall within that range.
Example:
```python
import os.path
from datetime import datetime
from jinbase import Jinbase, JINBASE_HOME

user_data = {"id": 42, "name": "alex", "created_at": datetime.now(),
             "photo": b'\x45\xA6\x42\xDF\x69',
             "books": {"sci-fi": ["book 1", "book 2"],
                       "thriller": ["book 3", ["book4"]]}}

db_filename = os.path.join(JINBASE_HOME, "test.db")

with Jinbase(db_filename) as db:
    # set 'user'
    kv_store = db.kv
    kv_store.set("user", user_data)  # returns a UID

    # get 'user'
    data = kv_store.get("user")
    assert data == user_data

    # count total records and bytes
    print(kv_store.count_records())
    print(kv_store.count_bytes("user"))

    # list keys (the time_range is optional)
    time_range = ("2024-11-20 10:00:00Z", "2035-11-20 10:00:00Z")
    print(tuple(kv_store.keys(time_range=time_range)))

    # find string keys with a glob pattern
    print(tuple(kv_store.str_keys(glob="use*")))

    # load the 'books' field (partial access)
    books = kv_store.load_field("user", "books")
    assert books == user_data["books"]

    # iterate (descending order)
    for key, value in kv_store.iterate(asc=False):
        pass
```
> Check out the API reference for the [key-value store](https://github.com/pyrustic/jinbase/blob/master/docs/api/modules/jinbase/store/kv/class-Kv.md).

## Depot
The depot data model shares similarities with the [List](https://en.wikipedia.org/wiki/List_(abstract_data_type)) data structure. A unique identifier (UID) is automatically assigned to a record appended to the store. This record can be retrieved later either by its unique identifier or by its [0-based](https://en.wikipedia.org/wiki/Zero-based_numbering) position in the store.

Example:
```python
import os.path
from datetime import datetime
from jinbase import Jinbase, JINBASE_HOME

user_data = {"id": 42, "name": "alex", "created_at": datetime.now(),
             "photo": b'\x45\xA6\x42\xDF\x69',
             "books": {"sci-fi": ["book 1", "book 2"],
                       "thriller": ["book 3", ["book4"]]}}

db_filename = os.path.join(JINBASE_HOME, "test.db")

with Jinbase(db_filename) as db:
    # append 'user_data' to the depot
    depot_store = db.depot
    uid = depot_store.append(user_data)

    # get 'user_data'
    data = depot_store.get(uid)
    assert data == user_data

    # get the UID of the record at position 0 in the depot
    print(depot_store.uid(0))  # prints the UID
    # get the position of a record in the depot
    print(depot_store.position(uid))  # prints the position

    # count total records and bytes
    print(depot_store.count_records())
    print(depot_store.count_bytes(uid))

    # list UIDs (unique identifiers)
    time_range = ("2024-11-20 10:00:00Z", "2035-11-20 10:00:00Z")
    print(tuple(depot_store.uids(time_range=time_range)))

    # load the 'books' field (partial access)
    books = depot_store.load_field(uid, "books")
    assert books == user_data["books"]

    # iterate (descending order)
    for uid, data in depot_store.iterate(asc=False):
        pass
```

> Check out the API reference for the [depot store](https://github.com/pyrustic/jinbase/blob/master/docs/api/modules/jinbase/store/depot/class-Depot.md).
## Queue
The [queue](https://en.wikipedia.org/wiki/Queue_(abstract_data_type)) data model, like other stores, is transactional. While this store provides methods to enqueue and dequeue records, there are also `peek_xxx` methods to look at the record at the front or the back of the queue, that is, read it without dequeuing.

Example:
```python
import os.path
from datetime import datetime
from jinbase import Jinbase, JINBASE_HOME

user_data = {"id": 42, "name": "alex", "created_at": datetime.now(),
             "photo": b'\x45\xA6\x42\xDF\x69',
             "books": {"sci-fi": ["book 1", "book 2"],
                       "thriller": ["book 3", ["book4"]]}}

db_filename = os.path.join(JINBASE_HOME, "test.db")

with Jinbase(db_filename) as db:
    # enqueue 'user_data'
    queue_store = db.queue
    queue_store.enqueue(user_data)  # returns a UID

    # peek
    data1 = queue_store.peek_front()
    data2 = queue_store.peek_back()
    assert data1 == data2 == user_data

    # dequeue
    data = queue_store.dequeue()
    assert data == user_data

    # we could have dequeued the record inside a transaction
    # to ensure that its processing completed successfully
    # (if it fails, an automatic rollback is performed)
    with db.write_transaction():
        data = queue_store.dequeue()
        # from here, process the data
        ...
```

> Check out the API reference for the [queue store](https://github.com/pyrustic/jinbase/blob/master/docs/api/modules/jinbase/store/queue/class-Queue.md).
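The "dequeue inside a transaction" pattern mentioned in the comments above relies on rollback semantics that can be demonstrated with `sqlite3` alone. The following sketch is not Jinbase code: the `demo_queue` table and the `dequeue` and `count` helpers are hypothetical, and the connection's own context manager plays the role of a write transaction.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo_queue (id INTEGER PRIMARY KEY, data TEXT)")
with conn:
    conn.execute("INSERT INTO demo_queue (data) VALUES (?)", ("job-1",))


def dequeue(conn):
    """Remove and return the record at the front of the queue."""
    row = conn.execute(
        "SELECT id, data FROM demo_queue ORDER BY id LIMIT 1").fetchone()
    if row is None:
        return None
    conn.execute("DELETE FROM demo_queue WHERE id = ?", (row[0],))
    return row[1]


def count(conn):
    """Return the number of records still in the queue."""
    return conn.execute("SELECT COUNT(*) FROM demo_queue").fetchone()[0]


# failed processing: the exception triggers a rollback, the record stays queued
try:
    with conn:
        data = dequeue(conn)
        raise RuntimeError("processing failed")
except RuntimeError:
    pass
count_after_failure = count(conn)

# successful processing: the transaction commits, the record is gone
with conn:
    data = dequeue(conn)
count_after_success = count(conn)
```

Tying the removal of a record to the success of its processing means a crash or an exception does not silently lose work.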
## Stack
The [stack](https://en.wikipedia.org/wiki/Stack_(abstract_data_type)) data model, like other stores, is transactional. While this store provides methods to push and pop records, there is also a `peek` method to look at the record on top of the stack, that is, read it without popping it from the stack.

Example:
```python
import os.path
from datetime import datetime
from jinbase import Jinbase, JINBASE_HOME

user_data = {"id": 42, "name": "alex", "created_at": datetime.now(),
             "photo": b'\x45\xA6\x42\xDF\x69',
             "books": {"sci-fi": ["book 1", "book 2"],
                       "thriller": ["book 3", ["book4"]]}}

db_filename = os.path.join(JINBASE_HOME, "test.db")

with Jinbase(db_filename) as db:
    # push 'user_data' on top of the stack
    stack_store = db.stack
    stack_store.push(user_data)

    # peek
    data = stack_store.peek()
    assert data == user_data

    # pop 'user_data'
    data = stack_store.pop()
    assert data == user_data

    # we could have popped the record inside a transaction
    # to ensure that its processing completed successfully
    # (if it fails, an automatic rollback is performed)
    with db.write_transaction():
        data = stack_store.pop()
        # from here, process the data
        ...
```

> Check out the API reference for the [stack store](https://github.com/pyrustic/jinbase/blob/master/docs/api/modules/jinbase/store/stack/class-Stack.md).
## Relational
As Jinbase uses [SQLite](https://en.wikipedia.org/wiki/SQLite) as its storage engine, it de facto supports the [relational](https://en.wikipedia.org/wiki/Relational_model) data model for which it exposes an interface, [LiteDBC](https://github.com/pyrustic/litedbc), for querying SQLite.

LiteDBC is an SQL interface compliant with the DB-API 2.0 specification described by [PEP 249](https://peps.python.org/pep-0249/), itself wrapping Python's [sqlite3](https://docs.python.org/3/library/sqlite3.html) module for a more intuitive interface and multithreading support by default.
Example:
```python
import os.path
from jinbase import Jinbase, JINBASE_HOME

db_filename = os.path.join(JINBASE_HOME, "test.db")

with Jinbase(db_filename) as db:
    lite_dbc = db.dbc
    with lite_dbc.transaction() as cursor:
        # query the table names that exist in this database
        query = ("SELECT name FROM sqlite_master "
                 "WHERE type='table' AND name NOT LIKE 'sqlite_%'")
        cursor.execute(query)
        # Although the Cursor object already has the traditional "fetchone",
        # "fetchmany" and "fetchall" methods, LiteDBC adds a new lazy "fetch"
        # method for intuitive iteration.
        # Note that 'fetch()' accepts 'limit' and 'buffer_size' as arguments.
        for row in cursor.fetch():
            table_name = row[0]
            print(table_name)
```
> Check out [LiteDBC](https://github.com/pyrustic/litedbc).
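To give an idea of what a lazy `fetch` with `limit` and `buffer_size` arguments can look like on top of a DB-API 2.0 cursor, here is a minimal generator built on the standard `fetchmany` method. This is a sketch, not LiteDBC's implementation: `lazy_fetch` is a hypothetical helper and its default values are arbitrary.

```python
import sqlite3


def lazy_fetch(cursor, limit=None, buffer_size=2):
    """Yield rows one at a time, pulling 'buffer_size' rows per round trip."""
    yielded = 0
    while limit is None or yielded < limit:
        rows = cursor.fetchmany(buffer_size)
        if not rows:
            return
        for row in rows:
            if limit is not None and yielded >= limit:
                return
            yield row
            yielded += 1


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE item (n INTEGER)")
conn.executemany("INSERT INTO item VALUES (?)", [(i,) for i in range(5)])

cursor = conn.execute("SELECT n FROM item ORDER BY n")
first_three = [row[0] for row in lazy_fetch(cursor, limit=3)]
cursor = conn.execute("SELECT n FROM item ORDER BY n")
everything = [row[0] for row in lazy_fetch(cursor)]
```

Fetching in small batches keeps memory usage bounded while still letting the caller iterate with a plain `for` loop.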
# The unified BLOB interface
When the user submits data for a write operation, whether it is a dictionary, a string, or an integer, Jinbase serializes it (except for binary data), chunks it, and stores it iteratively with the [Paradict](https://github.com/pyrustic/paradict) compact binary data format. Under the hood, these chunks are actually stored as SQLite Binary Large Objects ([BLOBs](https://www.sqlite.org/datatype3.html)).

SQLite has an impressive capability which is [incremental I/O](https://sqlite.org/c3ref/blob_open.html) for BLOBs. While this capability is designed to target an individual BLOB column in a row, Jinbase extends it to enable incremental reads of record chunks as if they formed a single [unified BLOB](#the-unified-blob-interface).
Example:
```python
import os.path
from jinbase import Jinbase, JINBASE_HOME

db_filename = os.path.join(JINBASE_HOME, "test.db")
CHUNK_SIZE = 1  # 1 byte, thus a 5-byte input will have 5 chunks

# The 'chunk_size' can be defined only once when Jinbase creates or opens
# the database for the first time. New values for 'chunk_size' will be ignored.
# So, for this example to work, ensure that the 'db_filename' is nonexistent.
with Jinbase(db_filename, chunk_size=CHUNK_SIZE) as db:
    # some binary data
    USER_DATA = b'\x20\x55\xA9\xBC\x69\x42\xD1'  # seven bytes !
    # set the data
    kv_store = db.kv
    kv_store.set("user", USER_DATA)
    # count chunks
    n_chunks = kv_store.count_chunks("user")
    assert n_chunks == len(USER_DATA)  # seven bytes !

    # access the unified blob interface for incremental reads
    with kv_store.open_blob("user") as blob:
        # read the entire unified blob
        data = blob.read()
        assert data == USER_DATA
        assert blob.tell() == len(USER_DATA)  # cursor position
        assert blob.read() == b''
        # move the cursor back to the beginning of the blob
        blob.seek(0)
        # read the first byte
        assert blob.read(1) == bytes([USER_DATA[0]])
        # read the last byte
        assert blob[-1] == bytes([USER_DATA[-1]])
        # read a slice
        slice_obj = slice(2, 5)
        assert blob[slice_obj] == USER_DATA[slice_obj]
```

> The unified BLOB interface for incremental reads only works on Python 3.11 or newer.
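The idea of reading a byte range across several chunks as if they were one BLOB can be sketched without Jinbase. The snippet below is an illustration under stated assumptions, not the library's implementation: the `demo_chunk` table, the `read_range` helper, and the fixed `CHUNK_SIZE` are hypothetical, and plain `SELECT` statements are used instead of SQLite's incremental I/O so that it also runs on Python versions older than 3.11.

```python
import sqlite3

CHUNK_SIZE = 3  # hypothetical chunk size


def read_range(conn, record_id, offset, length, chunk_size=CHUNK_SIZE):
    """Read 'length' bytes starting at 'offset', touching only the needed chunks."""
    if length <= 0:
        return b""
    first = offset // chunk_size
    last = (offset + length - 1) // chunk_size
    rows = conn.execute(
        "SELECT data FROM demo_chunk WHERE record_id = ? "
        "AND position BETWEEN ? AND ? ORDER BY position",
        (record_id, first, last))
    joined = b"".join(row[0] for row in rows)
    start = offset - first * chunk_size
    return joined[start:start + length]


data = b'\x20\x55\xA9\xBC\x69\x42\xD1'
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE demo_chunk (record_id INTEGER, position INTEGER, data BLOB)")
conn.executemany(
    "INSERT INTO demo_chunk VALUES (?, ?, ?)",
    [(1, pos, data[i:i + CHUNK_SIZE])
     for pos, i in enumerate(range(0, len(data), CHUNK_SIZE))])

middle = read_range(conn, 1, 2, 3)  # spans the first and second chunks
tail = read_range(conn, 1, 5, 10)   # a read past the end is truncated
```

The offset arithmetic selects only the chunks that overlap the requested range, so reading a few bytes from a very large record does not require loading the whole record.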
# Command line interface
Not yet implemented.

# Related projects
- [LiteDBC](https://github.com/pyrustic/litedbc): Lite database connector
- [Paradict](https://github.com/pyrustic/paradict): Streamable multi-format serialization with schema
- [Asyncpal](https://github.com/pyrustic/asyncpal): Preemptive concurrency and parallelism for sporadic workloads
- [KvF](https://github.com/pyrustic/kvf): The key-value file format with sections

# Testing and contributing
Feel free to **open an issue** to report a bug, suggest some changes, show some useful code snippets, or discuss anything related to this project. You can also directly email [me](https://pyrustic.github.io/#contact).

## Setup your development environment
The following instructions set up your development environment:

```bash
# create and activate a virtual environment
python -m venv venv
source venv/bin/activate

# clone the project then change into its directory
git clone https://github.com/pyrustic/jinbase.git
cd jinbase

# install the package locally (editable mode)
pip install -e .

# run tests
python -m tests

# deactivate the virtual environment
deactivate
```

# Installation
**Jinbase** is **cross-platform**. It is built on [Ubuntu](https://ubuntu.com/download/desktop) and should work on **Python 3.8** or **newer**.

## Create and activate a virtual environment
```bash
python -m venv venv
source venv/bin/activate
```

## Install for the first time
```bash
pip install jinbase
```

## Upgrade the package
```bash
pip install jinbase --upgrade --upgrade-strategy eager
```

## Deactivate the virtual environment
```bash
deactivate
```

# About the author
Hello world, I'm Alex (😎️), a tech enthusiast and the architect of [Pyrustic](https://pyrustic.github.io) ! Feel free to get in touch with [me](https://pyrustic.github.io/#contact) !
[Back to top](#readme)