Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/tonbo-io/tonbo
A portable embedded database using Arrow.
- Host: GitHub
- URL: https://github.com/tonbo-io/tonbo
- Owner: tonbo-io
- License: apache-2.0
- Created: 2024-07-15T03:23:00.000Z (4 months ago)
- Default Branch: main
- Last Pushed: 2024-08-16T04:39:21.000Z (3 months ago)
- Last Synced: 2024-08-16T05:07:11.647Z (3 months ago)
- Topics: arrow, big-data, embedded-database, olap, rust, store-engine
- Language: Rust
- Homepage: https://tonbo.io
- Size: 226 KB
- Stars: 270
- Watchers: 8
- Forks: 25
- Open Issues: 7
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-rust - Tonbo - Tonbo is an embedded persistent database built on Apache Arrow & Parquet [![crates.io](https://img.shields.io/crates/v/tonbo.svg)](https://crates.io/crates/tonbo) (Applications / Database)
- awesome-trevor - Tonbo - embedded, persistent key-value database for Rust, using Apache Arrow (Programming / Rust 🦀)
- fucking-awesome-rust - Tonbo - Tonbo is an embedded persistent database built on Apache Arrow & Parquet [![crates.io](https://img.shields.io/crates/v/tonbo.svg)](https://crates.io/crates/tonbo) (Applications / Database)
README
# Tonbo
**[Website](https://tonbo.io/) | [Rust Doc](https://docs.rs/tonbo/latest/tonbo/) | [Blog](https://tonbo.io/blog/introducing-tonbo) | [Community](https://discord.gg/8jm9WMfX)**
## Introduction
Tonbo is an embedded persistent database built on [Apache Arrow & Parquet](https://github.com/apache/arrow-rs). It offers essential KV-like methods (insert, filter, and range scan) for querying type-safe structured data efficiently and conveniently. Tonbo integrates with other Arrow analytical tools, such as DataFusion; see [examples/datafusion.rs](examples/datafusion.rs) for an example. Official support for DataFusion will be included in the next release.
## Example
```rust
use std::ops::Bound;

use futures_util::stream::StreamExt;
use tonbo::{executor::tokio::TokioExecutor, tonbo_record, Projection, DB};

// use the macro to define the schema of a column family, just like an ORM;
// it provides a type-safe read & write API
#[tonbo_record]
pub struct User {
    #[primary_key]
    name: String,
    email: Option<String>,
    age: u8,
}

#[tokio::main]
async fn main() {
    // pluggable async runtime and I/O
    let db = DB::new("./db_path/users".into(), TokioExecutor::default())
        .await
        .unwrap();

    // insert with owned value
    db.insert(User {
        name: "Alice".into(),
        email: Some("[email protected]".into()),
        age: 22,
    })
    .await
    .unwrap();

    {
        // tonbo supports transactions
        let txn = db.transaction().await;

        // get from primary key
        let name = "Alice".into();

        // get the zero-copy reference of the record without any allocations
        let user = txn
            .get(
                &name,
                // tonbo supports pushing down projection
                Projection::All,
            )
            .await
            .unwrap();
        assert!(user.is_some());
        assert_eq!(user.unwrap().get().age, Some(22));

        {
            let upper = "Blob".into();
            // range scan over the keys in [name, upper)
            let mut scan = txn
                .scan((Bound::Included(&name), Bound::Excluded(&upper)))
                .await
                // tonbo supports pushing down projection
                .projection(vec![1])
                .take()
                .await
                .unwrap();
            while let Some(entry) = scan.next().await.transpose().unwrap() {
                assert_eq!(
                    entry.value(),
                    Some(UserRef {
                        name: "Alice",
                        email: Some("[email protected]"),
                        age: Some(22),
                    })
                );
            }
        }

        // commit transaction
        txn.commit().await.unwrap();
    }
}
```
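The range scan in the example selects keys with a pair of `std::ops::Bound` values: an inclusive lower bound and an exclusive upper bound. As a rough illustration of those bound semantics only, the sketch below applies the same kind of bounds to a standard-library `BTreeMap`. It does not use Tonbo or reflect how Tonbo stores data; `sample_users` and `names_in_range` are hypothetical helper names introduced here.

```rust
use std::collections::BTreeMap;
use std::ops::Bound;

/// Hypothetical sample data: user name -> age, ordered by name.
fn sample_users() -> BTreeMap<String, u8> {
    let mut users = BTreeMap::new();
    users.insert("Alice".to_string(), 22);
    users.insert("Bob".to_string(), 30);
    users.insert("Carol".to_string(), 41);
    users
}

/// Returns the names in the half-open key range `[lower, upper)`, in ascending order.
///
/// `BTreeMap::range` panics if `lower > upper`, so callers are assumed to
/// pass ordered bounds.
fn names_in_range(users: &BTreeMap<String, u8>, lower: &str, upper: &str) -> Vec<String> {
    users
        // Included lower bound, excluded upper bound: the same shape of
        // bounds as the `scan` call in the example above.
        .range::<str, _>((Bound::Included(lower), Bound::Excluded(upper)))
        .map(|(name, _age)| name.clone())
        .collect()
}
```

With this data, `names_in_range(&sample_users(), "Alice", "Blob")` yields only `"Alice"`: the lower bound is inclusive, and `"Bob"` sorts after `"Blob"`, so it falls outside the exclusive upper bound.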
## Features
- [x] Fully asynchronous API.
- [x] Zero-copy rusty API ensuring safety with compile-time type and lifetime checks.
- [x] Vendor-agnostic:
  - [ ] Various usage methods, async runtimes, and file systems:
    - [x] Rust library:
      - [x] [Customizable async runtime and file system](https://github.com/from-the-basement/tonbo/blob/main/src/executor.rs#L5).
      - [x] [Tokio and Tokio fs](https://github.com/tokio-rs/tokio).
      - [ ] [Async-std](https://github.com/async-rs/async-std).
    - [ ] Python library (via [PyO3](https://github.com/PyO3/pyo3) & [pydantic](https://github.com/pydantic/pydantic)):
      - [ ] asyncio (via [pyo3-asyncio](https://github.com/awestlake87/pyo3-asyncio)).
    - [ ] JavaScript library:
      - [ ] WASM and OPFS.
    - [ ] Dynamic library with a C interface.
  - [x] Most lightweight implementation of Arrow / Parquet LSM trees:
    - [x] Define schema using just Arrow schema and store data in Parquet files.
    - [x] (Optimistic) Transactions.
    - [x] Leveled compaction strategy.
    - [x] Push down filter, limit and projection.
- [ ] Runtime schema definition (*in next release*).
- [ ] SQL (via [Apache DataFusion](https://datafusion.apache.org/)).
- [ ] Fusion storage across RAM, flash, SSD, and remote Object Storage Service (OSS) for each column-family, balancing performance and cost efficiency per data block:
  - [ ] Remote storage (via [Arrow object_store](https://github.com/apache/arrow-rs/tree/master/object_store) or [Apache OpenDAL](https://github.com/apache/opendal)).
  - [ ] Distributed query and compaction.
- [ ] Blob storage (like [BlobDB in RocksDB](https://github.com/facebook/rocksdb/wiki/BlobDB)).
## Contributing to Tonbo
Please feel free to ask questions or contact us on GitHub [Discussions](https://github.com/orgs/tonbo-io/discussions) or [Issues](https://github.com/tonbo-io/tonbo/issues).