https://github.com/appaquet/extindex-rs
Disk persisted immutable index built using a single pass iterator
- Host: GitHub
- URL: https://github.com/appaquet/extindex-rs
- Owner: appaquet
- Created: 2018-12-09T20:31:09.000Z (almost 6 years ago)
- Default Branch: master
- Last Pushed: 2024-03-25T02:03:06.000Z (8 months ago)
- Last Synced: 2024-04-26T09:02:58.148Z (7 months ago)
- Language: Rust
- Size: 89.8 KB
- Stars: 5
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
README
# extindex
[![crates.io](https://img.shields.io/crates/v/extindex.svg)](https://crates.io/crates/extindex)
Immutable persisted index (on disk) that can be built in one pass using a sorted
iterator, or can use [extsort](https://crates.io/crates/extsort) to externally
sort the iterator first, and then build the index from it.

The index allows random lookups and sorted scans. An indexed entry consists of a
key and a value. The key needs to implement `Eq` and `Ord`, and both the key
and the value need to implement a `Serializable` trait for serialization to and
from disk. It is possible to rely on the [`serde`](https://crates.io/crates/serde)
library to implement this trait for most types.

The index is built using a skip list-like data structure, but lookups start from
the end of the index instead of the beginning. This allows building the index in
a single pass on a sorted iterator, as starting from the beginning would require
knowing checkpoints/nodes ahead in the file.

## Example
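Before the crate's own usage example, the end-anchored layout described above can be illustrated with a simplified, standalone sketch. It is not the crate's actual file format: it uses a single level of checkpoints instead of a multi-level skip list, an in-memory `Vec` in place of a file, and hypothetical names (`Record`, `build_index`, `close_group`, `find`). The point it tries to show is that each checkpoint only refers to data written before it, so the whole structure can be appended in one pass, and a lookup can start from the last record and walk backwards.

```rust
/// One record in a simulated append-only index file (a `Vec` stands in for the disk).
#[derive(Debug)]
enum Record {
    Entry { key: u32, value: &'static str },
    /// Written after a group of entries. It only refers to data that is
    /// already written: its own group and the previous checkpoint.
    Checkpoint {
        first_key: u32,
        group_start: usize,
        prev_checkpoint: Option<usize>,
    },
}

/// Builds the file in one pass over an iterator that is already sorted by key.
fn build_index(
    sorted: impl Iterator<Item = (u32, &'static str)>,
    group_size: usize,
) -> Vec<Record> {
    assert!(group_size > 0);
    let mut file: Vec<Record> = Vec::new();
    let mut prev_checkpoint: Option<usize> = None;
    // (first key, start position) of the group currently being written.
    let mut open_group: Option<(u32, usize)> = None;
    let mut entries_in_group = 0;

    for (key, value) in sorted {
        if open_group.is_none() {
            open_group = Some((key, file.len()));
        }
        file.push(Record::Entry { key, value });
        entries_in_group += 1;
        if entries_in_group == group_size {
            close_group(&mut file, &mut open_group, &mut prev_checkpoint);
            entries_in_group = 0;
        }
    }
    // Close a trailing partial group so the file always ends with a checkpoint.
    close_group(&mut file, &mut open_group, &mut prev_checkpoint);
    file
}

/// Appends a checkpoint for the open group, if there is one.
fn close_group(
    file: &mut Vec<Record>,
    open_group: &mut Option<(u32, usize)>,
    prev_checkpoint: &mut Option<usize>,
) {
    if let Some((first_key, group_start)) = open_group.take() {
        let position = file.len();
        file.push(Record::Checkpoint {
            first_key,
            group_start,
            prev_checkpoint: *prev_checkpoint,
        });
        *prev_checkpoint = Some(position);
    }
}

/// Looks a key up by starting at the last record and following back-pointers.
fn find(file: &[Record], target: u32) -> Option<&'static str> {
    let mut position = file.len().checked_sub(1)?;
    loop {
        match &file[position] {
            Record::Checkpoint { first_key, group_start, prev_checkpoint } => {
                if *first_key <= target {
                    // The key can only be in this group: scan it forward.
                    return file[*group_start..position].iter().find_map(|record| match record {
                        Record::Entry { key, value } if *key == target => Some(*value),
                        _ => None,
                    });
                }
                // All keys in this group are larger: jump to the previous checkpoint.
                position = (*prev_checkpoint)?;
            }
            // A well-formed file never sends the lookup to an entry.
            Record::Entry { .. } => return None,
        }
    }
}
```

Calling `build_index` with five sorted pairs and a `group_size` of 2 appends three checkpoints, and `find` then reaches any key by visiting at most three checkpoints from the end. A real skip list adds higher levels so that far-away keys need fewer jumps. The crate's actual API is shown next.
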
```rust
// Requires the `extindex`, `serde` (with the `derive` feature) and `tempfile` crates.
extern crate extindex;
extern crate serde;

use extindex::{Builder, Entry, Reader, SerdeWrapper};

#[derive(Ord, PartialOrd, Eq, PartialEq, Debug, serde::Serialize, serde::Deserialize)]
struct SomeStruct {
    a: u32,
    b: String,
}

fn main() {
    let index_file = tempfile::NamedTempFile::new().unwrap();

    // Build the index from an iterator of entries (a single entry here).
    let builder = Builder::new(index_file.path());
    let entries = vec![Entry::new(
        "my_key".to_string(),
        SerdeWrapper(SomeStruct {
            a: 123,
            b: "my value".to_string(),
        }),
    )];
    builder.build(entries.into_iter()).unwrap();

    // Open the index with the same key and value types, then look keys up.
    let reader = Reader::<String, SerdeWrapper<SomeStruct>>::open(index_file).unwrap();
    assert!(reader.find(&"my_key".to_string()).unwrap().is_some());
    assert!(reader.find(&"notfound".to_string()).unwrap().is_none());
}
```

## Roadmap
- Possibility to use a Bloom filter to avoid disk access when the index does not contain a key.
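
To make the roadmap item concrete, here is a minimal, standalone sketch of a Bloom filter using only the standard library. It is not part of extindex, and every name in it (`BloomFilter`, `insert`, `might_contain`, the salt value) is hypothetical. The filter can answer "definitely not present" without touching the disk, at the cost of occasional false positives that would still require a real lookup.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// A fixed-size Bloom filter: no false negatives, some false positives.
pub struct BloomFilter {
    bits: Vec<bool>,
    num_hashes: u64,
}

impl BloomFilter {
    pub fn new(num_bits: usize, num_hashes: u64) -> Self {
        assert!(num_bits > 0 && num_hashes > 0);
        BloomFilter { bits: vec![false; num_bits], num_hashes }
    }

    /// Derives `num_hashes` bit positions from two base hashes (double hashing).
    fn positions<K: Hash + ?Sized>(&self, key: &K) -> Vec<usize> {
        let mut first = DefaultHasher::new();
        key.hash(&mut first);
        let a = first.finish();

        // The second hash is salted so that it differs from the first one.
        let mut second = DefaultHasher::new();
        0xA5A5_u64.hash(&mut second);
        key.hash(&mut second);
        let b = second.finish() | 1; // an odd step is never zero

        let len = self.bits.len() as u64;
        (0..self.num_hashes)
            .map(|i| (a.wrapping_add(i.wrapping_mul(b)) % len) as usize)
            .collect()
    }

    /// Records a key; this would be called for each key while building the index.
    pub fn insert<K: Hash + ?Sized>(&mut self, key: &K) {
        for position in self.positions(key) {
            self.bits[position] = true;
        }
    }

    /// Returns `false` only if the key was never inserted.
    pub fn might_contain<K: Hash + ?Sized>(&self, key: &K) -> bool {
        self.positions(key).into_iter().all(|position| self.bits[position])
    }
}
```

In such a design, a reader would check `might_contain` first and fall through to the on-disk lookup only when it returns `true`. The number of bits and hash functions trades memory against the false-positive rate.
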