https://github.com/quickwit-oss/mrecordlog
Last synced: about 1 year ago
- Host: GitHub
- URL: https://github.com/quickwit-oss/mrecordlog
- Owner: quickwit-oss
- License: mit
- Created: 2022-09-15T05:06:54.000Z (almost 4 years ago)
- Default Branch: main
- Last Pushed: 2025-03-30T11:16:17.000Z (about 1 year ago)
- Last Synced: 2025-04-16T12:18:45.618Z (about 1 year ago)
- Language: Rust
- Size: 217 KB
- Stars: 25
- Watchers: 14
- Forks: 7
- Open Issues: 9
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# What is it?
This crate implements a solution to efficiently handle several record logs.
Each record log (called a queue in the API) has its own "local" notion of position.
Each queue can be truncated individually.
# Goals
- be durable, offer some flexibility on `fsync` strategies.
- offer a way to truncate a queue after a specific position
- handle an arbitrary number of queues
- have limited IO
- be fast
- make it possible to implement backpressure
```rust
// API overview: signatures only, method bodies omitted.
// The item type yielded by `range` is assumed to be a (position, payload) pair.
impl MultiRecordLog {
    pub fn create_queue(&mut self, queue: &str) -> Result<(), CreateQueueError>;
    pub fn delete_queue(&mut self, queue: &str) -> Result<(), DeleteQueueError>;
    pub fn queue_exists(&self, queue: &str) -> bool;
    pub fn list_queues(&self) -> impl Iterator<Item = &str>;
    pub fn append_record(
        &mut self,
        queue: &str,
        position_opt: Option<u64>,
        payload: &[u8],
    );
    pub fn truncate(&mut self, queue: &str, position: u64) -> Result<(), TruncateError>;
    pub fn range<R>(
        &self,
        queue: &str,
        range: R,
    ) -> Option<impl Iterator<Item = (u64, &[u8])> + '_>;
}
```
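To make the semantics of these calls concrete, here is a minimal in-memory sketch. `ToyMultiRecordLog` is a hypothetical stand-in, not the crate's type: it persists nothing, returns `bool`/`Option` instead of the crate's error types, and its `append_record` drops the `position_opt` argument. It also assumes that positions auto-increment from 0 in each queue and that `truncate` removes every record up to and including the given position.

```rust
use std::collections::{BTreeMap, VecDeque};
use std::ops::RangeBounds;

/// Hypothetical per-queue state (in memory only, no persistence).
#[derive(Default)]
struct ToyQueue {
    /// Position that the next appended record will receive.
    next_position: u64,
    /// Retained records as (position, payload), oldest first.
    records: VecDeque<(u64, Vec<u8>)>,
}

/// Hypothetical in-memory model of a multi-queue record log.
#[derive(Default)]
pub struct ToyMultiRecordLog {
    queues: BTreeMap<String, ToyQueue>,
}

impl ToyMultiRecordLog {
    /// Creates a queue. Returns false if it already exists.
    pub fn create_queue(&mut self, queue: &str) -> bool {
        if self.queues.contains_key(queue) {
            return false;
        }
        self.queues.insert(queue.to_string(), ToyQueue::default());
        true
    }

    /// Appends a payload and returns its position, which is local to `queue`.
    pub fn append_record(&mut self, queue: &str, payload: &[u8]) -> Option<u64> {
        let q = self.queues.get_mut(queue)?;
        let position = q.next_position;
        q.records.push_back((position, payload.to_vec()));
        q.next_position += 1;
        Some(position)
    }

    /// Drops every record of `queue` at or before `position` (assumed inclusive).
    /// Other queues are left untouched.
    pub fn truncate(&mut self, queue: &str, position: u64) -> bool {
        let Some(q) = self.queues.get_mut(queue) else {
            return false;
        };
        while q.records.front().map_or(false, |(pos, _)| *pos <= position) {
            q.records.pop_front();
        }
        true
    }

    /// Iterates over the retained records whose position falls in `range`.
    pub fn range<R>(
        &self,
        queue: &str,
        range: R,
    ) -> Option<impl Iterator<Item = (u64, &[u8])> + '_>
    where
        R: RangeBounds<u64> + 'static,
    {
        let q = self.queues.get(queue)?;
        Some(
            q.records
                .iter()
                .filter(move |(pos, _)| range.contains(pos))
                .map(|(pos, payload)| (*pos, payload.as_slice())),
        )
    }
}
```

In this model, appending to queue `a` twice and to queue `b` once yields positions 0 and 1 in `a` and position 0 in `b`. Truncating `a` at position 0 leaves the record in `b` in place.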
# Non-goals
This is not Kafka. This record log is designed for a "small amount of data":
all retained data is expected to fit in RAM.
In the context of Quickwit, this queue is used in the ingest API and is meant to contain
1 minute worth of data. (At 60 MB/s, that means 3.6 GB of RAM.)
Reading the record log files only happens on startup.
High performance when reading the record log files is not a goal.
Writing fast, on the other hand, is important.
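The RAM figure quoted above is the ingest rate multiplied by the retention window. A small sketch of that arithmetic follows; the function name `retained_bytes` is illustrative and decimal units (1 MB = 1,000,000 bytes) are assumed.

```rust
/// Bytes that must stay in RAM for a given ingest rate and retention window.
/// Decimal units are assumed (1 MB = 1_000_000 bytes).
pub fn retained_bytes(ingest_mb_per_sec: u64, retention_secs: u64) -> u64 {
    const BYTES_PER_MB: u64 = 1_000_000;
    ingest_mb_per_sec * BYTES_PER_MB * retention_secs
}
```

With 60 MB/s retained for 60 seconds, `retained_bytes(60, 60)` is 3,600,000,000 bytes, that is, 3.6 GB.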
# Implementation details
`mrecordlog` multiplexes several independent queues into the same record log.
This approach has the merit of limiting the number of file descriptors needed
and, more importantly, the number of `fsync` calls.
Each queue can still be truncated independently of the others.
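As an illustration of why multiplexing saves `fsync` calls, the sketch below appends records from several queues to one file and syncs once per batch. The frame layout and the names `write_frame` and `append_batch` are assumptions made for this example; they are not the crate's on-disk format or API.

```rust
use std::fs::File;
use std::io::{self, Write};

/// Writes one framed record to `out`.
/// Hypothetical frame layout (not the crate's on-disk format):
/// queue name length (u16 LE), queue name, position (u64 LE),
/// payload length (u32 LE), payload.
pub fn write_frame<W: Write>(
    out: &mut W,
    queue: &str,
    position: u64,
    payload: &[u8],
) -> io::Result<()> {
    let name_len = u16::try_from(queue.len())
        .map_err(|_| io::Error::new(io::ErrorKind::InvalidInput, "queue name too long"))?;
    let payload_len = u32::try_from(payload.len())
        .map_err(|_| io::Error::new(io::ErrorKind::InvalidInput, "payload too large"))?;
    out.write_all(&name_len.to_le_bytes())?;
    out.write_all(queue.as_bytes())?;
    out.write_all(&position.to_le_bytes())?;
    out.write_all(&payload_len.to_le_bytes())?;
    out.write_all(payload)
}

/// Appends records from several queues to one file, then issues a single
/// sync for the whole batch instead of one per queue.
pub fn append_batch(file: &mut File, batch: &[(&str, u64, &[u8])]) -> io::Result<()> {
    for (queue, position, payload) in batch {
        write_frame(file, queue, *position, payload)?;
    }
    // One durability barrier covers every queue touched by the batch.
    file.sync_data()
}
```

With one file per queue, the same batch would need one open file descriptor and one sync per queue touched.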
The actual deletion of the data happens when a file only contains deleted records.
Then, and only then, the entire file is deleted.
The record log starts a new file every 1 GB.
A record log file is deleted once all queues have been truncated after the
last record of that file.
There is no compaction logic.
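The deletion rule can be sketched as follows. `FileInfo`, `is_deletable` and `collect_garbage` are hypothetical names for this illustration, and truncation is assumed to be inclusive of the given position. The sketch also ignores the file currently being written, which a real implementation would presumably keep.

```rust
use std::collections::HashMap;

/// Hypothetical bookkeeping for one log file: for each queue with records in
/// the file, the position of that queue's last record in the file.
pub struct FileInfo {
    pub file_id: u64,
    pub last_position_per_queue: HashMap<String, u64>,
}

/// A file can go once every queue it holds has been truncated at or past
/// that queue's last record in the file. `truncated_up_to` maps a queue to
/// its highest truncated position (assumed inclusive); a queue that was
/// never truncated is absent from the map.
pub fn is_deletable(file: &FileInfo, truncated_up_to: &HashMap<String, u64>) -> bool {
    file.last_position_per_queue.iter().all(|(queue, last)| {
        truncated_up_to.get(queue).map_or(false, |t| t >= last)
    })
}

/// Removes deletable files from `files` and returns their ids.
/// No records are rewritten: a file is dropped whole or kept whole.
pub fn collect_garbage(
    files: &mut Vec<FileInfo>,
    truncated_up_to: &HashMap<String, u64>,
) -> Vec<u64> {
    let mut deleted = Vec::new();
    files.retain(|file| {
        let deletable = is_deletable(file, truncated_up_to);
        if deletable {
            deleted.push(file.file_id);
        }
        !deletable
    });
    deleted
}
```

Because whole files are the unit of deletion, a single slow queue that has not been truncated keeps every file containing its records alive, which is the cost of having no compaction.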
# TODO
- add backpressure.
- add fsync policy
- better testing.
- non auto-inc position
- less Arc