https://github.com/rusty-ferris-club/processor
- Host: GitHub
- URL: https://github.com/rusty-ferris-club/processor
- Owner: rusty-ferris-club
- Created: 2022-10-16T06:52:34.000Z (almost 3 years ago)
- Default Branch: master
- Last Pushed: 2022-12-19T12:42:38.000Z (almost 3 years ago)
- Last Synced: 2025-02-23T16:03:43.563Z (8 months ago)
- Language: Rust
- Size: 5.86 KB
- Stars: 1
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Processor
## The Problem
Given a black box of monolithic data (opaque from the outside, but composed of well-defined blocks internally),
how do we process it efficiently using all the compute power at our disposal?

## The Steps
1. Split the data into chunks: [0, N1], [N1, N2], ..., [Nn, Nn+1].
2. When starting to process each chunk, first lock onto the boundary of the previous/next block so that no block is split between chunks.
3. Process the chunk and store the offset information needed to reconstruct the monolithic context later (see the structs below; a parallel-processing sketch follows this list).
```rust
// Per-chunk bookkeeping: the chunk's absolute byte range, the offsets of the
// first and last blocks inside it, and the per-block results.
// NOTE: the element types of the Vecs were omitted in the original;
// Vec<Block> and Vec<u8> are assumptions.
struct Chunk {
    start: u64,
    end: u64,
    first_block_offset: u64,
    last_block_offset: u64,
    results: Vec<Block>,
}

// A single well-defined block, addressed relative to its chunk's start.
struct Block {
    relative_offset: u64,
    data: Vec<u8>,
}
```
4. If our monolithic data allows efficient random access,
we can traverse backwards to ensure each block covers the expected range.
If not, post-processing will need to detect whether some small boundary blocks are missing.
5. Combine the results using the offset information (see the combining sketch below).
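
Here is a minimal sketch of steps 1–3, assuming (for illustration only) that blocks are newline-terminated records and using `std::thread::scope` for parallelism. `ProcessedChunk`, `process_parallel`, and `process_chunk` are hypothetical names introduced for this sketch, not part of the repository.

```rust
use std::thread;

// Hypothetical result type for this sketch: one entry per processed chunk,
// carrying each block's offset relative to the chunk start.
struct ProcessedChunk {
    start: usize,
    blocks: Vec<(usize, Vec<u8>)>, // (relative_offset, block data)
}

// Step 1: split `data` into `n` roughly equal chunks and process them in
// parallel; scoped threads let every worker borrow the shared buffer.
fn process_parallel(data: &[u8], n: usize) -> Vec<ProcessedChunk> {
    let chunk_len = (data.len() + n - 1) / n;
    thread::scope(|s| {
        let mut handles = Vec::new();
        for i in 0..n {
            let start = i * chunk_len;
            if start >= data.len() {
                break;
            }
            let end = (start + chunk_len).min(data.len());
            handles.push(s.spawn(move || process_chunk(data, start, end)));
        }
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    })
}

// Steps 2-3: skip the partial block at the front (the previous chunk owns it),
// then read whole blocks, running past `end` to finish the last block that
// starts inside this chunk -- i.e. "locking on" the neighbouring boundary.
fn process_chunk(data: &[u8], start: usize, end: usize) -> ProcessedChunk {
    // Assumption for this sketch: a block ends at b'\n'; chunk 0 starts on a boundary.
    let mut pos = if start == 0 {
        0
    } else {
        data[start..]
            .iter()
            .position(|&b| b == b'\n')
            .map(|i| start + i + 1)
            .unwrap_or(end)
    };
    let mut blocks = Vec::new();
    while pos < end {
        let stop = data[pos..]
            .iter()
            .position(|&b| b == b'\n')
            .map(|i| pos + i + 1)
            .unwrap_or(data.len());
        blocks.push((pos - start, data[pos..stop].to_vec()));
        pos = stop;
    }
    ProcessedChunk { start, blocks }
}
```

Split this way, every byte belongs to exactly one chunk's block list, so the per-chunk results can later be merged without re-reading the input.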
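
And a sketch of steps 4–5 under the same assumptions: absolute offsets are reconstructed as chunk start plus relative offset, results are sorted back into monolithic order, and a contiguity check stands in for the "detect missing boundary blocks" post-processing.

```rust
// Step 5 (with the step-4 check): merge per-chunk results back into
// monolithic order using the stored offsets. `ProcessedChunk` is the
// hypothetical type from the sketch above.
fn combine(chunks: Vec<ProcessedChunk>) -> Vec<(usize, Vec<u8>)> {
    let mut all: Vec<(usize, Vec<u8>)> = chunks
        .into_iter()
        .flat_map(|c| {
            let chunk_start = c.start;
            // Absolute offset = chunk start + block's relative offset.
            c.blocks
                .into_iter()
                .map(move |(rel, block)| (chunk_start + rel, block))
        })
        .collect();
    all.sort_by_key(|(abs, _)| *abs);
    // Without random access, verify coverage after the fact: adjacent blocks
    // should be contiguous, and a gap means a small boundary block was missed.
    for pair in all.windows(2) {
        let (off, block) = &pair[0];
        debug_assert_eq!(off + block.len(), pair[1].0, "missing boundary block");
    }
    all
}
```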