# Processor

## The Problem

Given a black box of monolithic data (opaque from the outside, but composed of well-defined blocks internally),
how do we process it efficiently using all the compute power at our disposal?

## The Steps

1. Split the data into chunks: [0, N1], [N1, N2], ..., [Nn, Nn+1] (see the splitting sketch below).
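
A minimal splitting sketch, assuming the total input length is known up front; the chunk count is arbitrary here:

```rust
/// Split `len` bytes into `n` half-open ranges [start, end) of roughly equal size.
fn split_into_chunks(len: u64, n: u64) -> Vec<(u64, u64)> {
    let chunk_size = (len + n - 1) / n; // ceiling division; assumes n > 0
    (0..n)
        .map(|i| {
            let start = i * chunk_size;
            let end = ((i + 1) * chunk_size).min(len);
            (start, end)
        })
        .filter(|(start, end)| start < end) // drop empty trailing ranges
        .collect()
}
```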

2. When starting to process each chunk, first lock onto a block boundary, since a chunk boundary will generally fall inside a block shared with the previous/next chunk (see the sketch below).
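
For illustration only, assume blocks are newline-terminated records (the real block format is domain-specific); locking onto the first block that starts inside a chunk might then look like this:

```rust
/// Offset (relative to `chunk`) of the first block that starts inside it,
/// assuming blocks are terminated by b'\n' (illustrative assumption only).
/// The first chunk starts on a block boundary by definition; every other chunk
/// skips the tail of the block owned by the previous chunk.
fn first_block_offset(chunk: &[u8], is_first_chunk: bool) -> Option<u64> {
    if is_first_chunk {
        return Some(0);
    }
    chunk
        .iter()
        .position(|&b| b == b'\n')
        .map(|i| (i + 1) as u64)
}
```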

3. Process the chunk and store enough information to later reconstruct the monolithic context.

```rust
/// A contiguous byte range of the input, processed independently.
struct Chunk {
    /// Start offset of the chunk within the monolithic data.
    start: u64,
    /// End offset of the chunk within the monolithic data.
    end: u64,
    /// Offset of the first block found in the chunk.
    first_block_offset: u64,
    /// Offset of the last block found in the chunk.
    last_block_offset: u64,
    /// Blocks recovered from this chunk.
    results: Vec<Block>,
}

/// One well-defined block recovered from a chunk.
struct Block {
    /// Offset of the block relative to the start of its chunk.
    relative_offset: u64,
    /// Raw block data.
    data: Vec<u8>,
}
```
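
Putting steps 1-3 together, the chunks can then be processed in parallel, for example with scoped threads. This is only a sketch: the per-chunk parsing is domain-specific, so it is passed in as a caller-supplied closure (`parse_chunk` below is a hypothetical parameter, not part of this repository):

```rust
use std::thread;

/// Process every chunk on its own thread and collect the per-chunk results in order.
/// `parse_chunk(slice, chunk_start, chunk_end, is_first)` is the domain-specific parser.
fn process_all(
    data: &[u8],
    n_chunks: u64,
    parse_chunk: impl Fn(&[u8], u64, u64, bool) -> Chunk + Sync,
) -> Vec<Chunk> {
    let ranges = split_into_chunks(data.len() as u64, n_chunks);
    let parse = &parse_chunk; // shared reference, so every thread can call it
    thread::scope(|s| {
        let handles: Vec<_> = ranges
            .iter()
            .enumerate()
            .map(|(i, &(start, end))| {
                let slice = &data[start as usize..end as usize];
                s.spawn(move || parse(slice, start, end, i == 0))
            })
            .collect();
        // Joining in spawn order keeps the results in chunk order.
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    })
}
```

Scoped threads (stable since Rust 1.63) let each worker borrow its slice of the input directly instead of copying it.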

4. In case our monolithic data allows efficient random access,
we can traverse backwards to ensure each block covers the expected range.
If not, post-processing will have to detect whether some small boundary blocks are missing (see the sketch below).
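
A sketch of such a post-processing check, assuming the chunks are kept in start-offset order; it reports byte ranges that no recovered block accounts for:

```rust
/// Verify that the recovered blocks cover the input contiguously,
/// returning any [start, end) gaps that no block accounts for.
fn find_gaps(chunks: &[Chunk], total_len: u64) -> Vec<(u64, u64)> {
    let mut gaps = Vec::new();
    let mut cursor = 0u64;
    for chunk in chunks {
        for block in &chunk.results {
            let abs_start = chunk.start + block.relative_offset;
            let abs_end = abs_start + block.data.len() as u64;
            if abs_start > cursor {
                gaps.push((cursor, abs_start)); // bytes not covered by any block
            }
            cursor = cursor.max(abs_end);
        }
    }
    if cursor < total_len {
        gaps.push((cursor, total_len));
    }
    gaps
}
```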

5. Combine the results using the offset information.
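
Since each `Chunk` records its absolute start and each `Block` its chunk-relative offset, the final merge can key every block by its absolute position in the original data. A sketch:

```rust
/// Merge per-chunk results back into a single, input-ordered sequence of blocks,
/// keyed by each block's absolute offset.
fn combine(chunks: Vec<Chunk>) -> Vec<(u64, Block)> {
    let mut all: Vec<(u64, Block)> = chunks
        .into_iter()
        .flat_map(|chunk| {
            let base = chunk.start;
            chunk
                .results
                .into_iter()
                .map(move |block| (base + block.relative_offset, block))
        })
        .collect();
    // If chunks and blocks are already in input order, this sort is a no-op safety net.
    all.sort_by_key(|(offset, _)| *offset);
    all
}
```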