Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
- Host: GitHub
- URL: https://github.com/by-sabbir/rust-fearless-concurrency
- Owner: by-sabbir
- Created: 2024-06-22T12:20:08.000Z (5 months ago)
- Default Branch: master
- Last Pushed: 2024-11-01T11:26:03.000Z (13 days ago)
- Last Synced: 2024-11-01T12:23:56.006Z (13 days ago)
- Language: Rust
- Size: 29.3 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Rust-Fearless-Concurrency
## Introduction
This project, "Rust-Fearless-Concurrency", is a showcase of concurrent programming in Rust. It aims
to provide a set of reusable, easy-to-use concurrency utilities for building scalable and
performant systems.

## Features
- A set of abstractions for handling concurrency in Rust applications.
- Implementations of common concurrency patterns, such as:
  * Channels: send and receive data between threads.
  * Locks: implement fine-grained locking mechanisms for shared resources.
  * Semaphores: manage access to critical sections.

## Fearless System Thread Concurrency
System threads are an OS-level construct that lets a single process run multiple tasks at the same
time. Even if only one CPU is available, threads allow the OS to interleave tasks so they appear to
run concurrently.

- Threads live in the parent process and share its memory.
- Each thread has its own `stack` so it can run whatever we put into it.
- The `heap` is shared among threads, so heap-allocated data can be shared between them.
- Threads are scheduled by the OS itself.
- System threads are a limited resource.
- We can get the max thread count on Linux with `cat /proc/sys/kernel/threads-max`.
- On my machine it's `126284`, which is a lot, but we should still minimize the number of threads to
keep scheduling cheap; otherwise our CPU will spend a lot of time scheduling them.
- It's always a good idea to create a thread pool beforehand, as creating threads is a
time-consuming task.

### Creating a Thread
1. Spawn a thread - `std::thread::spawn(fn)`
2. Join the thread with the parent process.

### [Sharing data from/with threads](./sharing-data-in-threads/src/main.rs)
To share data with a thread we need to move the value into it with a `move` closure, e.g.:
```rust
let n = 1024;
std::thread::spawn(move || do_something(n));
```
Reading data back from a spawned thread is fairly simple: the value the closure returns comes back
as the output of `.join()`. For example, if we modify the above script to do some calculation on `n`:
```rust
fn do_calc(n: u32) -> u32 {
    do_fake_work(); // placeholder for some expensive computation
    n
}

let n = 1024;
let th = std::thread::spawn(move || do_calc(n));
let calculated_value = th.join().unwrap();
println!("{}", calculated_value);
```

### [Thread Builder Pattern](./thread-builder-pattern/src/main.rs)
The thread builder is native to Rust; it lets us name a thread so we can tell what's going on in
production. It is usually used in a large program that spawns a lot of threads.
```rust
std::thread::Builder::new()
    .name("Named Thread".to_string())
    .stack_size(std::mem::size_of::<usize>() * 4)
    .spawn(|| println!("hello from the named thread")) // any closure or fn works here
    .unwrap();
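// A small, self-contained sketch (names assumed, not from the repo): the
// name configured through the builder is visible from inside the thread
// via `std::thread::current().name()`, which is what makes named threads
// useful for debugging in production.
let handle = std::thread::Builder::new()
    .name("worker-1".to_string())
    .spawn(|| std::thread::current().name().map(str::to_owned))
    .unwrap();
assert_eq!(handle.join().unwrap().as_deref(), Some("worker-1"));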
```
Note: the `::<>` notation is called the turbofish syntax.

So, when do we want to use `stack_size`, given that threads get 2 MB of stack by default?
1. When we have to work with a lot of threads, like 20K or 30K (but in that case we should
use async/await).
2. When we know the exact size of the stack we need.
3. Reducing the stack size also helps a thread to start faster.

### [Scoped Thread](./scoped-thread/src/main.rs)
A scoped thread is a convenient way to spawn a group of threads without worrying about calling
`join()` ourselves. For that we need to prove that the group of threads only lives within a scope.

First we create a scoped thread. The `scope()` function takes another function that defines the
thread lifecycle:
```rust
std::thread::scope(|s| {
    // spawn here; every thread spawned on `s` is joined
    // automatically when the scope ends
    s.spawn(|| println!("hello from a scoped thread"));
});
```

### [Sharing Data with `Atomics`](./atomics/src/main.rs)
When we need to share a global state, e.g. a counter, Rust provides a bunch of atomic primitives
that are concurrency safe. We don't need to make the variable mutable, because atomics use a
pattern called `interior mutability`. For example, if we want to declare an `i32` atomic counter,
the assignment will be:
```rust
use std::sync::atomic::AtomicI32;
static COUNTER: AtomicI32 = AtomicI32::new(0);

// to add a number to our atomic counter
COUNTER.fetch_add(1, std::sync::atomic::Ordering::Relaxed);

// to get the COUNTER value
COUNTER.load(std::sync::atomic::Ordering::Relaxed);
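// A self-contained sketch (the driver below is assumed, not from the repo):
// several threads can call `fetch_add` concurrently without any lock, and
// no increments are lost.
fn demo() {
    let handles: Vec<_> = (0..4)
        .map(|_| {
            std::thread::spawn(|| {
                for _ in 0..1000 {
                    COUNTER.fetch_add(1, std::sync::atomic::Ordering::Relaxed);
                }
            })
        })
        .collect();
    for h in handles {
        h.join().unwrap();
    }
    // 4 threads x 1000 increments
    assert_eq!(COUNTER.load(std::sync::atomic::Ordering::Relaxed), 4000);
}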
```

### [Mutexes](./mutexes/src/main.rs)
Mutexes are mainly used for more complex data synchronization that isn't covered by the
`std::sync::atomic` module. Mutexes are a bit slower than atomics. To initialize a `Mutex` we use
the following syntax:
```rust
use std::sync::Mutex;
...
// Declaring a mutex guarding a vector of i32
let data: Mutex<Vec<i32>> = Mutex::new(Vec::new());
```
To access the value we must first acquire the lock:
```rust
let mut lock = data.lock().unwrap();
// now we can perform any vector ops through the guard
lock.push(42);
// the lock is released when `lock` goes out of scope
```

### [Read/Write Locks](./rwlocks/src/main.rs)
We can request read/write locks, which allow us to lock data for either reading or writing. Read
locks are very fast, whereas a write lock waits for all read operations to finish. RW locks are
part of `std::sync::RwLock`.

Let's inspect the following lines from the example (where `USERS` is presumably a
`static RwLock<Vec<String>>`):
```rust
std::thread::spawn(|| loop {
let users = USERS.read().unwrap();
println!("current user in thread: {users:?}");
std::thread::sleep(std::time::Duration::from_secs(3));
});

loop {
println!("input new user: ");
let user = read_line();
if user == "q" {
break;
}
let mut lock = USERS.write().unwrap();
lock.push(user);
}
```
We spawn a thread that reads the `USERS` static data under a read lock. We also add a "thread
sleep" to keep the output human-readable. The second loop handles user input and pushes it into
the `USERS` vector under a write lock. Keep in mind the program terminates when the user inputs
`q`; and, as discussed earlier, the child thread dies as soon as the parent process terminates.

### [Thread Parking](./thread-parking/src/main.rs)
Sometimes we want to park a thread and unpark it on request. A thread parks itself, and is
unparked from outside the thread. Parked threads don't need to be joined; we just `unpark`
them to perform the tasks in the thread.
```rust
fn parkable_thread(n: u32) {
loop {
// thread::park_timeout(std::time::Duration::from_secs(5));
thread::park();
println!("unparked {n}")
}
}
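// A minimal, self-contained sketch (assumed, not from the repo): `unpark`
// stores a wake-up token, and `park` returns as soon as a token is
// available, so calling `unpark` from outside wakes the thread. The token
// can even be stored before the thread reaches `park`.
fn demo() {
    let t = std::thread::spawn(|| {
        std::thread::park(); // blocks until a token is available
        42
    });
    t.thread().unpark(); // hand the parked thread its wake-up token
    assert_eq!(t.join().unwrap(), 42);
}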
```
Here we are parking the threads in a loop. To run their tasks, we spawn the threads, keep the
handles in a vector (our thread pool), and call `threads[number].thread().unpark();`, where
`number` is the id given to the `parkable_thread(n: u32)` function.

### [Channel](./channels/src/main.rs)
Rust includes a multi-producer, single-consumer (MPSC) channel in the standard library. A channel
has to be statically typed. We can start using it by importing `std::sync::mpsc`. We need to get
the sender and receiver from the channel before we proceed further:
```rust
// a channel of i32 values (any type works)
let (tx, rx) = std::sync::mpsc::channel::<i32>();

// send and receive data through the channel
tx.send(42).unwrap();
let result = rx.recv().unwrap();
```

### [work-queue pattern](./work-queue/src/main.rs)
The work queue is a common concurrency pattern where we create a pool of workers that wait on a
queue for jobs to do. When done, they sit idle waiting for more jobs, and as we enqueue jobs the
idle workers pick them up and the cycle continues.

Let's go through the code...
Initially we declare our work queue as a `VecDeque` of `String` for simplicity:
```rust
// `Lazy` presumably comes from the `once_cell` crate
static WORK_QUEUE: Lazy<Mutex<VecDeque<String>>> = Lazy::new(|| Mutex::new(VecDeque::new()));
```
With the following block we simulate the total number of CPUs in our system:
```rust
let cpu_count = 2;
let mut threads = Vec::with_capacity(cpu_count);
let mut broadcast = Vec::with_capacity(cpu_count);
```
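The CPU count is hard-coded to 2 here; in a real program it could be queried from the standard
library instead (a sketch, not part of the repo's code):

```rust
use std::thread;

fn main() {
    // available_parallelism() returns an estimate of how many threads can
    // usefully run at once (usually the number of logical CPUs)
    let cpu_count: usize = thread::available_parallelism()
        .map(|n| n.get())
        .unwrap_or(2); // fall back to the hard-coded value on error
    println!("cpu_count = {cpu_count}");
}
```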
We are ready to build our work-queue logic. First we need our mpsc `Sender` and `Receiver`
channels. We create an array holding all the senders (tx); later we will broadcast jobs through
this container. Then we move each receiver into a separate thread. The logic is as follows:
- we try to acquire the lock on `WORK_QUEUE`
- if successful, we take a job from the queue
- then we drop the lock and simulate the job by sleeping for a duration

Finally, we have the threads ready by adding them to the `threads` vector.
```rust
for cpu in 0..cpu_count {
let (tx, rx) = mpsc::channel::<()>();
broadcast.push(tx);

let th = thread::spawn(move || {
while rx.recv().is_ok() {
let mut lock = WORK_QUEUE.lock().unwrap();
if let Some(work) = lock.pop_front() {
std::mem::drop(lock);
println!("CPU #{cpu} got work {work}");
std::thread::sleep(Duration::from_secs(2));
println!("#{cpu} task finished");
} else {
println!("CPU #{cpu} found no work queue");
}
}
});

threads.push(th);
}
```
The next loop implements the work-queue simulation. We try to get the lock on the WORK_QUEUE, then
check whether fewer than 5 jobs are queued; if so, we add a new job to the queue. Note that we
consume the queue from the front and add jobs at the back, simulating a FIFO queue.
```rust
loop {
let sent = {
let mut lock = WORK_QUEUE.lock().unwrap();
let len = lock.len();
println!("Total Queue Item: #{len}");
if len < 5 {
lock.push_back("hello".to_string());
true
} else {
false
}
};

if sent {
broadcast.iter().for_each(|tx| tx.send(()).unwrap());
}
thread::sleep(Duration::from_secs(1));
}
```

## How to Run the Project
To run this project, simply execute the following command:
```bash
cd
cargo run
```