https://github.com/andrewssobral/dse

Single-process distributed system emulator with deterministic scheduling, fault injection, and trace-driven debugging in C++20.
consensus cpp20 deterministic-simulation distributed-systems emulator event-driven fault-injection leader-election network-partition simulation

# DSE

Distributed System Emulator in a Single Process.

`dse` is a small C++20 project that emulates a distributed system inside one process with a deterministic event loop. It is designed as a fast experimentation environment for leader election, fault injection, network partitions, and trace-driven debugging.

## What It Does

- Runs multiple logical nodes in one process.
- Uses a virtual clock and an event queue instead of real sockets or threads.
- Simulates:
  - message latency
  - message drops
  - node crash and recovery
  - network partitions and heal
- Exports a readable execution trace.
- Includes a toy leader election protocol for demonstration.

## Current Scope

This is a first MVP, not a full consensus implementation.

Current characteristics:
- single-process
- single-threaded runtime
- deterministic scheduling from a fixed seed
- toy leader election protocol with heartbeats and vote requests
- configurable network behavior for latency, drops, partitions, and reorder/FIFO delivery
- simple CLI demo runner with input validation
- 13 smoke/regression tests covering convergence, determinism, drops, reorder semantics, vote safety, and `run_until_idle` behavior

Not implemented yet:
- real coroutines for nodes
- real network transport
- JSON trace export
- GUI timeline or visualization
- full Raft or Paxos semantics

## Project Layout

```text
./dse
|-- CMakeLists.txt
|-- README.md
|-- include/dse
|   |-- leader_election.hpp
|   |-- sim.hpp
|   |-- trace.hpp
|   `-- types.hpp
|-- src
|   |-- leader_election.cpp
|   `-- sim.cpp
|-- app
|   `-- main.cpp
`-- tests
    `-- smoke.cpp
```

## Core Concepts

### Runtime

The runtime is centered around:
- a virtual clock (`Tick`)
- an ordered event queue
- explicit delivery/timer/action events
- a seeded random generator for repeatable latency and drop behavior

The system advances only when events are processed. This makes runs reproducible as long as the seed and scenario are the same.

The runtime exposes two execution styles:
- `run_for(delta)` advances simulated time by a fixed amount
- `run_until_idle(max_ticks = 10'000)` stops when the queue drains and returns `true`, or returns `false` if the safety cap is hit first
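The following is a minimal sketch of how such a loop could look, not the actual `dse` runtime. The names `MiniSim`, `Event`, `Later`, and `schedule` are hypothetical, and the sketch assumes the safety cap is measured in virtual ticks from the moment `run_until_idle` is called.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

using Tick = std::uint64_t;

// Hypothetical event record; the real dse types may differ.
struct Event {
    Tick at;                       // virtual time at which the event fires
    std::uint64_t seq;             // insertion order, breaks ties deterministically
    std::function<void()> action;  // work to perform when the event fires
};

// Comparator that puts the earliest (at, seq) pair on top of the queue.
struct Later {
    bool operator()(const Event& a, const Event& b) const {
        return a.at != b.at ? a.at > b.at : a.seq > b.seq;
    }
};

class MiniSim {
public:
    Tick now() const { return now_; }

    // Schedule `action` to run `delay` ticks after the current virtual time.
    void schedule(Tick delay, std::function<void()> action) {
        queue_.push(Event{now_ + delay, next_seq_++, std::move(action)});
    }

    // Process every event due within the next `delta` ticks, then move the
    // clock to the end of that window.
    void run_for(Tick delta) {
        const Tick deadline = now_ + delta;
        while (!queue_.empty() && queue_.top().at <= deadline) step();
        now_ = deadline;
    }

    // Drain the queue. Returns true once it is empty, or false if the next
    // event lies beyond the cap (assumed here to be counted in ticks).
    bool run_until_idle(Tick max_ticks = 10'000) {
        const Tick deadline = now_ + max_ticks;
        while (!queue_.empty()) {
            if (queue_.top().at > deadline) return false;
            step();
        }
        return true;
    }

private:
    void step() {
        Event ev = queue_.top();  // copy first: the action may schedule more events
        queue_.pop();
        assert(ev.at >= now_);    // virtual time never moves backwards
        now_ = ev.at;
        ev.action();
    }

    Tick now_ = 0;
    std::uint64_t next_seq_ = 0;
    std::priority_queue<Event, std::vector<Event>, Later> queue_;
};
```

Because ties are broken by insertion order rather than by anything time- or address-dependent, two runs that schedule the same events see them in the same order. A recurring timer that keeps rescheduling itself never drains the queue, which is the case the `false` return value covers.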

### Nodes

Each node is represented by a `Node` implementation. The current demo uses `LeaderElectionNode`, which tracks:
- role (`follower`, `candidate`, `leader`)
- current term
- voted peer
- known leader
- timer tokens to invalidate stale timers

On crash/recover, the node preserves its current term and remembered vote for that term. The stored vote is cleared only when a higher term is observed.
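As an illustration of the vote and timer-token rules described above, here is a hedged sketch. `ElectionState` and all of its members are hypothetical names, not the real `LeaderElectionNode` interface, and resetting the role and known leader on recovery is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

using NodeId = std::uint32_t;
using Term = std::uint64_t;

enum class Role { follower, candidate, leader };

// Hypothetical state holder; the real node likely tracks more than this.
struct ElectionState {
    Role role = Role::follower;
    Term current_term = 0;
    std::optional<NodeId> voted_for;     // vote remembered for current_term
    std::optional<NodeId> known_leader;
    std::uint64_t timer_token = 0;       // bumped to invalidate pending timers
    bool alive = true;

    // Observing a higher term is the only thing that clears the stored vote.
    void observe_term(Term term) {
        if (term > current_term) {
            current_term = term;
            voted_for.reset();
            role = Role::follower;
        }
    }

    // Grants at most one vote per term; repeating the same vote is allowed.
    bool handle_vote_request(NodeId candidate, Term term) {
        if (!alive || term < current_term) return false;
        observe_term(term);
        if (voted_for && *voted_for != candidate) return false;
        voted_for = candidate;
        return true;
    }

    // Returns the token to attach to a newly scheduled timer. Any timer
    // scheduled earlier now carries an outdated token.
    std::uint64_t arm_timer() { return ++timer_token; }

    // A timer event should be acted on only if its token is still current.
    bool timer_is_current(std::uint64_t token) const {
        return alive && token == timer_token;
    }

    // Crash keeps current_term and voted_for but invalidates pending timers.
    void crash() {
        alive = false;
        ++timer_token;
    }

    // Recovery starts as a follower with no known leader (an assumption of
    // this sketch); the term and the vote for that term are untouched.
    void recover() {
        assert(!alive);
        alive = true;
        role = Role::follower;
        known_leader.reset();
        ++timer_token;
    }
};
```

Keeping the vote across a crash is what prevents a recovered node from voting for a second candidate in the same term.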

### Network Model

The network model supports:
- bounded random latency
- probabilistic message drop
- link blocking through partitions
- optional reordering control:
  - `reorder = true`: random latency between `min_latency` and `max_latency`
  - `reorder = false`: fixed `min_latency`, preserving FIFO delivery order for messages sent at the same simulated time

There is no real I/O. All communication becomes scheduled delivery events.
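A rough sketch of how a send could be turned into a delivery decision under this model is shown below. `NetworkConfig`, `MiniNetwork`, `plan_delivery`, `block`, and `heal` are hypothetical names chosen for illustration; only `reorder`, `min_latency`, and `max_latency` come from the description above, and treating blocked links as symmetric is an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <random>
#include <set>
#include <utility>

using Tick = std::uint64_t;
using NodeId = std::uint32_t;

// Hypothetical configuration mirroring the options listed above.
struct NetworkConfig {
    Tick min_latency = 1;
    Tick max_latency = 5;
    double drop_probability = 0.0;
    bool reorder = true;
};

class MiniNetwork {
public:
    MiniNetwork(NetworkConfig cfg, std::uint64_t seed) : cfg_(cfg), rng_(seed) {
        assert(cfg_.min_latency <= cfg_.max_latency);
    }

    // Block the link between two nodes in both directions.
    void block(NodeId a, NodeId b) { blocked_.insert(key(a, b)); }

    // Remove every partition.
    void heal() { blocked_.clear(); }

    // Returns the tick at which the message should be delivered, or
    // std::nullopt if it is lost to a partition or a random drop.
    std::optional<Tick> plan_delivery(NodeId from, NodeId to, Tick now) {
        if (blocked_.count(key(from, to)) != 0) return std::nullopt;
        if (std::bernoulli_distribution(cfg_.drop_probability)(rng_)) return std::nullopt;
        if (!cfg_.reorder) return now + cfg_.min_latency;  // fixed latency keeps FIFO order
        std::uniform_int_distribution<Tick> latency(cfg_.min_latency, cfg_.max_latency);
        return now + latency(rng_);
    }

private:
    using Link = std::pair<NodeId, NodeId>;

    // Normalize the pair so (a, b) and (b, a) refer to the same link.
    static Link key(NodeId a, NodeId b) { return a < b ? Link{a, b} : Link{b, a}; }

    NetworkConfig cfg_;
    std::mt19937_64 rng_;
    std::set<Link> blocked_;
};
```

The returned tick would then be used to schedule a delivery event on the runtime's queue. One caveat of this sketch: the standard distributions are not guaranteed to produce identical sequences across standard library implementations, so replay is only expected to match on the same toolchain.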

## Build

From `./dse`:

```bash
cmake -S . -B build
cmake --build build
```

## Test

```bash
ctest --test-dir build --output-on-failure
```

You can also run the smoke binary directly:

```bash
./build/dse_smoke
```

Current coverage includes:
- normal, crash/recover, and partition/heal convergence
- replay determinism across repeated runs with the same seed
- convergence across multiple seeds
- convergence with probabilistic drops
- FIFO delivery behavior when `reorder` is disabled
- vote safety across crash/recover within the same term
- both success and safety-cap paths of `run_until_idle`
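The replay-determinism check can be pictured with a toy stand-in like the one below. It does not use the real `dse` API: `Trace`, `run_toy_scenario`, and `replays_identically` are hypothetical, and the "scenario" is just a seeded loop that records one trace line per message.

```cpp
#include <cstdint>
#include <random>
#include <string>
#include <vector>

using Trace = std::vector<std::string>;

// Stand-in for a scenario run: every random decision comes from one seeded
// engine, and each decision is written to the trace.
Trace run_toy_scenario(std::uint64_t seed, int messages) {
    std::mt19937_64 rng(seed);
    Trace trace;
    std::uint64_t now = 0;
    for (int i = 0; i < messages; ++i) {
        const std::uint64_t latency = 1 + rng() % 5;  // 1..5 ticks
        const bool dropped = rng() % 10 == 0;         // roughly one in ten
        now += 1;
        std::string line = "t=" + std::to_string(now) + " msg=" + std::to_string(i);
        line += dropped ? std::string(" DROP")
                        : " deliver_at=" + std::to_string(now + latency);
        trace.push_back(line);
    }
    return trace;
}

// Replay check: two runs with the same seed must yield identical traces.
bool replays_identically(std::uint64_t seed, int messages) {
    return run_toy_scenario(seed, messages) == run_toy_scenario(seed, messages);
}
```

Comparing whole traces rather than only the final leader makes a divergence visible at the first differing line, which is usually where debugging should start.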

## Run

Basic demo:

```bash
./build/dse_demo
```

Common variants:

```bash
./build/dse_demo --scenario normal
./build/dse_demo --scenario crash
./build/dse_demo --scenario partition
./build/dse_demo --scenario partition --ticks 120 --no-trace
./build/dse_demo --seed 123 --nodes 5 --ticks 180
```

Supported flags:
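- `--seed`: numeric seed, for example `--seed 123`
- `--nodes`: number of nodes, for example `--nodes 5`
- `--ticks`: number of simulated ticks, for example `--ticks 180`
- `--scenario`: one of `normal`, `crash`, or `partition`
- `--no-trace`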

Invalid numeric values, unknown options, unknown scenarios, and `--nodes 0` are rejected with a non-zero exit status.
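Strict validation of this kind can be sketched with `std::from_chars`, which rejects signs, trailing characters, and overflow for unsigned types. The helpers `parse_u64`, `parse_node_count`, and `is_known_scenario` are hypothetical and are not taken from the project's `main.cpp`.

```cpp
#include <charconv>
#include <cstdint>
#include <optional>
#include <string_view>
#include <system_error>

// Parses an unsigned decimal number. Returns std::nullopt for empty input,
// a leading sign, trailing characters, or a value that does not fit.
std::optional<std::uint64_t> parse_u64(std::string_view text) {
    std::uint64_t value = 0;
    const char* first = text.data();
    const char* last = first + text.size();
    const auto [ptr, ec] = std::from_chars(first, last, value);
    if (ec != std::errc{} || ptr != last) return std::nullopt;
    return value;
}

// A node count must be a valid number and at least 1.
std::optional<std::uint64_t> parse_node_count(std::string_view text) {
    const auto nodes = parse_u64(text);
    if (!nodes || *nodes == 0) return std::nullopt;
    return nodes;
}

// Only the three documented scenarios are accepted.
bool is_known_scenario(std::string_view name) {
    return name == "normal" || name == "crash" || name == "partition";
}
```

A caller would print an error and return a non-zero status from `main` whenever one of these helpers reports failure.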

## Example Scenarios

### `normal`

Starts a cluster and lets the election stabilize.

Expected result:
- one node becomes leader
- the others converge as followers

### `crash`

Crashes one node and later recovers it.

Expected result:
- the cluster keeps operating
- leadership converges again after the failure window

### `partition`

Splits the cluster into two groups and later heals the network.

Expected result:
- multiple elections may happen during the partition
- after heal, the system should converge back to one leader

## Design Notes

- The runtime is intentionally explicit and simple.
- Timer tokens are used to ignore stale scheduled timers after state changes.
- `run_until_idle` has a safety cap so protocol loops with recurring timers do not block forever.
- The protocol is meant to demonstrate scheduler behavior and fault injection, not to claim consensus correctness.
- The most important property right now is deterministic replay of a scenario.

## Next Steps

Good follow-up work:
- add JSON trace export
- add richer metrics
- add timeline visualization
- replace the toy protocol with a stronger protocol
- introduce coroutine-based node execution
- add property-based or fuzz-style scenario tests

## Why This Project Exists

This project is meant to become a practical lab for distributed systems experiments:
- build fast scenarios
- replay failures deterministically
- inspect traces
- iterate on protocol ideas without deploying a real cluster