# Real-Time OS (RTOS) for Raspberry Pi 5

**Build and Benchmarking Suite for Deterministic Real-Time Systems**

> Award-winning project: [CATERPILLAR TECH CHALLENGE 2025 Winners](https://www.linkedin.com/posts/shwetank-shekhar-002b9b203_caterpillartechchallenge-caterpillar100-rtos-activity-7357453575540658177-fU2y?utm_source=share&utm_medium=member_desktop&rcm=ACoAADP3l0IB_pF2cEhbDCVtuh9m-Vzfyl9vxcI)

---

## Table of Contents

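- [Overview](#overview)
- [System Specifications](#system-specifications)
- [Hardware Requirements](#hardware-requirements)
- [Quick Start](#quick-start)
- [Components](#components)
  - [Kernel Build](#kernel-build)
  - [Benchmarking Suite](#benchmarking-suite)
  - [End-to-End Inference Testing](#end-to-end-inference-testing)
- [Installation & Setup](#installation--setup)
- [Running Benchmarks](#running-benchmarks)
- [Performance Results](#performance-results)
- [Troubleshooting](#troubleshooting)
- [Project Structure](#project-structure)
- [Advanced Configuration](#advanced-configuration)
- [Performance Tuning Tips](#performance-tuning-tips)
- [References](#references)
- [Contributing](#contributing)
- [License](#license)
- [Acknowledgments](#acknowledgments)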

---

## Overview

This repository provides a **complete framework for building and validating a PREEMPT_RT real-time kernel** on Raspberry Pi 5, along with a **comprehensive benchmarking suite** to measure and characterize deterministic system behavior.

The RT kernel provides:
- **Microsecond-level scheduling latency** (<200 µs under full stress)
- **Deterministic scheduling** with bounded jitter
- **A foundation for real-time AI/ML workloads**, including monocular depth estimation, robotics, and safety-critical edge computing

### Key Features

- ✅ **Custom PREEMPT_RT kernel** (v6.15.y branch, native compilation on Pi 5)
- ✅ **Multi-scenario cyclictest benchmarking** (idle, light, moderate, heavy, thermal stress)
- ✅ **End-to-end latency measurement** for inference pipelines
- ✅ **Comprehensive statistical analysis** (percentiles, jitter, WCET)
- ✅ **Thermal and power consumption tracking**
- ✅ **CPU isolation and frequency scaling support**

---

## System Specifications

### Operating System & Kernel Details

| Parameter | Value |
|-----------|-------|
| **Operating System** | Raspberry Pi OS (64-bit), Debian Bookworm |
| **Kernel Version** | 6.15.0-rc7-v8-16k-NTP+ |
| **Architecture** | aarch64 (64-bit) |
| **Build Method** | Native compilation on Raspberry Pi 5 |
| **Real-Time Model** | PREEMPT_RT (Full Real-Time Preemption) |

### Critical Kernel Configurations

| Configuration | Setting | Purpose |
|---|---|---|
| `CONFIG_PREEMPT_RT` | `y` | Full kernel preemption for real-time scheduling |
| `CONFIG_HZ_1000` | `y` | 1000 Hz scheduler tick (1 ms tick period) |
| `CONFIG_NO_HZ_FULL` | `y` | Tickless kernel on isolated cores |
| `CONFIG_NTP_PPS` | `y` | Kernel PPS (Pulse Per Second) timing support |
| `CONFIG_PPS_CLIENT_GPIO` | `y` | GPIO-based precise time synchronization |
| CPU Governor | `performance` | Keeps cores at maximum frequency (no frequency scaling during tests) |
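
The options in the table can be checked programmatically against a kernel config file. The following is a minimal sketch and not part of the repository: `check_kernel_config` and `REQUIRED_OPTIONS` are hypothetical names, and the config text is assumed to come from a file such as `/boot/config-$(uname -r)`.

```python
# Hypothetical helper (not part of this repository): report which of the
# options from the table above are set to "y" in a kernel config.
REQUIRED_OPTIONS = (
    "CONFIG_PREEMPT_RT",
    "CONFIG_HZ_1000",
    "CONFIG_NO_HZ_FULL",
    "CONFIG_NTP_PPS",
    "CONFIG_PPS_CLIENT_GPIO",
)


def check_kernel_config(config_text, required=REQUIRED_OPTIONS):
    """Return {option: True/False} depending on whether `option=y` is present."""
    enabled = set()
    for line in config_text.splitlines():
        line = line.strip()
        # Disabled options appear as comments: "# CONFIG_FOO is not set".
        if line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if value == "y":
            enabled.add(name)
    return {option: option in enabled for option in required}


# Small in-memory sample; on the Pi, pass the contents of the config file.
sample = """CONFIG_PREEMPT_RT=y
CONFIG_HZ_1000=y
# CONFIG_NO_HZ_FULL is not set
CONFIG_NTP_PPS=y
"""
for option, ok in check_kernel_config(sample).items():
    print(f"{option}: {'ok' if ok else 'MISSING'}")
```

Options built as modules (`=m`) are reported as missing here, which is a deliberate simplification since the table lists every option as `y`.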

---

## Hardware Requirements

| Component | Specification |
|-----------|---------------|
| **CPU** | Raspberry Pi 5 (4-core ARM Cortex-A76 @ 2.4 GHz) |
| **RAM** | 8GB LPDDR4X-4267 SDRAM (minimum) |
| **Storage** | 64GB microSD card (minimum) |
| **Cooling** | Official Raspberry Pi Active Cooler (recommended) |
| **Power** | 27W USB-C PSU (Pi 5 recommended supply) |
| **Optional** | 128GB USB 3.2 for logging/datasets, GPIO test hardware |

---

## Quick Start

### Option 1: Automated Build (Recommended)

```bash
chmod +x build_rt_kernel.sh
./build_rt_kernel.sh
```

The script handles:
- Dependency installation
- Kernel source cloning
- Configuration & compilation
- Boot directory setup
- System reboot into RT kernel

**⏱️ Estimated time:** 45–90 minutes (native compilation on Pi 5)

### Option 2: Manual Build

Follow the step-by-step instructions in [Kernel Build Process](Kernel_Build_Process.md).

---

## Components

### Kernel Build

**Purpose:** Compile a PREEMPT_RT kernel optimized for Raspberry Pi 5.

**Files:**
- [`build_rt_kernel.sh`](build_rt_kernel.sh) — Automated build script
- [`README.md`](#) — This file (detailed instructions)

**Key Optimizations:**
- `-O3` compiler optimization + `-march=native` for Pi 5 CPU features
- `-j6` parallel compilation (1.5× CPU cores for stability)
- Native compilation avoids setting up a cross-compilation toolchain
- Device tree blobs (DTBs) tailored for BCM2712 (Pi 5 SoC)

**Output:**
- RT kernel binary: `/boot/firmware/kernel_2712-NTP.img`
- Modules: `/lib/modules/$(uname -r)/`
- Device trees: `/boot/firmware/NTP/`

---

### Benchmarking Suite

**Purpose:** Comprehensive validation of real-time determinism under multiple load scenarios.

**File:** [`Enhanced_rtos_benchmark_v2.1.sh`](Enhanced_rtos_benchmark_v2.1.sh)

**What It Measures:**

| Metric | Tool | Purpose |
|--------|------|---------|
| **Scheduling Latency** | `cyclictest` | RT timer interrupt response time under load |
| **Thermal Profile** | `vcgencmd` | Temperature and throttling behavior |
| **System Stress** | `stress-ng` | CPU, memory, I/O load simulation |
| **Jitter Distribution** | Statistical analysis | Percentile latencies (50th–99.99th) |
| **Power Consumption** | `vcgencmd` monitoring | Voltage, frequency, throttle events |
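
For the scheduling-latency row, the per-thread summary that `cyclictest` prints can be reduced to min/avg/max values with a short parser. This is a sketch under assumptions: it targets the standard summary line format of `cyclictest`, while the exact layout of this repository's `cyclictest_raw.txt` is not documented here. `parse_cyclictest_summary` is a hypothetical name.

```python
import re

# Matches cyclictest's per-thread summary line, e.g.
# "T: 0 ( 1234) P:99 I:1000 C: 600000 Min:      1 Act:    2 Avg:    1 Max:      16"
SUMMARY_RE = re.compile(
    r"T:\s*(?P<thread>\d+).*?"
    r"Min:\s*(?P<min>\d+)\s+Act:\s*(?P<act>\d+)\s+"
    r"Avg:\s*(?P<avg>\d+)\s+Max:\s*(?P<max>\d+)"
)


def parse_cyclictest_summary(text):
    """Return {thread_id: {"min": .., "avg": .., "max": ..}} in microseconds."""
    results = {}
    for line in text.splitlines():
        match = SUMMARY_RE.search(line)
        if match:
            # Later lines for the same thread overwrite earlier ones, so the
            # final summary wins when cyclictest printed live updates.
            results[int(match["thread"])] = {
                key: int(match[key]) for key in ("min", "avg", "max")
            }
    return results
```

Lines that do not match the pattern (headers, histogram rows) are ignored, so the function can be fed the whole raw output.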

**Benchmark Scenarios:**

1. **Idle**: no load, baseline performance (mean ≈ 2 µs)
2. **Light**: 1 CPU core + light I/O (mean ≈ 1 µs)
3. **Moderate**: 2 CPU cores + medium I/O (mean ≈ 1.2 µs, 99.9th percentile 9 µs)
4. **Heavy**: 3 CPU cores + high memory pressure + I/O (mean ≈ 1.3 µs, 99.9th percentile 11 µs)
5. **Thermal**: sustained load to trigger thermal throttling

**Output Structure:**
```
rtos_benchmark_YYYYMMDD_HHMMSS/
├── system_info.txt              # Hardware/kernel configuration
├── thermal_power_log.csv        # Continuous monitoring (5s intervals)
├── power_thermal_stats.txt      # Power analysis & throttle events
├── statistical_summary.txt      # Complete percentile statistics
├── statistical_summary.json     # Machine-readable results
├── cyclictest_idle/
│   ├── cyclictest.json
│   ├── cyclictest_raw.txt
│   ├── min_latency.txt
│   ├── avg_latency.txt
│   └── max_latency.txt
├── cyclictest_light/
├── cyclictest_moderate/
├── cyclictest_heavy/
└── cyclictest_thermal/
```
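
The per-scenario files in this layout can be gathered into one dictionary for quick comparison. The sketch below is illustrative only: `collect_latencies` is a hypothetical helper, and it assumes each `*_latency.txt` file holds a single number in microseconds, which the README does not confirm.

```python
from pathlib import Path

SCENARIOS = ("idle", "light", "moderate", "heavy", "thermal")
METRICS = ("min", "avg", "max")


def collect_latencies(result_dir):
    """Read cyclictest_<scenario>/<metric>_latency.txt files into a dict.

    Assumption: each file contains a single number (microseconds).
    Missing or unparsable files are skipped.
    """
    summary = {}
    for scenario in SCENARIOS:
        for metric in METRICS:
            path = Path(result_dir) / f"cyclictest_{scenario}" / f"{metric}_latency.txt"
            try:
                value = float(path.read_text().strip())
            except (OSError, ValueError):
                continue
            summary.setdefault(scenario, {})[metric] = value
    return summary
```

Calling `collect_latencies("rtos_benchmark_YYYYMMDD_HHMMSS")` on a finished run would return one entry per scenario directory that contains readable files.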

**Key Features:**
- ✅ Pre-flight validation (RT kernel, tools, CPU isolation)
- ✅ Background thermal monitoring (continuous)
- ✅ Per-scenario stress load management
- ✅ Configurable test duration (default: 600s)
- ✅ JSON export for automated analysis

---

### End-to-End Inference Testing

**Purpose:** Measure complete latency pipeline for AI/ML inference (e.g., depth estimation).

**File:** [`e2e_inference_benchmark.py`](e2e_inference_benchmark.py)

**What It Measures:**

```
Frame Capture → Preprocessing → Inference → Postprocessing → Decision
      ↓               ↓             ↓             ↓             ↓
            TFLite runtime + OpenCV benchmarking
                              ↓
          Total E2E Latency + Component Breakdown
```

**Breakdown Components:**
- **Preprocessing:** Frame crop, resize, normalization (≈5 ms mean in the reported results)
- **Inference:** TFLite model inference on CPU (≈111 ms mean)
- **Postprocessing:** Depth alignment, ROI extraction, threshold decision (≈0.3 ms mean)
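
A per-stage breakdown like this can be obtained by wrapping each stage in a monotonic timer. The following is a minimal sketch, not the code in `e2e_inference_benchmark.py`: `time_stages` is a hypothetical helper, and the three lambda stages are placeholders standing in for the OpenCV and TFLite calls.

```python
import time


def time_stages(frame, stages):
    """Run `stages` (name, callable) in order and time each one.

    Returns (final_output, timings), where timings maps each stage name to
    elapsed microseconds and adds a "total" entry for the whole pipeline.
    """
    timings = {}
    data = frame
    start_total = time.perf_counter_ns()
    for name, func in stages:
        start = time.perf_counter_ns()
        data = func(data)
        timings[name] = (time.perf_counter_ns() - start) / 1_000
    timings["total"] = (time.perf_counter_ns() - start_total) / 1_000
    return data, timings


# Placeholder stages; a real pipeline would call OpenCV and the TFLite
# interpreter here instead.
stages = [
    ("preproc", lambda frame: [pixel / 255.0 for pixel in frame]),
    ("inference", lambda tensor: [value * 2.0 for value in tensor]),
    ("postproc", lambda depth: min(depth) < 0.5),
]
decision, timings = time_stages([0, 64, 128, 255], stages)
print(decision, timings)
```

`time.perf_counter_ns` is used because it is monotonic and avoids floating-point rounding until the final conversion to microseconds.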

**Configuration:**
```python
MODEL_PATH = "ADALITE_TFLITE.tflite"
MODEL_INPUT_HEIGHT = 256
MODEL_INPUT_WIDTH = 256
SAMPLE_COUNT = 1000
```

**Output:**
- `e2e_inference_latency.csv` — Per-sample breakdown (Sample, Total, Preproc, Inference, Postproc)
- `e2e_inference_stats.txt` — Statistical summary with percentiles & throughput
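
The per-sample CSV lends itself to a quick summary with the standard library. This sketch assumes the header row uses exactly the column names listed above (`Sample, Total, Preproc, Inference, Postproc`); `summarize_latency_csv` is a hypothetical name and not part of the repository.

```python
import csv
import io
import statistics

COLUMNS = ("Total", "Preproc", "Inference", "Postproc")


def summarize_latency_csv(handle):
    """Return {column: {"mean": .., "median": .., "max": ..}} for a CSV file object."""
    samples = {column: [] for column in COLUMNS}
    for row in csv.DictReader(handle):
        for column in COLUMNS:
            samples[column].append(float(row[column]))
    return {
        column: {
            "mean": statistics.fmean(values),
            "median": statistics.median(values),
            "max": max(values),
        }
        for column, values in samples.items()
        if values
    }


# Tiny in-memory example; on the Pi, pass open("e2e_inference_latency.csv").
demo = io.StringIO(
    "Sample,Total,Preproc,Inference,Postproc\n"
    "0,116000,5000,110700,300\n"
    "1,118000,5100,112600,300\n"
)
print(summarize_latency_csv(demo)["Total"])
```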

**Example Usage:**
```bash
python3 e2e_inference_benchmark.py
# Outputs: e2e_inference_latency.csv, e2e_inference_stats.txt
```

---

## Installation & Setup

### Prerequisites

**Hardware:**
- Raspberry Pi 5 (8GB RAM minimum)
- 64GB microSD with Raspberry Pi OS (Bookworm, 64-bit)
- Active cooling (fan or heatsink)

**Software Dependencies (Auto-installed by scripts):**
```bash
# Kernel build dependencies
git bc bison flex libssl-dev make libncurses5-dev raspberrypi-kernel-headers

# Benchmarking dependencies
rt-tests # Contains cyclictest
stress-ng # Load generation
python3 # Analysis scripts
python3-opencv # Frame processing (for e2e_inference_benchmark.py)
tflite-runtime # For inference benchmarking
```

### Step 1: Initialize Environment

```bash
sudo apt update && sudo apt upgrade -y

# Install essential build tools
sudo apt install -y git bc bison flex libssl-dev make libncurses5-dev

# Benchmarking tools
sudo apt install -y rt-tests stress-ng

# Python dependencies
pip3 install --break-system-packages \
    numpy opencv-python tflite-runtime pandas
```

### Step 2: Clone or Download Repository

```bash
# Clone from GitHub (if available)
git clone https://github.com/ShekharShwetank/RTOS.git
cd RTOS

# Or extract if provided as ZIP
unzip RTOS.zip && cd RTOS
```

### Step 3: Build RT Kernel

```bash
chmod +x build_rt_kernel.sh
./build_rt_kernel.sh
```

**What Happens:**
1. Downloads Raspberry Pi Linux v6.15.y
2. Applies BCM2712 (Pi 5) default configuration
3. Prompts for manual config (or use defaults)
4. Compiles kernel with `-O3 -march=native -j6`
5. Installs modules
6. Copies kernel + DTBs to `/boot/firmware/NTP/`
7. Reboots into RT kernel

**Troubleshooting:**
- If `menuconfig` appears: press Escape → Save → Exit (to use defaults)
- Build fails? Ensure `/boot/firmware/` has >500MB free
- Reboot hangs? Power-cycle the Pi, boot from the previous (known-good) SD card, and rebuild

### Step 4: Verify RT Kernel

After reboot:

```bash
uname -a
# Expected: ...PREEMPT_RT...

# Check RT config
grep CONFIG_PREEMPT_RT "/boot/config-$(uname -r)"
# Expected: CONFIG_PREEMPT_RT=y

# Check timer frequency
grep '^CONFIG_HZ' "/boot/config-$(uname -r)"
# Expected output includes: CONFIG_HZ_1000=y
```
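
The same check can be scripted, for example as a pre-flight step before a benchmark run. The sketch below is an assumption-laden illustration: `is_preempt_rt` and `running_kernel_is_rt` are hypothetical names, and `/sys/kernel/realtime` is read only if the running kernel exposes it.

```python
import platform
from pathlib import Path


def is_preempt_rt(version_string, realtime_flag=None):
    """Heuristic RT check.

    version_string: kernel build string (as printed by `uname -v`).
    realtime_flag: contents of /sys/kernel/realtime, if that file exists.
    """
    if realtime_flag is not None and realtime_flag.strip() == "1":
        return True
    return "PREEMPT_RT" in version_string


def running_kernel_is_rt():
    """Apply the heuristic to the kernel this script is running on."""
    flag_path = Path("/sys/kernel/realtime")
    flag = flag_path.read_text() if flag_path.exists() else None
    return is_preempt_rt(platform.version(), flag)


print("PREEMPT_RT kernel:", running_kernel_is_rt())
```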

---

## Running Benchmarks

### Benchmark 1: Multi-Scenario Cyclictest

```bash
./Enhanced_rtos_benchmark_v2.1.sh [DURATION_SECONDS]
```

**Examples:**

```bash
# Default: 600 seconds (10 minutes)
./Enhanced_rtos_benchmark_v2.1.sh

# Extended run: 3600 seconds (1 hour) for statistical significance
./Enhanced_rtos_benchmark_v2.1.sh 3600

# Quick test: 300 seconds (5 minutes)
./Enhanced_rtos_benchmark_v2.1.sh 300
```

**Output:**
```
Enhanced RTOS Benchmarking Suite v2.1
✓ PREEMPT_RT kernel detected
✓ Pre-flight checks complete
[1/7] Capturing System Baseline...
[2/7] Starting System Health Monitoring...
[3/7] Preparing GPIO End-to-End Latency Test...
[4/7] Running Multi-Scenario Cyclictest...
[5/7] Power Consumption Analysis...
[6/7] Computing Jitter and Percentile Statistics...
[7/7] Generating Final Report...

Results saved in: rtos_benchmark_*/
```

**Interpreting Results:**

```bash
# View statistical summary
cat rtos_benchmark_*/statistical_summary.txt

# View thermal profile
cat rtos_benchmark_*/power_thermal_stats.txt

# Analyze per-scenario latencies
cat rtos_benchmark_*/cyclictest_idle/min_latency.txt
cat rtos_benchmark_*/cyclictest_heavy/max_latency.txt
```

### Benchmark 2: End-to-End Inference Latency

```bash
python3 e2e_inference_benchmark.py
```

**Prerequisites:**
```bash
# Ensure TFLite model is in current directory
ls -lh ADALITE_TFLITE.tflite

# Or provide test video
ls -lh input_road.mp4
```

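**Output Example** (illustrative layout; the values measured on the Pi 5, listed under Performance Results, are in the millisecond range):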
```
End-to-End ADALITE Inference Latency Benchmark
============================================================
Model: ADALITE_TFLITE.tflite
Samples: 1000
Resolution: 256x256

Metric Total Preproc Inference Postproc
------------------------------------------------------
Mean 95.32 μs 2.15 μs 89.42 μs 3.75 μs
99th %ile 156.00 μs 5.20 μs 148.30 μs 8.90 μs
99.9th %ile 201.00 μs 8.10 μs 195.20 μs 12.50 μs

✓ Detailed statistics saved to: e2e_inference_stats.txt
✓ Raw data saved to: e2e_inference_latency.csv
```

---

## Performance Results

![Latency comparison](benchmarking/publication_results/figures/fig1_latency_comparison.png)
![Jitter analysis](benchmarking/publication_results/figures/fig2_jitter_analysis.png)
![Thermal and power profile](benchmarking/publication_results/figures/fig3_thermal_power.png)
![End-to-end latency](benchmarking/publication_results/figures/fig4_e2e_latency.png)

### Scheduling Latency Statistics Across Load Scenarios

| Scenario | Mean (µs) | Median (µs) | 95th %ile (µs) | 99th %ile (µs) | 99.9th %ile (µs) | WCET (µs) | Jitter (µs) |
|----------|-----------|-------------|----------------|----------------|------------------|-----------|-------------|
| **Idle** | 2.01 | 2.00 | 2.00 | 3.00 | 4.00 | 16.00 | 0.20 |
| **Light** | 1.02 | 1.00 | 1.00 | 2.00 | 4.00 | 23.00 | 0.23 |
| **Moderate** | 1.16 | 1.00 | 2.00 | 3.00 | 9.00 | 109.00 | 0.79 |
| **Heavy** | 1.34 | 1.00 | 3.00 | 5.00 | 11.00 | 76.00 | 1.01 |
| **Thermal** | 1.55 | 2.00 | 2.00 | 2.00 | 5.00 | 20.00 | 0.55 |

**Table Notes:**
- WCET: reported here as the maximum observed latency (an empirical worst case, not a formally proven bound)
- Jitter: Standard deviation of scheduling latencies
- All measurements conducted on Raspberry Pi 5 with PREEMPT_RT Linux kernel v6.15
- Each scenario tested with 3–6 million samples
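
The columns of this table can be reproduced from raw latency samples with a few lines of Python. This is a sketch under stated assumptions: it uses the nearest-rank percentile definition and the population standard deviation, and the benchmarking script may use different conventions. `latency_stats` is a hypothetical name.

```python
import math
import statistics


def latency_stats(samples_us, percentiles=(50, 95, 99, 99.9)):
    """Summarize latency samples (µs) using the same columns as the table above."""
    ordered = sorted(samples_us)
    stats = {
        "mean": statistics.fmean(ordered),
        "max": ordered[-1],                    # the "WCET" column
        "jitter": statistics.pstdev(ordered),  # standard deviation
    }
    for p in percentiles:
        # Nearest-rank percentile: the smallest sample with at least p% of
        # all samples at or below it.
        rank = max(1, math.ceil(p * len(ordered) / 100))
        stats[f"p{p}"] = ordered[rank - 1]
    return stats
```

Because `cyclictest` reports whole microseconds, most percentiles land on small integers, which is why the median and 95th percentile can coincide in the table.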

### End-to-End ADALITE Inference Latency

| Metric | Latency (µs) | Latency (ms) |
|--------|-------------|------------|
| **Mean** | 116,436 | 116.4 |
| **Median** | 115,899 | 115.9 |
| **95th percentile** | 118,885 | 118.9 |
| **99th percentile** | 140,788 | 140.8 |
| **99.9th percentile** | 192,077 | 192.1 |
| **Maximum (WCET)** | 221,341 | 221.3 |

**Component Breakdown (Mean):**

| Component | Latency (µs) | Latency (ms) | % of Total |
|-----------|-------------|------------|-----------|
| Preprocessing | 5,066 | 5.1 | 4.4% |
| Inference | 111,076 | 111.1 | 95.4% |
| Postprocessing | 282 | 0.3 | 0.2% |

**Table Notes:**
- Measured over 1,000 samples processing KITTI road scenes
- Inference component dominates at 95.4% of total latency
- The mean end-to-end latency of 116.4 ms corresponds to roughly 8.6 frames per second, which suits robotics and autonomous-system tasks that can operate at that rate
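
The percentage column and the frame rate implied by the mean latency can be re-derived from the table values. The snippet below is a small worked calculation; the helper names are illustrative and the throughput figure assumes frames are processed strictly one after another.

```python
# Mean latencies in microseconds, copied from the tables above.
TOTAL_MEAN_US = 116_436
COMPONENTS_US = {"Preprocessing": 5_066, "Inference": 111_076, "Postprocessing": 282}


def share_of_total(component_us, total_us):
    """Percentage of the total latency spent in one component."""
    return 100.0 * component_us / total_us


def frames_per_second(latency_us):
    """Throughput when frames are processed strictly one after another."""
    return 1_000_000 / latency_us


for name, value in COMPONENTS_US.items():
    print(f"{name}: {share_of_total(value, TOTAL_MEAN_US):.1f}%")
print(f"Mean throughput: {frames_per_second(TOTAL_MEAN_US):.1f} frames/s")
# Prints 4.4%, 95.4%, 0.2% and 8.6 frames/s
```

Applying `frames_per_second` to the maximum latency (221,341 µs) gives about 4.5 frames/s, which is the rate to plan for if every frame must meet its deadline.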

---

## Troubleshooting

### Issue: PREEMPT_RT kernel not detected

```bash
uname -a
# Should show: PREEMPT_RT

# If not shown:
grep CONFIG_PREEMPT_RT "/boot/config-$(uname -r)"
# Should output: CONFIG_PREEMPT_RT=y
```

**Solution:**
1. Verify boot configuration: `cat /boot/firmware/config.txt`
2. Check kernel copy: `ls -lh /boot/firmware/kernel_2712-NTP.img`
3. Rebuild if needed: `./build_rt_kernel.sh`

### Issue: cyclictest reports high latencies (>1000 µs)

**Causes:** CPU frequency scaling, interrupts handled on the measured core, kernel debugging options left enabled

**Solutions:**
```bash
# Check frequency governor
cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
# Should be: performance

# Force performance governor if needed
echo performance | sudo tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor

# Verify CPU isolation
cat /sys/devices/system/cpu/isolated
# Should include: 3 (isolated core)

# Add to the single line in /boot/firmware/cmdline.txt if missing, then reboot:
# isolcpus=3 nohz_full=3 rcu_nocbs=3
```
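
The two sysfs checks above can also be scripted. The following is a minimal sketch with hypothetical helper names: `parse_cpu_list` expands the kernel's CPU list notation (as found in `/sys/devices/system/cpu/isolated`), and `read_governors` collects the governor of every CPU that exposes cpufreq.

```python
from pathlib import Path


def parse_cpu_list(text):
    """Expand a kernel CPU list such as "0-2,5" into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def read_governors(cpu_root="/sys/devices/system/cpu"):
    """Return {cpu_name: governor} for every CPU exposing cpufreq."""
    governors = {}
    for path in sorted(Path(cpu_root).glob("cpu[0-9]*/cpufreq/scaling_governor")):
        governors[path.parent.parent.name] = path.read_text().strip()
    return governors
```

On a correctly tuned system, `parse_cpu_list` applied to the `isolated` file would contain `3`, and every value returned by `read_governors()` would be `performance`.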

### Issue: Benchmarking script fails (missing tools)

```bash
# Install missing dependencies
sudo apt install -y rt-tests stress-ng python3 python3-pip

# Install Python packages
pip3 install --break-system-packages \
    numpy pandas opencv-python tflite-runtime
```

### Issue: Thermal throttling detected

```
Throttle Events: 12 (ARM frequency capped)
```

**Solutions:**
1. Attach active cooler to Pi 5
2. Improve airflow (case ventilation)
3. Reduce test duration temporarily
4. Disable heavy load scenarios (thermal test)
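
Throttle events such as "ARM frequency capped" correspond to bits in the value reported by `vcgencmd get_throttled`. The sketch below decodes that bitmask; `decode_throttled` is a hypothetical helper, and how the benchmarking script itself counts throttle events is not shown in this README.

```python
# Bit positions reported by `vcgencmd get_throttled`.
THROTTLE_FLAGS = {
    0: "under-voltage detected",
    1: "ARM frequency capped",
    2: "currently throttled",
    3: "soft temperature limit active",
    16: "under-voltage has occurred",
    17: "ARM frequency capping has occurred",
    18: "throttling has occurred",
    19: "soft temperature limit has occurred",
}


def decode_throttled(output):
    """Decode a string like "throttled=0x50005" into a list of flag names."""
    value = int(output.strip().split("=", 1)[1], 16)
    return [name for bit, name in sorted(THROTTLE_FLAGS.items()) if value & (1 << bit)]


print(decode_throttled("throttled=0x20002"))
```

Bits 0 to 3 describe the current state, while bits 16 to 19 are sticky and record that the condition occurred at some point since boot.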

---

## Project Structure

```
RTOS/
├── README.md                          # This file
├── build_rt_kernel.sh                 # Automated kernel build & deployment
├── Enhanced_rtos_benchmark_v2.1.sh    # Multi-scenario benchmarking suite
├── e2e_inference_benchmark.py         # End-to-end inference latency measurement
├── benchmarking/                      # Example benchmark results
│   ├── system_info.txt
│   ├── statistical_summary.txt
│   ├── statistical_summary.json
│   ├── power_thermal_stats.txt
│   ├── cyclictest_idle/
│   ├── cyclictest_light/
│   ├── cyclictest_moderate/
│   ├── cyclictest_heavy/
│   ├── cyclictest_thermal/
│   └── publication_results/
│       ├── ANALYSIS_SUMMARY.txt
│       ├── figures/                   # PNG/PDF plots
│       └── latex_tables/              # Publication-ready LaTeX tables
├── assets/                            # Supporting files (if any)
└── .git/                              # Version control
```

---

## Advanced Configuration

### CPU Isolation for Ultra-Low Latency

To dedicate core 3 entirely to RT tasks, append the following parameters to the existing single line in `/boot/firmware/cmdline.txt` (the file must remain one line):
```
isolcpus=3 nohz_full=3 rcu_nocbs=3 kthread_cpus=0-2 irqaffinity=0-2
```

Then reboot (a kernel rebuild is not required for command-line changes). This prevents:
- Kernel threads from running on core 3
- IRQ handling on core 3
- Timer ticks on core 3

**Expected Latency Improvement:** 5–15% reduction in jitter under stress
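
Isolating core 3 only helps if the RT task actually runs there. The sketch below shows one way a Python process might pin itself to a core and request a real-time scheduling class; `pin_to_core`, `try_realtime_priority`, and the priority value 80 are illustrative assumptions, not settings taken from this repository.

```python
import os


def pin_to_core(core):
    """Restrict the calling process to a single CPU core (Linux only)."""
    os.sched_setaffinity(0, {core})
    return os.sched_getaffinity(0)


def try_realtime_priority(priority=80):
    """Switch to SCHED_FIFO; returns False when privileges are missing."""
    try:
        os.sched_setscheduler(0, os.SCHED_FIFO, os.sched_param(priority))
    except PermissionError:
        return False
    return True
```

A benchmark script might call `pin_to_core(3)` followed by `try_realtime_priority()` at startup; the second call normally needs root or the `CAP_SYS_NICE` capability.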

### GPIO PPS Timing (Advanced)

For synchronized clock with external PPS source:

```bash
# Install PPS tools
sudo apt install pps-tools gpsd

# Connect GPIO pin 17 to PPS source
# Verify detection
sudo ppstest /dev/pps0
```

Then configure NTP:
```bash
# Edit /etc/ntp.conf
# Add: server 127.127.22.0 minpoll 4 maxpoll 4
# (127.127.22.x is ntpd's PPS/ATOM reference clock driver)
```

---

## Performance Tuning Tips

| Tuning | Expected Gain | Difficulty |
|--------|---------------|-----------|
| CPU isolation (isolcpus) | 5–15% lower latency | Easy |
| Performance governor | 10% lower latency | Easy |
| Disable USB hub scanning | 5% lower jitter | Easy |
| Move IRQs off core | 2–5% improvement | Medium |
| Disable unused CPUs | 3% lower idle latency | Medium |
| Custom cyclictest priority | Varies | Hard |

---

## References

- **Linux Kernel Documentation:** https://www.kernel.org/doc/html/latest/
- **RT-Tests (cyclictest):** https://wiki.linuxfoundation.org/realtime/documentation/howto/tools/rt-tests
- **Raspberry Pi Linux:** https://github.com/raspberrypi/linux
- **PREEMPT_RT Wiki:** https://rt.wiki.kernel.org/

---

## Contributing

Contributions, bug reports, and performance improvements are welcome. Submit via:
- GitHub Issues/PRs
- Email: shwetankshekharcode@gmail.com

---

## License

This project documentation and scripts are provided as-is for research and educational purposes.

---

## Acknowledgments

**Caterpillar Tech Challenge 2025 Winners** — Complete RTOS implementation and validation framework for real-time edge AI on embedded systems.

---