https://github.com/cong-or/hud
Find code blocking your Tokio workers. eBPF-powered, no instrumentation.
https://github.com/cong-or/hud
async cli debugging ebpf linux observability performance profiler rust tokio tui
Last synced: about 10 hours ago
JSON representation
Find code blocking your Tokio workers. eBPF-powered, no instrumentation.
- Host: GitHub
- URL: https://github.com/cong-or/hud
- Owner: cong-or
- License: mit
- Created: 2025-12-22T12:35:25.000Z (about 2 months ago)
- Default Branch: main
- Last Pushed: 2026-02-12T13:11:59.000Z (3 days ago)
- Last Synced: 2026-02-12T18:57:21.390Z (3 days ago)
- Topics: async, cli, debugging, ebpf, linux, observability, performance, profiler, rust, tokio, tui
- Language: Rust
- Homepage:
- Size: 6.25 MB
- Stars: 133
- Watchers: 0
- Forks: 1
- Open Issues: 2
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# hud
[](https://github.com/cong-or/hud/actions)
[](LICENSE)
[](docs/ARCHITECTURE.md)
Find what's blocking your Tokio runtime. Zero-instrumentation eBPF profiler.
> **Linux only.** This tool uses eBPF, which is a Linux kernel feature. It does not work on macOS or Windows.

## The Problem
Tokio uses cooperative scheduling. Tasks yield at `.await` points, trusting that work between awaits is fast. When it isn't—CPU-heavy code, sync I/O, blocking locks—one task starves the rest.
These bugs are silent. No errors, no panics—just degraded throughput. hud makes them visible.
## How It Works
Watches the Linux scheduler via eBPF. When a worker thread experiences high OS-level scheduling latency (time the thread waits in the kernel run queue, not Tokio's task queue), captures a stack trace. High scheduling latency is a symptom of blocking—when one task monopolizes a worker, others queue up waiting.
## Why hud?
Unlike [tokio-console](https://github.com/tokio-rs/console) or [tokio-blocked](https://github.com/theduke/tokio-blocked), hud requires no code changes—attach to any running Tokio process.
**Why not just use tokio-console?** It's the official tool and more accurate—it measures actual task poll durations. Use it if you can. But it requires adding `console-subscriber` and rebuilding.
**What about Tokio's unstable blocking detection?** Compile with `RUSTFLAGS="--cfg tokio_unstable"` and Tokio warns when task polls exceed a threshold. This catches the *blocker* directly, not victims—more accurate than hud. But it requires a rebuild, and only catches blocks exceeding the threshold during that run.
hud exists for profiling without code changes or rebuilds—staging environments, load testing, quick triage of a running process, or confirming blocking is even the problem before investing in instrumentation.
### When to use what
| Tool | Best for | Trade-off |
|------|----------|-----------|
| **hud** | Quick triage of running processes | Measures symptoms, not direct cause |
| **Tokio unstable detection** | Find the blocker directly | Requires rebuild with `tokio_unstable` |
| **tokio-console** | Precise task poll times | Requires code instrumentation |
| **perf + flamegraphs** | CPU profiling, broad analysis | Manual interpretation needed |
| **Custom metrics** | Production monitoring | Must know where to instrument |
Use hud to narrow down suspects, then dig deeper with instrumentation if needed.
## Requirements
**System:**
- Linux 5.8+
- x86_64 architecture
- Root privileges
**Your application needs debug symbols** (so hud can show function names):
```toml
# Cargo.toml
[profile.release]
debug = true
force-frame-pointers = true
```
> `debug = true` adds ~10-20% to binary size. `force-frame-pointers` adds ~1-2% runtime overhead. For production, you can swap in a debug-enabled binary temporarily for investigation.
## Install
**Option A: Pre-built binary** (no Rust toolchain needed)
```bash
curl -L https://github.com/cong-or/hud/releases/latest/download/hud-linux-x86_64.tar.gz | tar xz
sudo ./hud my-app
```
**Option B: Build from source**
```bash
git clone https://github.com/cong-or/hud.git && cd hud
cargo xtask build-ebpf --release && cargo build --release
sudo ./target/release/hud my-app
```
## Usage
```bash
# Profile by process name
sudo hud my-app
# Profile by PID
sudo hud --pid 1234
# Custom blocking threshold (default: 5ms)
sudo hud my-app --threshold 10 # less sensitive
sudo hud my-app --threshold 1 # more sensitive
# Rolling time window (only show last N seconds)
sudo hud my-app --window 30 # metrics decay when load stops
# Custom Tokio thread names are auto-detected. Override if needed:
sudo hud my-app --workers my-io-worker
# Headless mode (CI/scripting) - run for 60 seconds then exit
sudo hud my-app --headless --export trace.json --duration 60
```
See [Tuning](docs/TUNING.md) for threshold selection guide.
## Demo
Try hud with the included demo server (requires Option B):
```bash
# Build demo server - MUST be debug build (release inlines functions)
cargo build --example demo-server
./target/debug/examples/demo-server &
# Profile it (auto-detects PID and binary)
sudo ./target/release/hud demo-server
# Generate load (another terminal)
./hud/examples/load.sh
```
The demo server has intentionally blocking endpoints (`/hash`, `/compress`, `/read`, `/dns`). You'll see `bcrypt` and `blowfish` hotspots from the `/hash` endpoint, with `demo-server.rs` highlighted as the entry point in call traces.
> **Important**: The demo-server **must** be a debug build. Release builds aggressively inline functions, hiding your code from stack traces. If you don't see `demo-server.rs` in drilldowns, rebuild without `--release`.
Press `Q` to quit hud.
## Limitations
- Measures scheduling latency (a *symptom* of blocking), not blocking directly
- Captures the **victim's** stack, not the **blocker's**—if Task A blocks causing Task B to wait, you see Task B's stack. Look for patterns across multiple traces.
- System CPU pressure can cause false positives—look for consistent, repeatable traces
- Lock contention where threads sleep (not spin) may not appear
- Tokio 1.x only. Worker detection tries the default thread name prefix, then stack-based classification (looks for Tokio scheduler frames), then largest thread group heuristic. Use `--workers ` to skip auto-detection
- See [Troubleshooting](docs/TROUBLESHOOTING.md) for common issues
## Docs
- [Tuning](docs/TUNING.md) — Threshold selection, debugging workflow
- [Exports](docs/EXPORTS.md) — JSON format, before/after analysis
- [Troubleshooting](docs/TROUBLESHOOTING.md) — Common issues
- [Architecture](docs/ARCHITECTURE.md) — How it works internally
- [Development](docs/DEVELOPMENT.md) — Contributing
## Further Reading
- [Async: What is blocking?](https://ryhl.io/blog/async-what-is-blocking/) — Alice Ryhl's deep dive on blocking in async Rust
- [tokio::task::spawn_blocking](https://docs.rs/tokio/latest/tokio/task/fn.spawn_blocking.html) — Offload blocking I/O to a thread pool
- [rayon](https://docs.rs/rayon) — For parallelizable CPU work; call from within `spawn_blocking`, not directly from async code
- [Reducing tail latencies with automatic cooperative task yielding](https://tokio.rs/blog/2020-04-preemption) — Tokio's approach to preemption
## License
MIT or Apache-2.0