https://github.com/byebyebryan/powered-descent-lab
Native-first control and simulation lab for powered descent
- Host: GitHub
- URL: https://github.com/byebyebryan/powered-descent-lab
- Owner: byebyebryan
- Created: 2026-04-17T20:38:17.000Z (about 2 months ago)
- Default Branch: main
- Last Pushed: 2026-04-27T22:41:30.000Z (about 2 months ago)
- Last Synced: 2026-04-28T00:18:39.606Z (about 2 months ago)
- Language: Rust
- Size: 938 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Roadmap: docs/roadmap.md
# Powered Descent Lab
Powered Descent Lab (`pd-lab`) grows out of the bot and simulation work explored
in `pylander`: a native-first lab for deterministic 2D rocket flight,
controller development, scenario design, benchmarking, and replay/telemetry
analysis.
This project is not the eventual player-facing game. The split is intentional:
- the lab optimizes for determinism, throughput, interfaces, and evaluation
- the game can later optimize for feel, UX, content, and presentation
## Direction
Current design direction:
- Rust cargo workspace, native-first
- `clap` + `serde` + `tracing` style native tooling, with web reserved for static
report viewing rather than an interactive runtime
- fixed-step deterministic simulation
- controller updates may run at a lower fixed rate than physics, with commands
held between controller ticks
- a proper bot framework, not only a thin `Observation -> Command` callback
- one primary goal per scenario
- formal controller API with full immutable scenario context at setup time and a
compact per-tick observation
- controller outputs that can include status, phase, metrics, and report/debug
markers in addition to vehicle commands
- authored scenario packs, curated scenario families, and seeded regression
sweeps
- 1D heightfield terrain as the canonical world model, with richer query APIs
layered on top
- no LOD in v1 for controller-facing terrain data
- landing means stable touchdown on the designated target, based on
touchdown/contact-frame metrics rather than only world-frame `vx`/`vy`
- replay and trace artifacts as first-class outputs
- project-owned artifact schemas with lightweight OSS analysis tools layered on
top, rather than a production observability stack
- action and event logs as authoritative replay inputs, with sampled traces kept
as optional report/debug caches
- native multithreaded batch evaluation in `pd-eval`, especially for seeded
scenario sweeps
- basic static inspection/reporting as part of the near-term controller workflow,
not only a late polish phase
- static web reports over captured artifacts, with optional lightweight
trajectory inspection later, but no browser runtime target
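The fixed-step and held-command bullets above can be sketched in a few lines of Rust. This is a minimal illustration, not the `pd-lab` API: the `State`, `Command`, `step`, and `simulate` names, the 1D vertical dynamics, and the `MAX_ACCEL` value are all assumptions made for the example, and the plain closure stands in for what the project describes as a fuller bot framework.

```rust
/// Vehicle command held between controller ticks (hypothetical shape).
#[derive(Clone, Copy, Debug, PartialEq)]
struct Command {
    throttle: f64, // 0.0..=1.0
}

/// Minimal 1D vertical state, for illustration only.
#[derive(Clone, Copy, Debug, PartialEq)]
struct State {
    altitude_m: f64,
    vy_mps: f64,
}

/// Advance physics by one fixed step using semi-implicit Euler.
fn step(state: State, cmd: Command, dt: f64) -> State {
    const G: f64 = 9.80665; // standard gravity, m/s^2
    const MAX_ACCEL: f64 = 20.0; // hypothetical full-throttle acceleration, m/s^2
    let ay = cmd.throttle * MAX_ACCEL - G;
    let vy = state.vy_mps + ay * dt;
    State { altitude_m: state.altitude_m + vy * dt, vy_mps: vy }
}

/// Run `steps` physics ticks. The controller runs every `ctrl_every` ticks
/// (must be nonzero) and its last command is held in between.
fn simulate(
    mut state: State,
    steps: u32,
    ctrl_every: u32,
    dt: f64,
    mut controller: impl FnMut(&State) -> Command,
) -> (State, u32) {
    let mut cmd = Command { throttle: 0.0 };
    let mut controller_calls = 0;
    for tick in 0..steps {
        if tick % ctrl_every == 0 {
            cmd = controller(&state);
            controller_calls += 1;
        }
        state = step(state, cmd, dt);
    }
    (state, controller_calls)
}

fn main() {
    let start = State { altitude_m: 100.0, vy_mps: -10.0 };
    // Toy controller: full throttle while descending faster than 5 m/s.
    let ctrl = |s: &State| Command { throttle: if s.vy_mps < -5.0 { 1.0 } else { 0.0 } };
    let (a, calls) = simulate(start, 120, 4, 1.0 / 60.0, ctrl);
    let (b, _) = simulate(start, 120, 4, 1.0 / 60.0, ctrl);
    // Same inputs and the same fixed step give an identical final state.
    assert_eq!(a, b);
    println!("{a:?} after {calls} controller calls");
}
```

Because the step size and the controller cadence are both fixed, rerunning the same inputs reproduces the same state sequence, which is the property that makes action and event logs usable as replay inputs.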
## Why Reboot
`pylander` proved the problem was interesting, but it also mixed too many
concerns in one place:
- game runtime and presentation
- controller logic
- evaluation and benchmark orchestration
- plotting and trace tooling
- browser and Pygame delivery constraints
What mattered from that broader experimentation was the project split:
- native core
- controller layer
- evaluation and reporting layer
- thin presentation over generated artifacts
`pd-lab` borrows ideas, concepts, scenario lessons, and telemetry vocabulary
from `pylander`, but not its implementation or module layout.
It should also reuse the scenario lessons that proved useful in `pylander`
without treating the old scenario files as fixtures to transliterate directly.
## Scope
`pd-lab` owns:
- deterministic simulation
- controller and bot development
- scenario packs
- evaluation and benchmarking
- telemetry, traces, and replay artifacts
- controller telemetry and report/debug artifacts
`pd-lab` does not own:
- player progression or economy systems
- content-heavy mission design
- final game UX
- browser-first runtime constraints
## Docs
- [Architecture](docs/architecture.md)
- [Roadmap](docs/roadmap.md)
- [Progress](docs/progress.md)
- [Terminal Suite Design](docs/terminal_suite.md)
- [Early Design Scratchpad](docs/early_design.md)
## Report Serving
Generated reports live under `outputs/` and can be served locally with:
```bash
./scripts/serve-reports start
```
The script starts a simple HTTP server inside a named detached `tmux` session
and serves `outputs/` on `0.0.0.0:8000` by default. The root URL now lands on a
generated `outputs/index.html` page, and `/reports/` remains the clean
report-only subtree. The printed LAN URL resolves automatically when available.
Useful commands:
```bash
./scripts/serve-reports status
./scripts/serve-reports attach
./scripts/serve-reports stop
```
This is intentionally explicit. Agent skills or local tooling can call the same
script when they need a report server, but the repo-owned script remains the
canonical entrypoint.
Generated single-run, replay, and batch outputs also maintain `latest` links
under `outputs/` when written through the project CLIs, for example:
- `outputs/runs/latest/report.html`
- `outputs/replays/latest/report.html`
- `outputs/eval/latest/summary.json`
- `outputs/eval//runs/latest/report.html`
Stable HTML entrypoints also live under `outputs/reports/`, for example:
- `outputs/reports/index.html`
- `outputs/reports/runs/latest/`
- `outputs/reports/eval/latest/`
The root landing page is:
- `outputs/index.html`
## Batch Eval
`pd-eval` owns scenario packs, scenario-family expansion, seed sweeps, and
native multithreaded execution.
Example:
```bash
cargo run -p pd-eval -- run-pack fixtures/packs/terminal_bot_lab_suite.json --workers 4
```
`run-pack` now writes stable review output to `outputs/eval//`, but stores
the actual batch artifacts under:
- `outputs/eval/cache///`
By default it will:
- reuse a complete candidate cache when the resolved pack digest matches
- try `--compare-ref auto`
  - on a dirty workspace, compare against the clean `HEAD` cache if it exists
  - on a clean workspace, compare against the previous clean commit cache if it
    exists
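The `--compare-ref auto` rules above amount to a small decision table. The sketch below is one reading of those rules, not the `pd-eval` implementation: the `CompareTarget` enum and `resolve_auto_compare` function are hypothetical names, and falling back to no baseline when the preferred cache is missing is an assumption.

```rust
/// Which cached baseline an auto compare would pick (hypothetical type).
#[derive(Debug, PartialEq)]
enum CompareTarget {
    HeadCache,
    PreviousCleanCommitCache,
    NoBaseline,
}

/// Resolve the compare target from workspace state and cache availability.
fn resolve_auto_compare(
    workspace_dirty: bool,
    head_cache_exists: bool,
    previous_clean_cache_exists: bool,
) -> CompareTarget {
    match (workspace_dirty, head_cache_exists, previous_clean_cache_exists) {
        // Dirty workspace: compare against the clean HEAD cache if present.
        (true, true, _) => CompareTarget::HeadCache,
        // Clean workspace: compare against the previous clean commit cache.
        (false, _, true) => CompareTarget::PreviousCleanCommitCache,
        // Assumption: otherwise the run proceeds without a baseline.
        _ => CompareTarget::NoBaseline,
    }
}

fn main() {
    println!("{:?}", resolve_auto_compare(true, true, false));
    println!("{:?}", resolve_auto_compare(false, false, true));
    println!("{:?}", resolve_auto_compare(false, true, false));
}
```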
Add `--enforce-regression-policy` when a run should exit nonzero if the
resolved compare target fails the default regression gate. That flag requires
an explicit or cached compare baseline.
After a dirty run becomes the new checkpoint, promote it into the clean commit
key:
```bash
cargo run -p pd-eval -- promote-cache fixtures/packs/terminal_bot_lab_suite.json
```
Run the same matrix with the full seed tier:
```bash
cargo run -p pd-eval -- run-pack fixtures/packs/terminal_bot_lab_full.json --workers 8
```
Run the trajectory-error matrix:
```bash
cargo run -p pd-eval -- run-pack fixtures/packs/terminal_traj_err_suite.json --workers 8
cargo run -p pd-eval -- run-pack fixtures/packs/terminal_traj_err_full.json --workers 8
```
Force a rerun and skip cache reuse if needed:
```bash
cargo run -p pd-eval -- run-pack fixtures/packs/terminal_bot_lab_suite.json --workers 8 --no-reuse
```
Use `terminal_bot_lab_suite` as the primary controller workbench. It is the
smoke-tier Earth `half_arc_terminal_v1` matrix over:
- `condition_set = clean`
- `vehicle_variant = empty`
  - `pylander`-aligned Earth baseline hardware with empty payload
- `vehicle_variant = half`
  - the same hardware with half payload
- `vehicle_variant = full`
  - the same hardware with full payload
- `arc_point x velocity_band`
- `baseline` and `current` controller lanes
The maintained terminal baseline now matches the core `pylander`
vehicle/engine envelope closely enough to reason about directly:
- `8m x 10m` hull
- `7200kg` dry mass
- `6300kg` max fuel
- `240000N` max thrust
- `25%` ignited minimum throttle
- `90 deg/s` max rotation rate
The one intentional simplification is fuel use:
- `pd-lab` does not yet model `pylander` overdrive or the nonlinear burn
penalty above nominal thrust
- fuel burn currently scales linearly between minimum and maximum thrust
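To make the envelope numbers concrete, here is a rough sketch that computes thrust-to-weight from the listed masses and shows what linear fuel burn between minimum and maximum thrust looks like. The `Vehicle` struct and function names are hypothetical, payload mass is not included because the README does not state it, and the two burn-rate endpoints are placeholder parameters rather than project values.

```rust
const G_EARTH: f64 = 9.80665; // standard gravity, m/s^2

/// Envelope values copied from the list above (struct shape is hypothetical).
struct Vehicle {
    dry_mass_kg: f64,
    max_fuel_kg: f64,
    max_thrust_n: f64,
    min_throttle: f64,
}

const BASELINE: Vehicle = Vehicle {
    dry_mass_kg: 7200.0,
    max_fuel_kg: 6300.0,
    max_thrust_n: 240_000.0,
    min_throttle: 0.25,
};

/// Thrust-to-weight at full throttle for a given fuel load (no payload).
fn thrust_to_weight(v: &Vehicle, fuel_kg: f64) -> f64 {
    v.max_thrust_n / ((v.dry_mass_kg + fuel_kg) * G_EARTH)
}

/// Linear fuel burn between minimum and maximum throttle.
/// `burn_at_min` and `burn_at_max` (kg/s) are placeholders supplied by the caller.
fn burn_rate_kgps(v: &Vehicle, throttle: f64, burn_at_min: f64, burn_at_max: f64) -> f64 {
    let t = throttle.clamp(v.min_throttle, 1.0);
    let frac = (t - v.min_throttle) / (1.0 - v.min_throttle);
    burn_at_min + frac * (burn_at_max - burn_at_min)
}

fn main() {
    let dry = thrust_to_weight(&BASELINE, 0.0);
    let wet = thrust_to_weight(&BASELINE, BASELINE.max_fuel_kg);
    println!("TWR with no fuel: {dry:.3}, with full fuel: {wet:.3}");
    println!("min ignited thrust: {} N", BASELINE.min_throttle * BASELINE.max_thrust_n);
    // Halfway between min throttle (0.25) and full throttle is 0.625.
    println!("burn at 0.625 throttle: {} kg/s", burn_rate_kgps(&BASELINE, 0.625, 20.0, 80.0));
}
```

With these masses and no payload, full-throttle thrust-to-weight on Earth is roughly 3.4 with empty tanks and roughly 1.8 with full tanks; added payload lowers both figures.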
At the moment:
- `baseline` means the older heuristic baseline controller
- `current` means `terminal_pdg_v1`, the first serious terminal-only PDG-shaped
native Rust controller lane
Batch reports now prefer cached current-lane history compare when a promoted
clean cache exists. The internal `baseline` lane is still useful as a
reference-controller check, but it is no longer the primary progress signal.
Use `terminal_bot_lab_full` when the same matrix should run with the full
seed tier for spread measurement. The `terminal_compare_*_fixture` packs are
only for smoke-testing pack-vs-pack compare output.
Use `terminal_traj_err_suite` and `terminal_traj_err_full` when the same
Earth/payload matrix should exercise projected miss conditions. These packs use
current-lane-only runs over:
- `traj_undershoot_small`
- `traj_undershoot_large`
- `traj_overshoot_small`
- `traj_overshoot_large`
The clean matrix keeps small seed-level radial/speed jitter. The trajectory
error matrix instead owns the lateral miss as a condition-set perturbation:
undershoot stays short on the approach side, overshoot crosses to the far side,
and the configured small/large projected miss magnitudes are recorded in each
resolved run.
A `run-pack` invocation writes:
- `pack.json`
- `resolved_runs.json`
- `summary.json`
- `report.html`
- optional `compare.json`
- per-run bundles under `outputs/eval//runs/`
Batch output keeps stable semantic run directories for inspection while also
recording stable digests for the resolved pack and resolved run set.
The batch report is intentionally compare-friendly:
- a selector-aware review tree as the main drill-down surface
- explicit report context near the top of the page:
  - standalone
  - lane compare
  - external compare
  - compare basis
  - scope resolution
  - compare status
- a regression-policy panel and overview chip for compare runs
- optional candidate-vs-baseline deltas over shared run IDs
- stable links back to per-run detail reports and bundles
At this point the batch/single-run reporting stack and cache workflow are good
enough for real controller iteration. The evaluator can now:
- reuse and promote batch caches
- prefer cached current-lane history compare by default
- classify analytically impossible terminal runs separately from scored
failures
- annotate low-thrust/high-energy frontier cells without removing them from
scoring
- evaluate a default thresholded regression policy over compare runs, scoped to
the preferred current controller lane when both reports contain one
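A thresholded regression gate over shared run IDs could look roughly like the sketch below. The actual default policy and its thresholds are not described here, so `count_new_failures`, `passes_gate`, and the `max_new_failures` value are purely illustrative assumptions; only the idea of comparing shared run IDs and exiting nonzero on failure comes from the text.

```rust
use std::collections::BTreeMap;
use std::process::ExitCode;

/// Count runs that passed in the baseline but fail in the candidate,
/// considering only run IDs present in both reports.
fn count_new_failures<'a>(
    baseline: &BTreeMap<&'a str, bool>,
    candidate: &BTreeMap<&'a str, bool>,
) -> usize {
    let mut count = 0;
    for (id, &passed_before) in baseline {
        // Runs missing from the candidate are skipped, not counted as failures.
        if passed_before && candidate.get(id) == Some(&false) {
            count += 1;
        }
    }
    count
}

/// Hypothetical gate: allow at most `max_new_failures` regressions.
fn passes_gate(new_failures: usize, max_new_failures: usize) -> bool {
    new_failures <= max_new_failures
}

fn main() -> ExitCode {
    let baseline = BTreeMap::from([("run_a", true), ("run_b", true), ("run_c", false)]);
    let candidate = BTreeMap::from([("run_a", true), ("run_b", false), ("run_d", true)]);
    let regressions = count_new_failures(&baseline, &candidate);
    let max_new_failures = 1; // placeholder threshold, not a project default
    if !passes_gate(regressions, max_new_failures) {
        eprintln!("regression gate failed: {regressions} new failures");
        return ExitCode::FAILURE; // nonzero exit, as with an enforced policy
    }
    println!("regression gate passed: {regressions} new failures");
    ExitCode::SUCCESS
}
```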
Current checkpoint on the maintained Earth payload tiers:
- `terminal_bot_lab_suite`
  - `current`: `168 / 180` scored successes, `12` scored failures,
    `9` impossible warnings, `12` frontier annotations
- `terminal_bot_lab_full`
  - `current`: `676 / 720` scored successes, `44` scored failures,
    `36` impossible warnings, `48` frontier annotations
  - by vehicle tier:
    - `empty`: `252 / 252`
    - `half`: `252 / 252`
    - `full`: `172 / 216` scored, `44` fail, `36` impossible warnings,
      `48` frontier annotations
Trajectory-error checkpoint:
- `terminal_traj_err_suite`
  - `current`: `689 / 720` scored successes, `31` scored failures,
    `36` impossible warnings, `48` frontier annotations
- `terminal_traj_err_full`
  - `current`: `2754 / 2880` scored successes, `126` scored failures,
    `144` impossible warnings, `192` frontier annotations
  - by condition:
    - `traj_undershoot_small`: `693 / 720` scored, `27` fail,
      `36` impossible warnings, `48` frontier annotations
    - `traj_undershoot_large`: `707 / 720` scored, `13` fail,
      `36` impossible warnings, `48` frontier annotations
    - `traj_overshoot_small`: `683 / 720` scored, `37` fail,
      `36` impossible warnings, `48` frontier annotations
    - `traj_overshoot_large`: `671 / 720` scored, `49` fail,
      `36` impossible warnings, `48` frontier annotations
  - by vehicle tier:
    - `empty`: `1008 / 1008`
    - `half`: `1006 / 1008`, `2` fail
    - `full`: `740 / 864` scored, `124` fail, `144` impossible warnings,
      `192` frontier annotations
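As a quick arithmetic cross-check, the per-tier rows for the two full packs sum to the pack-level totals quoted above. The snippet below only re-adds numbers already listed; the constant and function names are illustrative.

```rust
/// (tier, scored successes, scored runs), copied from the checkpoint lists.
const CLEAN_FULL_TIERS: [(&str, u32, u32); 3] =
    [("empty", 252, 252), ("half", 252, 252), ("full", 172, 216)];
const TRAJ_ERR_FULL_TIERS: [(&str, u32, u32); 3] =
    [("empty", 1008, 1008), ("half", 1006, 1008), ("full", 740, 864)];

/// Sum successes and scored runs across tiers.
fn totals(rows: &[(&str, u32, u32)]) -> (u32, u32) {
    rows.iter()
        .fold((0, 0), |(ok_sum, n_sum), &(_, ok, n)| (ok_sum + ok, n_sum + n))
}

/// Success rate as a percentage.
fn rate_pct(ok: u32, n: u32) -> f64 {
    100.0 * f64::from(ok) / f64::from(n)
}

fn main() {
    for (name, rows) in [
        ("terminal_bot_lab_full", &CLEAN_FULL_TIERS),
        ("terminal_traj_err_full", &TRAJ_ERR_FULL_TIERS),
    ] {
        let (ok, n) = totals(rows);
        println!("{name}: {ok} / {n} ({:.1}%)", rate_pct(ok, n));
        for &(tier, t_ok, t_n) in rows.iter() {
            println!("  {tier}: {t_ok} / {t_n} ({:.1}%)", rate_pct(t_ok, t_n));
        }
    }
}
```

The sums come out to `676 / 720` (about 93.9%) for the clean full pack and `2754 / 2880` (about 95.6%) for the trajectory-error full pack, matching the `current` lines.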
So the main next bottleneck is no longer basic controller viability on the
Earth-aligned workbench:
- clean `empty` and `half` are solved
- clean `full` is still the low-thrust/high-energy frontier, and its failed
  cells remain scored
- trajectory-error `empty` is solved
- trajectory-error `half` has sparse high-energy scored failures concentrated
  in `traj_overshoot_large / a60`
- trajectory-error `full` is the main frontier-annotated stress tier
Detailed checkpoint history lives in `docs/progress.md` and
`docs/terminal_suite.md`.
The next useful slice is Phase 2 closure work that uses the regression-policy
gate, refines frontier/feasibility semantics where they still affect
interpretation, and adds the next terminal condition space such as terrain or
obstacles. Broad terminal-controller tuning should now be optional and
hypothesis-gated rather than the default path.