https://github.com/bitomule/mav

Deterministic iOS app validation CLI for AI coding agents.

# MAV

[![CI](https://github.com/bitomule/mav/actions/workflows/ci.yml/badge.svg)](https://github.com/bitomule/mav/actions/workflows/ci.yml)
[![Release](https://img.shields.io/github/v/release/bitomule/mav?display_name=tag)](https://github.com/bitomule/mav/releases)
[![License: MIT](https://img.shields.io/badge/license-MIT-blue.svg)](LICENSE)

The iOS control plane for AI coding agents: one command surface, native drivers underneath, and evidence your agent can hand back to a human.


*Screenshot: an iOS simulator on the left and a terminal on the right running `mav ui tap`, `mav ui tree`, and `mav capture`, with compact agent-readable output.*

Mobile Agent Verifier (`mav`) is the interface between an agent and iOS. The
agent asks for intent-level operations like `ui tree`, `tap`, `pinch`,
`network start`, or `evidence report`; MAV routes each operation to the best
native backend available on that target, records what happened, and returns a
compact result the next turn can act on.

MAV is intentionally not an autonomous testing agent. It runs the command. The agent decides what to run next.

## Why MAV?

MAV gives agents one stable API over the messy iOS toolchain. Agents ask for a
capability; MAV picks the driver for the selected simulator or device and
returns compact output the next turn can parse:

- Accessibility tree, semantic taps, waits, and screenshots go through AXe when
it is healthy.
- Simulator multitouch, system UI, hardware buttons, erase, and hideKeyboard go
through Baguette.
- Physical device install, launch, coordinate input, logs, screenshots, and
crashes go through idb.
- Simulator crash checks read local DiagnosticReports directly.
- Simulator lifecycle, video, and logs go through simctl.
- Simulator network evidence goes through mitmproxy HAR capture.

Runs can record accepted video, named screenshots, accessibility tree snapshots,
log tails, crash reports, command trails, and optional HAR network traffic.
`mav evidence report` writes a verified manifest for those artifacts; the MAV
skill turns the manifest into a visual HTML report for humans.

Native MAV YAML flows compose setup, UI actions, waits, assertions, logs,
crashes, network capture, and report generation without hiding the underlying
command trail.

MAV uses a project-local launch recipe to build, locate, install, and launch
the app. Bazel, Xcode, Tuist, Make, Just, and project scripts are setup-time
templates only; runtime executes the configured recipe.

## How an agent uses MAV


*Diagram: the mav loop. The agent decides the next action, mav executes a deterministic command, the agent reads the compact output, and the loop repeats.*

Each call is one verb. The agent picks the next verb based on the previous
output. The commands that cover most flows are `mav ui tree`, `mav ui tap`,
`mav capture`, and `mav logs`. Use `mav --help` and nested help such as
`mav ui tap --help` or `mav evidence report --help` for the full command
surface.
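One turn of that loop can be sketched from an agent harness written in Python. This is a minimal sketch, not part of MAV: `run_mav` and its `execute` hook are hypothetical names, and the code assumes the compact status line is the first line of stdout, which this README does not state.

```python
import subprocess
from typing import Callable, Optional, Sequence, Tuple


def run_mav(
    args: Sequence[str],
    execute: Optional[Callable[[Sequence[str]], str]] = None,
) -> Tuple[bool, str]:
    """Run one mav verb and return (succeeded, status_line)."""
    if execute is None:
        # Default: call the real CLI. Assumes `mav` is on PATH and that the
        # status line is written to stdout (an assumption, not documented).
        def execute(cmd: Sequence[str]) -> str:
            done = subprocess.run(cmd, capture_output=True, text=True, check=False)
            return done.stdout

    output = execute(["mav", *args])
    lines = output.splitlines()
    status_line = lines[0] if lines else ""
    # Status lines start with "ok" or "fail" (see the output contract).
    return status_line.startswith("ok "), status_line
```

The harness would call `run_mav(["ui", "tree"])`, read the status line, and only then choose the next verb. The `execute` hook exists so the loop can be exercised without a simulator.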

## Used at

`mav` runs in development on these production iOS apps:

- [Undolly](https://undolly.app) — finding duplicate photos
- [Boxy](https://boxy-app.com/) — organising physical items
- [HiddenFace](https://hiddenface.app) — privacy-first face blur

## Status

MAV is early and evolving. The current stable pieces are:

- Configurable project launch recipes.
- Setup-time detection for common project launch commands.
- Simulator selection, boot, install, launch, screenshot, and video.
- Physical device selection, install, launch, logs, screenshots, UI actions,
crashes, and evidence screenshots.
- AXe-first accessibility tree inspection and semantic interactions.
- idb coordinate taps and device/simulator fallback capabilities.
- Baguette-backed multitouch gestures, system UI tree, hardware buttons, and
keyboard helpers on simulator.
- Native MAV YAML flows through `mav run`.
- Verified evidence manifests in `.mav/runs//report.json`; the MAV
skill authors the visual HTML report from that data.
- Filtered unified log capture for explicit MAV probes.

![MAV driver router](assets/router.svg)

## Requirements

- macOS.
- Xcode command line tools.
- Go, for development builds.
- AXe, for accessibility tree and semantic UI actions.
- idb, for coordinate taps and device/simulator fallback operations.
- Baguette, for simulator multitouch (pinch, two-finger pan), the
SpringBoard / system UI tree, hardware buttons, keyboard erase, and
hideKeyboard. Sim-only — device multitouch is intentionally unsupported.
- mitmproxy, optional, for `mav network start|stop` HAR capture on the
simulator. Install with `mav setup --install mitmproxy`.

Check the local environment:

```bash
mav doctor
```

`mav doctor` reports capability availability. MAV routes commands by
capability: accessibility and semantic actions use AXe, coordinate taps and
device fallback use idb, and multitouch and system UI use Baguette on simulator.
Physical iOS devices require idb for install, launch, logs, screenshots, and
crashes. Simulator crash checks read local DiagnosticReports directly, which
avoids idb_companion crash-list parser failures caused by unrelated malformed
reports. Multitouch gestures, system-UI trees, and hideKeyboard return
structured errors on device; use a simulator for those flows.

Configure the project or install supported helper tools:

```bash
mav setup
```

`mav setup` is idempotent and interactive by default. It scaffolds or refreshes
`.mav/config.yaml` by detecting app identity, simulator defaults, UI tools, and
an editable launch recipe, then asks you to accept or replace each value.
Existing explicit choices in `.mav/config.yaml` are preserved. Use
`mav setup --non-interactive` for CI/scripts.

```bash
mav setup --install axe idb baguette
```

`mav setup --install idb` prefers pipx with Python 3.12/3.13 for `fb-idb` and
uses Homebrew for `idb-companion`. AXe and Baguette are installed via Homebrew
(`cameroncooke/axe/axe` and `tddworks/baguette/baguette`).

## Install

With Homebrew:

```bash
brew install bitomule/tap/mav
```

Install the MAV skill globally with Vercel's Skills CLI:

```bash
mav install-skills
```

This runs:

```bash
npx skills add bitomule/mav --skill mav --global --yes
```

Build from source:

```bash
git clone https://github.com/bitomule/mav.git
cd mav
make build
```

Run the development binary:

```bash
.build/mav help
```

Or put it on your `PATH`:

```bash
ln -sf "$PWD/.build/mav" /usr/local/bin/mav
```

Release binaries are built by the GitHub release workflow for tagged releases.
Homebrew packaging lives in `packaging/homebrew/mav.rb` and is published to
`bitomule/tap`.

The release workflow can also update `bitomule/homebrew-tap` automatically. The
`bitomule/mav` repo must define a `COMMITTER_TOKEN` secret with permission to
push to `bitomule/homebrew-tap`; this is the same pattern used by Koubou.

## Quick Start

Run from the root of an iOS app repo:

```bash
mav setup
mav sim list
mav sim select --device "iPhone 17 Pro Max" --ios 26
mav open
mav ui tree
```

`mav setup` scaffolds `.mav/config.yaml`. By default it is interactive: MAV detects a bundle id, selected simulator, locale/language,
available tools, and a launch recipe when it can infer one, then lets you accept
or replace each value. Use `mav setup --non-interactive` for CI/scripts.
Launch recipe detection is intentionally conservative: MAV recognizes explicit
`Makefile`/`justfile` MAV targets, `scripts/mav-build` plus
`scripts/mav-app-path`, and standard Bazel/Tuist/Xcode project shapes.

`mav open` executes the configured launch recipe. It creates a persistent run
directory under `.mav/runs//` and starts `logs.txt` for MAV probes. Use
`mav open --clear-state` to uninstall the configured bundle before install and
launch. If a Bazel app bundle from `bazel-out` fails simulator install with a
permission error, MAV copies the `.app` into the run directory with writable
permissions and retries the install.

Use `mav open --no-relaunch` when the app was launched manually with custom
environment such as `SIMCTL_CHILD_*` and MAV should only attach run logging to
the app already in front.

Example compact output:

```text
ok cmd=setup bundle=com.example.app config=/repo/.mav/config.yaml launch_recipe=ok multitouch=missing multitouch_next="mav setup --install baguette"
ok cmd=open run=7fd logs=/repo/.mav/runs/7fd/logs.txt target="iPhone 17 Pro Max"
ok cmd=ui.tree driver=axe nodes=42 screen=unknown recognized_screen=settings screen_source=recognized
node index=1 id=settings_button label=Settings role=button enabled=true frame="{{20, 120}, {180, 44}}"
```

Use `--raw` only when the underlying tool output is needed:

```bash
mav --raw ui tree
```

## Help

```bash
mav --help
mav ui --help
mav ui tap --help
mav flow lint --help
mav evidence report --help
```

Help is intentionally hierarchical. The README explains the workflow; the CLI
owns the current command reference.

## Output Contract

Default output starts with one compact status line. Commands that inspect
structured state, such as `mav ui tree`, may add bounded detail lines after it:

```text
ok cmd= key=value key=value
fail code= key=value key=value
```

Examples:

```text
ok cmd=capture file=/tmp/mav/7fd/captures/20260503T120000.000.png run=7fd
ok cmd=logs file=/tmp/mav/7fd/logs.txt matches=1 run=7fd
fail code=ui_tree_empty driver=axe reason=simulator_accessibility_unavailable recovered=false
```

The goal is to give agents the minimum useful fields: what happened, where the
artifact is, and what to do next when the command failed.
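On the agent side, a status line in this shape can be split into fields with the standard library. The following is a minimal sketch, assuming values are either bare tokens or double-quoted strings as in the examples above; `parse_status_line` is a hypothetical helper, not something MAV ships.

```python
import shlex
from typing import Dict, Tuple


def parse_status_line(line: str) -> Tuple[str, Dict[str, str]]:
    """Split 'ok cmd=open run=7fd target="iPhone 17 Pro Max"' into parts."""
    # shlex honours the double quotes, so quoted values stay in one token.
    tokens = shlex.split(line)
    if not tokens:
        raise ValueError("empty status line")
    status, fields = tokens[0], {}
    for token in tokens[1:]:
        key, sep, value = token.partition("=")
        if not sep or not key:
            raise ValueError(f"not a key=value field: {token!r}")
        fields[key] = value
    return status, fields
```

For example, `parse_status_line('fail code=ui_tree_empty driver=axe recovered=false')` yields the status `fail` and a dictionary whose `code` entry is `ui_tree_empty`. Values remain strings; the caller decides whether `recovered=false` should become a boolean.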

## Project And Run State

Project state:

```text
.mav/config.yaml
```

Run state:

```text
.mav/runs//logs.txt
.mav/runs//commands.jsonl
.mav/runs//evidence.jsonl
.mav/runs//steps/*.png
.mav/runs//trees/*.json
.mav/runs//video.mov
.mav/runs//crashes/
.mav/runs//report.json
```

`/tmp` may resolve to a macOS per-user temporary directory such as
`/var/folders/.../T`.
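A quick way to see which of these artifacts a run actually produced is to check the run directory. This is a minimal sketch that relies only on the file names listed above; `summarize_run` is a hypothetical helper and the caller supplies the run directory path.

```python
from pathlib import Path
from typing import Dict, Union


def _count(directory: Path, pattern: str) -> int:
    """Count files matching pattern, treating a missing directory as empty."""
    return len(list(directory.glob(pattern))) if directory.is_dir() else 0


def summarize_run(run_dir: Union[str, Path]) -> Dict[str, Union[bool, int]]:
    """Report which run artifacts exist under one run directory."""
    run = Path(run_dir)
    return {
        "logs": (run / "logs.txt").is_file(),
        "commands": (run / "commands.jsonl").is_file(),
        "evidence": (run / "evidence.jsonl").is_file(),
        "steps": _count(run / "steps", "*.png"),
        "trees": _count(run / "trees", "*.json"),
        "video": (run / "video.mov").is_file(),
        "crashes": (run / "crashes").is_dir(),
        "report": (run / "report.json").is_file(),
    }
```

The summary only checks for presence. Whether a video counts as accepted evidence is decided by `mav evidence report`, not by the file existing.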

Prefer target selectors in this order:

1. Accessibility id: `mav ui tap --id home_settings_button`
2. Coordinates: `mav ui tap --x 398 --y 84`
3. Text: `mav ui tap --text Settings`

Coordinates should be used only when the accessibility tree is insufficient and
a screenshot makes the target unambiguous. Text is the last fallback because
labels change with localization and copy edits.
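The selector order above can be encoded in a small helper that builds the tap command. This is a minimal sketch: `tap_command` is a hypothetical function, and it uses only the `--id`, `--x`/`--y`, and `--text` flags shown in the list. The caller is still responsible for passing coordinates only when a screenshot makes the target unambiguous.

```python
from typing import List, Optional, Tuple


def tap_command(
    accessibility_id: Optional[str] = None,
    point: Optional[Tuple[int, int]] = None,
    text: Optional[str] = None,
) -> List[str]:
    """Build a `mav ui tap` argv, preferring id, then coordinates, then text."""
    if accessibility_id:
        return ["mav", "ui", "tap", "--id", accessibility_id]
    if point is not None:
        x, y = point
        return ["mav", "ui", "tap", "--x", str(x), "--y", str(y)]
    if text:
        return ["mav", "ui", "tap", "--text", text]
    raise ValueError("no selector available for tap")
```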

## UI Usage

Start with the accessibility tree:

```bash
mav ui tree
mav ui tree --include-system
```

MAV chooses drivers by capability. AXe is the default fast path for
accessibility tree inspection, semantic taps, typing, swipes, waits, and
assertions. idb is used for coordinate taps and device/simulator fallback
operations. Baguette provides multitouch, system UI, hardware buttons, erase,
and hideKeyboard on simulator.

For `mav ui tree` and semantic `mav ui tap`, `--prefer-driver auto` is the
default. Use `--prefer-driver axe` to debug AXe-only behavior. `mav ui tree
--include-system` asks baguette for the SpringBoard/system tree when a system
process or cross-app surface is in front (PHPicker, App Tracking Transparency,
permission prompts, SpringBoard, iOS 26 service processes). System-tree
inspection is simulator-only.

If `mav ui tap --text X` fails because AXe sees `X` as a value/placeholder but
not as a label, MAV reports `ui_tap_text_no_label_match` with `matched_value`.
Prefer stable accessibility ids when possible.

For exact syntax, ask the command:

```bash
mav ui tap --help
mav ui wait --help
mav ui pinch --help
```

`mav ui erase` and `mav ui hideKeyboard` dispatch through Baguette on
simulator. On a physical device they return `erase_unsupported_on_device` and
`hide_keyboard_unsupported_on_device` respectively. As a workaround on device,
tap the field and retype its contents, or tap outside the input area to dismiss
the keyboard.

The multitouch gestures that Baguette currently exposes (pinch and two-finger
pan) also run only on simulator. On device they return
`gesture_unsupported_on_device` with a remediation hint; use a simulator for
multitouch flows. Rotate and W3C Actions remain reserved flow/CLI surfaces
until MAV adds a reliable Baguette translation for them.

Observation priority:

1. `mav ui tree`
2. `mav capture`
3. Video through `mav evidence start/stop` or flows

Screenshots are for visual layout, custom rendering, media/canvas UI, or
user-facing proof. The accessibility tree is cheaper and more useful for most
agent decisions.

If AXe/idb return a single empty `AXApplication` tree, MAV treats simulator
accessibility as unavailable. It attempts a simulator reboot, app relaunch, and
tree retry before returning `ui_tree_empty`.

## Native MAV Flows

`mav run ` executes a native MAV YAML flow.

Use flows for repeatable feature validation:

```yaml
name: verify_daily_reminder
steps:
- open: { clearState: true } # clear-state is also accepted
- go: { screen: settings }
- wait: { text: Daily Reminder, timeout: 5s }
- evidence.start: { network: true }
- evidence.step: { name: before-toggle, note: Daily Reminder before tap }
- tap: { text: Daily Reminder }
- type: "Search text"
- type: { text: "user@example.com" }
- erase: { focused: true }
- hideKeyboard: {}
- delay: 500ms
- when: { visible: { text: Continue } }
  do:
    - tap: { text: Continue }
- whileNotVisible:
    text: "You"
    timeout: 30s
    do:
      - tap: { id: onboarding_dismiss, optional: true }
      - delay: 500ms
- waitUntil:
    any:
      - text: "Don't Allow"
      - text: "Allow"
      - changedFrom: before-toggle
    timeout: 5s
- evidence.step: { name: after-toggle, note: Result after tapping reminder }
- pinch: { x: 200, y: 450, scale: 0.5, panX: 80, panY: -40, duration: 800ms }
- twoFingerPan: { x: 200, y: 450, panX: 80, panY: -40, duration: 800ms }
- logs: { key: SettingsReached }
- crashes: {}
- evidence.stop: {}
- report: {}
```

Semantic flow steps inherit the process-level `--prefer-driver auto|axe`
setting from `mav run`. A step can override it with `prefer-driver` when one
interaction needs a specific backend:

```yaml
- tap: { text: "Deporte y ocio", prefer-driver: axe }
- wait: { text: "Continuar", prefer-driver: axe, timeout: 5s }
```

This applies to `tree`, `tap`, `swipe`, `wait`, `assert`, `waitUntil`, and
`scrollUntil`.

Supported step types:

```text
open
go
tree
tap
type
erase
hideKeyboard
swipe
pinch
twoFingerPan
wait
waitUntil
when
whileNotVisible
include
assert
capture
scrollUntil
delay
sleep
logs
exec
crashes
network.start
network.stop
network.status
evidence.start
evidence.step
evidence.stop
video.start
video.stop
report
```

`hideKeyboard` dispatches through baguette on simulator. On device it returns
`hide_keyboard_unsupported_on_device`.

`type`, `delay`, and `sleep` accept both scalar and object forms. These are
equivalent:

```yaml
- type: "Search text"
- type: { text: "Search text" }
- delay: 500ms
- delay: { duration: 500ms }
- sleep: 500ms
- sleep: { duration: 500ms }
```

On failure, MAV stops run-owned processes, tries to capture failure evidence,
writes report data, and returns a compact failure line.

Use `wait` for a single `id`, `text`, or `value`. Use `waitUntil` with `any`
when more than one result is acceptable, and use `changedFrom` after a named
evidence step when the UI change is visual rather than semantic.

Use `when` for optional UI. MAV evaluates the condition once; if it is visible,
it runs the `do` block, otherwise it skips the block without failing. `do`
blocks are for UI/evidence steps and cannot contain `open` or `exec`:

```yaml
- when: { visible: { id: ToggleX } }
  do:
    - tap: { id: ToggleX }
```

Use `whileNotVisible` for chained onboarding or permission surfaces. MAV repeats
the `do` block until the target `id`, `text`, `value`, or `any` condition is
visible, or until `timeout` expires:

```yaml
- whileNotVisible:
    text: "You"
    timeout: 30s
    do:
      - tap: { id: dismiss_button, optional: true }
      - delay: 500ms
```

Use `include` to compose reusable sub-flows. The included file path is resolved
relative to the file that declares it, and `env` values are available to the
included flow as `${env.NAME}`. The `file` field may also reference values from
the same `env` block:

```yaml
- include:
    file: "components/auth/${env.USER}.mav.yaml"
    env:
      USER: sellersXp
      FRESH_INSTALL: true
```
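The path rule described above (substitute `${env.NAME}` in `file`, then resolve relative to the declaring file) can be illustrated as follows. This is a sketch of the documented behavior, not MAV's implementation; `resolve_include` is a hypothetical name and the accepted characters for env names are an assumption.

```python
import re
from pathlib import Path
from typing import Mapping

# Assumed shape of an env reference; MAV's real grammar may differ.
_ENV_REF = re.compile(r"\$\{env\.([A-Za-z_][A-Za-z0-9_]*)\}")


def resolve_include(declaring_file: str, file_template: str,
                    env: Mapping[str, object]) -> Path:
    """Expand ${env.NAME} in the template, then resolve it next to the
    flow file that declares the include."""
    def substitute(match: "re.Match[str]") -> str:
        name = match.group(1)
        if name not in env:
            raise KeyError(f"include references undefined env value: {name}")
        return str(env[name])

    relative = _ENV_REF.sub(substitute, file_template)
    return Path(declaring_file).parent / relative
```

With a declaring file at `flows/checkout.mav.yaml` (a made-up path) and the `env` block above, the include would point at `flows/components/auth/sellersXp.mav.yaml`.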

## Evidence

Evidence is explicit. Use it when a user needs proof of verification.

For feature behavior, use a flow with named evidence points:

```yaml
- open: {}
- tap: { id: HomeView.settingsButton }
- wait: { id: daily_reminder_button, timeout: 5s }
- video.start: {}
- evidence.step: { name: before-toggle, note: Before tapping Daily Reminder }
- tap: { id: daily_reminder_button }
- waitUntil:
    any:
      - id: notification_permission_alert
      - changedFrom: before-toggle
    timeout: 5s
- evidence.step: { name: after-toggle, note: After tapping Daily Reminder }
- video.stop: {}
- report: {}
```

Start recording as late as possible: navigate and wait for the state first when
navigation is setup, then record the behavior under test. Screenshots should
prove the behavior itself, not only that the app opened. The supported video
recording flow steps are `video.start` and `video.stop`; `evidence.start` and
`evidence.stop` remain supported aliases. Add `network: true` to
`evidence.start` when the proof window should also capture a simulator HAR via
mitmproxy:

```yaml
- evidence.start: { network: true }
- tap: { id: refresh_button }
- wait: { id: loaded_state, timeout: 10s }
- evidence.stop: {}
- report: {}
```

Flows can also control network capture explicitly:

```yaml
- network.start: {}
- tap: { id: refresh_button }
- network.status: {}
- network.stop: {}
```

`mav evidence report` writes `.mav/runs//report.json` for project runs
and prints
`video=` only when a valid video exists. It prints `video=missing` when
the run has no recording, and `video=invalid` with `video_issue=...` when the
file exists but is not acceptable evidence. When `network.har` exists, the
manifest includes request, response, status, and domain counts so the HTML
report can prove which network traffic happened inside the evidence window. A
report without an accepted video does not prove video evidence was captured.
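An agent reading the report command's status line can map the `video` field to one of the three states described above. This is a minimal sketch operating on already-parsed fields; `video_evidence` is a hypothetical helper, and treating an absent `video` key as missing is an assumption.

```python
from typing import Mapping, Tuple


def video_evidence(fields: Mapping[str, str]) -> Tuple[str, str]:
    """Return (state, detail) where state is accepted, missing, or invalid."""
    value = fields.get("video", "missing")  # absent key treated as missing
    if value == "missing":
        return ("missing", "")
    if value == "invalid":
        # video_issue explains why the file is not acceptable evidence.
        return ("invalid", fields.get("video_issue", "unknown"))
    # Any other value is taken to be the accepted video reference.
    return ("accepted", value)
```

Only the `accepted` state should be presented to a human as video proof; the other two states mean the report exists without it.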

The CLI owns the evidence data. The MAV skill owns the visual HTML report: it
reads the manifest, uses `skills/mav/templates/evidence-report.html` as a
reference, and writes a self-contained `.mav/runs//report.html`
tailored to the run. MAV does not open HTML automatically; inspect the reported
HTML file after the skill writes it.

## Logs

`mav open` and `mav run` capture a filtered unified log stream into `logs.txt`.
The predicate includes the configured MAV probe subsystem/category, `MAV_LOG`
messages, the app process when `process_name` is configured, and the app bundle
subsystem when `bundle_id` is configured.

Use `OSLog.Logger` probes to prove code execution:

```swift
import OSLog

private let mavLog = Logger(
    subsystem: "mav.com.example.app",
    category: "probe"
)

mavLog.notice("MAV_LOG key=SettingsReached")
```

Then read logs from the current run:

```bash
mav logs --key SettingsReached
mav logs --contains SettingsReached
mav --raw logs --key SettingsReached
```

Prefer `OSLog.Logger` for validation probes. `NSLog` from the configured app
process is also captured when `process_name` is set.
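The same check can be done offline against a saved `logs.txt`. This is a minimal sketch that matches the `MAV_LOG key=...` message format used in the probe above; `probe_lines` is a hypothetical helper, and how `mav logs --key` matches internally is not documented here.

```python
import re
from typing import List


def probe_lines(log_text: str, key: str) -> List[str]:
    """Return log lines containing the probe 'MAV_LOG key=<key>'."""
    # (?!\S) stops SettingsReached from matching SettingsReachedTwice.
    pattern = re.compile(r"MAV_LOG key=" + re.escape(key) + r"(?!\S)")
    return [line for line in log_text.splitlines() if pattern.search(line)]
```

An empty result means the probe did not fire inside the captured window, which is the same signal as a `mav logs --key` call with no matches.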

For trusted project-local shell assertions, opt in through `.mav/config.yaml`:

```yaml
allow_shell: true
```

Then use an `exec` step:

```yaml
- exec: { cmd: "grep -F 'MAV_LOG key=SettingsReached' $MAV_LOGS", contains: SettingsReached, timeout: 5s }
```

`exec` runs in the project root with `MAV_ROOT`, `MAV_RUN_ID`, `MAV_RUN_DIR`,
and `MAV_LOGS` set. This is an opt-in guard for trusted project checks, not a
security sandbox for untrusted commands.

Use `out` to bind trimmed stdout for later steps. The binding name must use
letters, numbers, `_`, or `-`, and cannot start with a number or `-`. JSON
stdout exposes nested fields; plain text stdout is available as the binding
itself:

```yaml
- exec:
    cmd: "node utils/get_test_user.js sellersXp"
    out: credentials
    timeout: 10s
- tap: { id: EmailField }
- type: "${exec.credentials.email}"
```
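The binding rules can be illustrated with a small model: validate the name, trim stdout, expose JSON object fields, and fall back to plain text. This is a sketch of the documented behavior under stated assumptions, not MAV's implementation. `bind_stdout` and `expand` are hypothetical names, and only JSON objects are treated as structured here.

```python
import json
import re
from typing import Any, Dict

# Letters, numbers, "_" or "-"; must not start with a number or "-".
_BINDING_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_-]*")
_EXEC_REF = re.compile(
    r"\$\{exec\.([A-Za-z_][A-Za-z0-9_-]*)((?:\.[A-Za-z0-9_]+)*)\}"
)


def bind_stdout(bindings: Dict[str, Any], name: str, stdout: str) -> None:
    """Store trimmed stdout under name; JSON objects keep their fields."""
    if not _BINDING_NAME.fullmatch(name):
        raise ValueError(f"invalid binding name: {name!r}")
    trimmed = stdout.strip()
    try:
        parsed = json.loads(trimmed)
    except json.JSONDecodeError:
        parsed = None
    bindings[name] = parsed if isinstance(parsed, dict) else trimmed


def expand(template: str, bindings: Dict[str, Any]) -> str:
    """Replace ${exec.name} and ${exec.name.field} references."""
    def substitute(match: "re.Match[str]") -> str:
        value = bindings[match.group(1)]
        for part in filter(None, match.group(2).split(".")):
            value = value[part]  # walk into nested JSON fields
        return str(value)

    return _EXEC_REF.sub(substitute, template)
```

If the script printed `{"email": "user@example.com"}`, then `${exec.credentials.email}` would expand to `user@example.com`, matching the `type` step above.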

## Simulators

```bash
mav sim list
mav sim select --device "iPhone 17 Pro Max" --ios 26 --locale es_ES --language es
mav sim select --udid
mav sim boot
```

You can also pass simulator selection flags to `mav open`:

```bash
mav open --device "iPhone 17 Pro Max" --ios 26 --locale es_ES --language es
```

## Physical Devices

List and select connected iOS devices:

```bash
mav device list
mav device select --udid
mav device select --name "David iPhone"
```

`mav device select` switches the active target to `target_kind: device` in
`.mav/config.yaml`. `mav sim select` switches it back to `target_kind:
simulator`. For physical devices, MAV uses idb for install, launch, log
capture, screenshots, and crash listing:

```yaml
launch:
  mode: custom
  commands:
    build: ./scripts/mav-build-device.sh
    app_path: ./scripts/mav-app-path-device.sh
    install: idb install --udid "$MAV_UDID" "$MAV_APP_PATH"
    launch: idb launch --udid "$MAV_UDID" -f "$MAV_BUNDLE_ID"
```

The generated simulator install/launch recipe is automatically mapped to idb
when the active target is a physical device. Video recording is simulator-only
in this release; use `capture` / `evidence.step` screenshots for device
evidence.

## Launch Recipes

MAV does not own the build system. Configure project commands in
`.mav/config.yaml`:

```yaml
app:
  bundle_id: com.example.app
  process_name: Example

launch:
  mode: custom
  commands:
    build: ./scripts/mav-build.sh
    app_path: ./scripts/mav-app-path.sh
    install: xcrun simctl install "$MAV_UDID" "$MAV_APP_PATH"
    launch: xcrun simctl launch "$MAV_UDID" "$MAV_BUNDLE_ID"
```

Each command runs from `MAV_ROOT` with stable environment variables:
`MAV_ROOT`, `MAV_RUN_DIR`, `MAV_TARGET_KIND`, `MAV_IS_DEVICE`, `MAV_UDID`,
`MAV_BUNDLE_ID`, `MAV_APP_PATH`, `MAV_DEVICE_NAME`, `MAV_RUNTIME`, and
`MAV_PLATFORM`. `app_path` must print one `.app` path. If the app is already
installed, configure only `launch`.
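Because `app_path` must print exactly one `.app` path, a locator script should fail loudly when the build output is ambiguous. The following is a minimal sketch of such a locator in Python; `find_app` is a hypothetical helper, and skipping bundles nested inside another `.app` is a design choice of this example rather than a MAV rule.

```python
from pathlib import Path


def find_app(build_dir: Path) -> Path:
    """Return the single top-level .app bundle under build_dir."""
    bundles = [p for p in build_dir.rglob("*.app") if p.is_dir()]
    # Ignore bundles embedded inside another .app (for example extensions).
    top_level = [
        p for p in bundles
        if not any(parent.suffix == ".app"
                   for parent in p.relative_to(build_dir).parents)
    ]
    if len(top_level) != 1:
        raise ValueError(
            f"expected exactly one .app under {build_dir}, found {len(top_level)}"
        )
    return top_level[0]
```

A script wired into `app_path` would print the result of `find_app` for the project's build output directory, located relative to `MAV_ROOT`, and exit non-zero on the `ValueError`.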

`mav open --clear-state` runs `xcrun simctl uninstall "$MAV_UDID"
"$MAV_BUNDLE_ID" || true` before the launch recipe. When the configured install
step fails with a permission error for a `bazel-out` `.app`, MAV retries with a
writable copy at `/tmp/mav//app.tmp/.app`.

## Cleanup

Ad-hoc `mav open` sessions keep log capture running for the current run. Stop
them when done:

```bash
mav stop
```

`mav run` stops run-owned streams automatically.

## Troubleshooting

`fail code=config_not_found`

Run:

```bash
mav setup
```

`fail code=ui_tap_failed` after a screen transition

The target element is not in the current AX tree. Inspect what mav sees:

```bash
mav open
mav ui tree --include-system
```

Then refine the selector based on what shows up. Prefer accessibility ids over
text.

`fail code=ui_tree_empty`

The simulator accessibility service did not recover after MAV retried. Re-run
`mav open` or select another simulator with `mav sim select`.

`CoreSimulator` or `idb` permission failures

MAV needs direct simulator/device access for launch, accessibility, coordinate
taps, screenshots, video, and multitouch. If output says to rerun outside the
sandbox, do that instead of retrying the same command in the sandbox.

`mav logs --key ...` returns no matches

Make sure the app logs with `OSLog.Logger` using the configured MAV subsystem
and category, and make sure the behavior happened after MAV started the run.

## Development

```bash
make test
make build
make check
```

`make check` runs `gofmt`, tests, and a local build.

## Contributing

Issues and pull requests are welcome. Keep changes deterministic and preserve
compact output: commands should report the minimum information an agent needs to
continue, parse, or present evidence.

See [CONTRIBUTING.md](CONTRIBUTING.md).

## License

MIT. See [LICENSE](LICENSE).