{"id":51122508,"url":"https://github.com/aflplusplus/fuzz-reachability","last_synced_at":"2026-06-25T04:00:59.304Z","repository":{"id":366325937,"uuid":"1274400559","full_name":"AFLplusplus/fuzz-reachability","owner":"AFLplusplus","description":"Function reachability analysis for harnesses für C/C++/Rust","archived":false,"fork":false,"pushed_at":"2026-06-21T08:57:54.000Z","size":153,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-06-21T10:21:11.827Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"agpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/AFLplusplus.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-06-19T13:25:57.000Z","updated_at":"2026-06-21T08:57:58.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/AFLplusplus/fuzz-reachability","commit_stats":null,"previous_names":["aflplusplus/fuzz-reachability"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/AFLplusplus/fuzz-reachability","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AFLplusplus%2Ffuzz-reachability","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AFLplusplus%2Ffuzz-reachability/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AFLplusplus%2Ffuzz-reachability/releases","manifests_url":"https
://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AFLplusplus%2Ffuzz-reachability/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/AFLplusplus","download_url":"https://codeload.github.com/AFLplusplus/fuzz-reachability/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AFLplusplus%2Ffuzz-reachability/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34758776,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-25T02:00:05.521Z","response_time":101,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-06-25T04:00:57.870Z","updated_at":"2026-06-25T04:00:59.280Z","avatar_url":"https://github.com/AFLplusplus.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Static Fuzz-Reachability Analyzer (C / C++ / Rust)\n\nGiven a project's build, this tool computes the set of functions a fuzz entry\npoint (`LLVMFuzzerTestOneInput`, a Rust cargo-fuzz target, or any entry you name)\ncan **statically reach**. It works uniformly across C, C++, and Rust — including\nmixed-language projects — by analyzing merged LLVM bitcode.\n\nThe result is a **sound-leaning over-approximation**: it answers *which functions\ncan be reached*, not which ones ran. 
No function that is actually reachable is\never reported unreachable. Over-reporting is expected and safe; under-reporting\nis a bug.\n\n**Deep dives:**\n- Worked examples, step by step — a generic `LLVMFuzzerTestOneInput` harness for AFL++/libfuzzer (libxml2), a ziggy harness (the `url` crate), and cargo-afl harnesses (cpp_demangle and rustyknife) — [`docs/EXAMPLES.md`](docs/EXAMPLES.md)\n- LLVM version support — [`docs/llvm-support.md`](docs/llvm-support.md)\n\nAuthor: Marc \"vanHauser\" Heuse\n\nLicense: GNU AGPL v3 or newer\n\n\n## How to use in a fuzzing campaign\n\n```mermaid\nflowchart TD\n    A[create fuzzing harness] --\u003e B[instrument target]\n    B --\u003e C[run fuzz campaign]\n    C --\u003e D[coverage analysis]\n    D --\u003e A\n\n    E[fuzz-reachability] --\u003e B\n    E --\u003e D\n\n    classDef blue fill:#cce5ff,stroke:#0056b3,color:#003366\n    classDef green fill:#d4edda,stroke:#1e7e34,color:#0f4019\n\n    class A,B,C,D blue\n    class E green\n\n    linkStyle 0,1,2,3 stroke:#0056b3,stroke-width:2px\n    linkStyle 4,5 stroke:#1e7e34,stroke-width:2px\n```\n### Instrument fuzzing target\n\nUse the reachability information to only instrument what is reachable.\n**Note that this is pointless in full link time optimization targets** (C/C++: afl-clang-lto, Rust: default) because this is already done by the compiler.\n\nFor AFL++ (AFL++ native LLVM plugins):\n```\nexport AFL_LLVM_ALLOWLIST=$(pwd)/reached.txt\nmake/meson/ninja/cmake/...\n```\n\nFor sancov (AFL++ sancov, libfuzzer, honggfuzz, etc.):\n```\nexport CFLAGS=\"-fsanitize-coverage-allowlist=$(pwd)/reached.txt\"\nexport CXXFLAGS=\"-fsanitize-coverage-allowlist=$(pwd)/reached.txt\"\nmake/meson/ninja/cmake/...\n```\n\n### Coverage analysis\n\nUse with [cov-analysis](https://github.com/AFLplusplus/cov-analysis) to see:\n- what is reachable but not reached yet by the fuzzing corpus\n- what is unreachable by the harness but should be fuzzed\n\n```\ncov-analysis report -d ../target-afl/out -e 
./harness-cov -T 4 --reachability reachability.json\n```\n\n\n## How it works\n\n```\n driver (Python)              analyzer (C++ / LLVM)\n ───────────────              ─────────────────────\n   acquire bitcode ─┐\n   C/C++ : gllvm    │   llvm-link    load .bc → build call graph →\n   Rust  : rustc    ├─► merge .bc ─► resolve indirect calls → BFS from\n   --emit=llvm-bc  ─┘                entry → JSON report + sancov lists\n```\n\nTwo components, joined by merged bitcode:\n\n- **Driver** (Python) — acquires bitcode per language, merges it with\n  `llvm-link`, verifies the LLVM toolchain is version-coherent, and runs the\n  analyzer.\n- **Analyzer** (C++ linking LLVM) — loads the merged `.bc`, builds the call\n  graph, resolves indirect calls (C function pointers, C++ virtual dispatch, Rust\n  `dyn`/`fn` pointers), treats function pointers that escape to code outside the\n  bitcode (handed to an external/indirect call or returned — e.g. qsort/bsearch\n  comparators, atexit/pthread/`std::call_once` callbacks) as reachable, computes\n  reachability from the entry, and emits a JSON report plus the two sancov lists.\n  It demangles C++ (Itanium) and Rust names.\n\n## Prerequisites\n\n- **LLVM ≥ 21.** One coherent toolchain: `clang`, `clang++`, `llvm-link`, `opt`,\n  and the analyzer all share one major **M ≥ 21**, and rustc's LLVM is no newer\n  than M. See [`docs/llvm-support.md`](docs/llvm-support.md).\n  **Note:** especially for Rust users, we recommend installing LLVM via\n  https://apt.llvm.org/llvm.sh rather than from distribution packages, which are likely to be outdated.\n- **Go** (to install `gllvm`), **Python ≥ 3.12**, and a **C++17** compiler. 
Rust\n  targets also need **rustc / cargo** (nightly, but one using LLVM 21 or prior).\n- [AFL++](https://github.com/AFLplusplus/AFLplusplus) compiled from commit 01a83a3d7098e605f0c7fd69381fcf4fc97144fe onwards (24 June 2026)\n- [cov-analysis](https://github.com/AFLplusplus/cov-analysis) from commit 72c239038430477181df99f7a2cd0a556f2701dd onwards (23 June 2026)\n\n## Install\n\nThe analyzer builds with a plain Makefile driven by `llvm-config` — no CMake.\n\n```bash\nbash scripts/setup.sh        # gllvm + rust-src, create .venv, build the analyzer\n```\n\nOr piecemeal:\n\n```bash\nmake venv                    # create .venv (driver, editable + pytest)\nmake build                   # build the analyzer on the auto-selected LLVM (≥ 21)\nmake build LLVM_MAJOR=23     # ...or pin a specific major\nmake test                    # run the full test suite\nmake matrix                  # build + test against every installed LLVM ≥ 21\nmake help                    # list all targets\n```\n\nTo run the CLI, point it at the built analyzer and put `gllvm` on `PATH`:\n\n```bash\nexport REACHABILITY_ANALYZER=$PWD/analyzer/build/reachability-analyzer\nexport PATH=\"$(go env GOPATH)/bin:$PATH\"     # gclang / gclang++ / get-bc\nsource .venv/bin/activate                    # or call .venv/bin/reachability directly\nreachability check-toolchain                 # verify LLVM version coherence first\n```\n\n## Quick start\n\n```bash\nreachability run --lang \u003ctarget\u003e --project \u003cdir\u003e [--out \u003cfile\u003e]\n```\n\n`--out` is optional; it defaults to `reachability.json` in the `--project`\ndirectory. If `--out` points at an existing directory, the report is written\nto `reachability.json` inside it.\n\n`\u003ctarget\u003e` is a source language (`c`, `cpp`, `rust`, `mixed`) or a Rust fuzz\nharness (`libfuzzer`, `ziggy`, `afl`). Each sets a default entry point, so the\ncommon case needs no `--entry`. 
The build command and the artifact are\nauto-detected for C/C++; override them with `--build-cmd` / `--artifact` when\nneeded.\n\nFull options: [Command-line reference](#command-line-reference).\n\n## Examples\n\nRead about real-world target examples in [docs/EXAMPLES.md](docs/EXAMPLES.md)\n\n### A C target\n\n`fixtures/c_direct` is a small C fuzz target. Its build and artifact are\nauto-detected:\n\n```bash\nreachability run --lang c --project fixtures/c_direct --out c.json -v\n```\n\n```\nreachable 3 / defined 4  (0 indirect-only, 1 unreachable)  [backend=type-based]\n```\n\n`LLVMFuzzerTestOneInput → used_a → used_b` are reachable; `dead_fn` lands in\n`unreachable_defined`.\n\n### A C++ target (CMake)\n\n`examples/cpp_cmake` uses virtual dispatch. The driver detects the CMake build,\nwraps it with `gllvm`, and analyzes the resulting executable:\n\n```bash\nreachability run --lang cpp --project examples/cpp_cmake --out cpp.json -v\n```\n\nThe virtual call `Codec::decode` over-approximates to **both** overrides, reached\nvia indirect edges:\n\n```\n  Raw::decode(unsigned char const*, unsigned long) | via indirect\n  Xor::decode(unsigned char const*, unsigned long) | via indirect\n```\n\n### A Rust target\n\n`fixtures/rust_dyn` is a Rust `staticlib` whose `LLVMFuzzerTestOneInput`\ndispatches through a `dyn Trait`. The driver builds it with\n`RUSTFLAGS=\"--emit=llvm-bc …\"` and collects the per-crate bitcode:\n\n```bash\nreachability run --lang rust --project fixtures/rust_dyn --out rust.json -v\n```\n\nThe trait-object call resolves to both implementations, via indirect edges:\n\n```\n  \u003crust_dyn::Inc as rust_dyn::Op\u003e::run | via indirect\n  \u003crust_dyn::Dbl as rust_dyn::Op\u003e::run | via indirect\n```\n\n### A mixed C + Rust target (cargo-fuzz shape)\n\n`fixtures/mixed_c_rust` has C++ glue calling an `extern \"C\"` Rust entry. 
Use\n`--lang mixed`; the driver builds and merges both sides' bitcode (gllvm for the\nglue, cargo for Rust), and the cross-language edge resolves by C-ABI symbol name:\n\n```bash\nreachability run --lang mixed --project fixtures/mixed_c_rust \\\n  --artifact glue.o --out mixed.json -v\n```\n\nPoint `--artifact` at the C/C++ object so it is picked out from the Rust build\noutputs.\n\n### A target that links a static library\n\nA tool linked against a static library (say `tools/thumbnail` linking\n`libtiff.a`) embeds only the archive members the linker actually pulled in. To\nanalyze the **whole library** — not just the slice the linker kept — point\n`--artifact` at the linked binary and keep the default `--static-libs auto`:\n\n```bash\nreachability run --lang c --project tiff-4.0.4 --artifact tools/thumbnail \\\n  --out tiff.json -v\n```\n\nThe driver merges `thumbnail`'s own objects with the full contents of\n`libtiff.a`. Functions in members the linker discarded (e.g. `TIFFReadRGBAImage`,\n`TIFFPrintDirectory`) now show up as unreachable instead of vanishing, while the\nreachable set is unchanged from the linker's view — adding the rest of the\nlibrary can only add *unreachable* functions, never remove reachable ones. Use\n`--static-libs none` for the linker's view only, or `all` to include every\nbitcode archive in the tree.\n\n### A ziggy harness\n\nA [ziggy](https://github.com/srlabs/ziggy) harness is a Rust binary whose fuzz\nloop lives in `main` rather than in `LLVMFuzzerTestOneInput`. 
`--lang ziggy`\nbuilds it with its own driver (`cargo ziggy build --no-honggfuzz`) and roots at\n`main` automatically:\n\n```bash\nreachability run --lang ziggy --project \u003charness\u003e --out z.json\n```\n\nBuilding through the fuzzer's own command (likewise `cargo afl build` for\n`--lang afl`, `cargo fuzz build` for `--lang libfuzzer`) is deliberate: it sets\nthe same `cfg(fuzzing)`, optimization level, and instrumentation as the binary\nyou actually fuzz, so the reachable set matches it. Override the command with\n`--build-cmd` (e.g. to pick a profile, sanitizer, or single target).\n\n\u003e For complete, start-to-finish walkthroughs on real targets — ziggy (the `url`\n\u003e crate), cargo-afl (cpp_demangle and rustyknife), and libFuzzer (libxml2)\n\u003e harnesses — see [`docs/EXAMPLES.md`](docs/EXAMPLES.md).\n\n## Command-line reference\n\nThe `reachability` CLI has two subcommands.\n\n### `reachability check-toolchain`\n\nResolves and validates the LLVM toolchain (analyzer, `clang`/`clang++`,\n`llvm-link`, `opt`, `rustc`) for version coherence and prints what it found. Run it\nfirst; it exits non-zero on any incoherence. See\n[`docs/llvm-support.md`](docs/llvm-support.md) for the policy.\n\n### `reachability run`\n\nBuilds a project, merges its bitcode, and computes the reachable set from the\nentry point(s).\n\n| Option | Default | Meaning |\n|--------|---------|---------|\n| `--project DIR` | *(required)* | Project directory to build and analyze. |\n| `--lang TARGET` | *(required)* | Target type (see the table below): sets how bitcode is acquired and the default entry. |\n| `--out FILE` | `reachability.json` in `--project` | Path for the JSON report. A directory writes `reachability.json` into it. The two sancov lists default to `reached.txt` / `not_reached.txt` beside it. |\n| `--entry NAME` | per `--lang` | Entry to root reachability at. **Repeatable**; overrides the target default. See [Entry resolution](#entry-resolution). 
|\n| `--backend NAME` | *(none)* | Deprecated and ignored; the type-based backend is always used. Accepted for backward compatibility — passing it prints a warning. |\n| `--artifact PATH` | auto-detect | C/C++ only: the built binary/object/archive to extract bitcode from (relative to `--project`). Auto-detected otherwise, preferring an executable over a shared library, archive, then object. |\n| `--build-cmd CMD` | auto-detect | Shell build command. C/C++: run with `gllvm` injected, auto-detected otherwise (`configure` → `Makefile` → `CMakeLists.txt` → `build.ninja` → `meson.build`, else `make`); e.g. `\"cmake -S . -B build \u0026\u0026 cmake --build build\"`. `libfuzzer`/`ziggy`/`afl`: overrides the native build command (default `cargo fuzz build` / `cargo ziggy build --no-honggfuzz` / `cargo afl build`). |\n| `--static-libs {auto,none,all}` | `auto` | C/C++ only: how to treat static archives (`.a`) the target links. `auto` also analyzes each linked archive in full, so members the linker dropped are reported rather than silently absent. `none` keeps only the linker's view. `all` includes every bitcode archive in the tree. Exact archive manifests prevent linked objects from being dropped; unresolved duplicate definitions fail the merge instead of silently replacing one body. |\n| `--profile {debug,release}` | tool default | Build profile. `libfuzzer`/`ziggy`/`afl`: `release` adds `--release` to the native command (else the tool's default). Plain `--lang rust`: the cargo profile (default `debug`). See [Matching the fuzz binary's build](#matching-the-fuzz-binarys-build). |\n| `--codegen-units N` | auto | Plain `--lang rust` only (positive integer): rustc `-Ccodegen-units`, auto-detected from `Cargo.toml` else cargo's per-profile default. Ignored for `libfuzzer`/`ziggy`/`afl` (their build sets it). See [Matching the fuzz binary's build](#matching-the-fuzz-binarys-build). 
|\n| `--build-std` | off | Rust only: build the standard library from source with Cargo's `-Zbuild-std` option and rustc's detected host target, so std functions appear in the graph instead of as external declarations. |\n| `--dot FILE` | *(none)* | Also write the reachable subgraph as Graphviz DOT (indirect edges dashed/red). |\n| `--reached FILE` | beside `--out` | Path for the sancov **allowlist** of reachable functions. |\n| `--not-reached FILE` | beside `--out` | Path for the sancov **ignorelist** of unreachable functions. |\n| `-v`, `--verbose` | off | Narrate each pipeline stage (toolchain → build → merge → analyze): echoes the tool commands run, streams the build output live, and lists the collected bitcode modules. |\n\n#### Target types (`--lang`)\n\n| `--lang` | acquires via | default entry |\n|----------|--------------|---------------|\n| `c` | gllvm (`gclang`) | `main` + `LLVMFuzzerTestOneInput` |\n| `cpp` | gllvm (`gclang++`) | `main` + `LLVMFuzzerTestOneInput` |\n| `rust` | `cargo build` + `--emit=llvm-bc` | `main` |\n| `mixed` | gllvm **and** cargo (merged) | `LLVMFuzzerTestOneInput` |\n| `libfuzzer` | `cargo fuzz build` | `fuzz_target!` |\n| `ziggy` | `cargo ziggy build --no-honggfuzz` | `main` |\n| `afl` | `cargo afl build` | `main` |\n\n`libfuzzer`/`ziggy`/`afl` build through the fuzzer's own driver (a `RUSTC_WRAPPER`\nadds `--emit=llvm-bc`), so the bitcode carries the real `cfg(fuzzing)`, opt level,\nand instrumentation; `--build-cmd` overrides the command. `rust`/`mixed` use a\nplain `cargo build`, tunable with `--profile` / `--codegen-units`.\n\nThe C/C++ targets root at both `main` and `LLVMFuzzerTestOneInput`, so one\n`--lang c`/`cpp` covers a normal program and an `LLVMFuzzerTestOneInput` harness alike. A default\nentry that matches nothing only produces a harmless warning, because roots are unioned.\n\n#### Entry resolution\n\n`--entry` never requires a mangled symbol. A token matches — unioned across all\nof:\n\n- an exact mangled symbol (e.g. 
`_Z3foov`),\n- an exact demangled name (e.g. `foo()`),\n- a demangled `::name` suffix (so `main` finds `crate::main`), and\n- the alias `fuzz_target!` (→ `LLVMFuzzerTestOneInput` + `rust_fuzzer_test_input`).\n\nMatching more than one function only adds roots, which stays sound. For a Rust\nbinary, just root at `main`: the token matches the real Rust `main`, so you never\nneed to type a mangled symbol.\n\n#### Matching the fuzz binary's build\n\nThe reachable set must be computed from a build that matches the binary you\ninstrument: the optimization level, `cfg(fuzzing)`, feature flags, and\n`debug_assertions`/`overflow_checks` all change *which* functions are compiled,\nwhich wildcard `fun:` patterns cannot recover.\n\n**`libfuzzer`/`ziggy`/`afl` handle this automatically** by building through the\nfuzzer's own driver (`cargo fuzz build` / `cargo ziggy build --no-honggfuzz` /\n`cargo afl build`) with a `RUSTC_WRAPPER` that adds `--emit=llvm-bc`. So the\nbitcode already carries the harness's real cfgs and flags. `--profile release`\nadds `--release`; `--build-cmd` overrides the command for anything finer (a\nspecific target, sanitizer, or profile). `--codegen-units` does not apply (the\ntool sets it).\n\n`cargo-afl` and `ziggy` builds run with `AFLRS_REQUIRE_PLUGINS=1`, so they fail\nloudly when the AFL++ LLVM plugins are missing rather than silently building with\nweaker instrumentation that would not match your real fuzzer; install them with\n`cargo afl config --build --plugins --force`.\n\nA build only emits bitcode for the crates it actually (re)compiles, so a **cached\nbuild yields nothing**. The driver detects this — a fully cached run aborts with a\nclear \"build was CACHED\" error — so `cargo clean` (or remove `fuzz/target` for\ncargo-fuzz) before the run. 
(Setting `RUSTC_WRAPPER` forces a rebuild the first\ntime; a second consecutive run is the cache hit that trips the check.)\n\nFor **plain `--lang rust`** the driver runs `cargo build` with `--emit=llvm-bc` in `RUSTFLAGS`, and two\noptions match it to the fuzz binary (both ignored for C/C++); their defaults aim\nat a stock cargo build.\n\n- **`--profile {debug,release}`** (default `debug`) — the cargo profile. The\n  optimization level drives generic *sharing* (rustc's `-Zshare-generics` is on\n  when unoptimized, off when optimized): it decides which crate instantiates each\n  generic, and so which monomorphizations exist and how they are mangled. A\n  debug snapshot against a release fuzz binary (or vice versa) therefore produces\n  a different function set. Pass `--profile release` for an optimized fuzz build.\n  There is no manifest \"default profile\" in cargo (the profile is picked by the\n  build command), so when omitted this defaults to `debug` — cargo's own default\n  for a bare `cargo build`.\n\n- **`--codegen-units N`** — passed through verbatim as rustc `-Ccodegen-units`.\n  The unit count sets inlining boundaries, hence which monomorphizations survive\n  as standalone functions rather than being inlined away. `N` is any **positive\n  integer** (rustc rejects `0`/negative). When **omitted it is auto-detected** to\n  match how cargo would build the chosen profile:\n  1. the project's `Cargo.toml` `[profile.\u003cname\u003e] codegen-units` — searched from\n     `--project` up to the workspace root, since cargo honours the root\n     manifest's profiles (`\u003cname\u003e` is `dev` for `--profile debug`, else\n     `release`); otherwise\n  2. cargo's per-profile default — **256** for `debug` (dev), **16** for\n     `release`.\n\n  Common explicit value: **`1`** — a single unit per crate (maximum inlining, one\n  `.bc` per crate). 
Many fuzzing setups pin `codegen-units = 1`; if your\n  `Cargo.toml` does, auto-detect already picks it up, so you rarely need the flag.\n  With `N` \u003e 1 rustc splits each crate into several\n  `target/\u003cprofile\u003e/deps/\u003ccrate\u003e-\u003chash\u003e.\u003ccgu\u003e.rcgu.bc` files; the driver collects\n  all of them.\n\n**How to choose.** Usually you do not have to: set `--profile` to your fuzz\nbuild's profile and codegen-units auto-detects the rest from `Cargo.toml`.\nOverride `--codegen-units` only when the fuzz binary is built with a value that\nis neither in the manifest nor the cargo default for that profile (e.g. one\nforced via `-Ccodegen-units` in `RUSTFLAGS`). `-v` prints the resolved\n`profile=… codegen-units=…` so you can confirm the match.\n\nThe `fun:` patterns in the lists already tolerate the Rust mangling\n*disambiguator* (`17h\u003chash\u003e`) drifting between builds (see [Output](#output)), but\nthat only covers the *naming* of a given instance. Matching `--profile` /\n`--codegen-units` is what aligns the *set* of emitted functions; a wildcard\ncannot recover a function that one build inlined away and the other did not.\n\n#### Environment variables\n\n| Variable | Purpose |\n|----------|---------|\n| `REACHABILITY_ANALYZER` | Path to the analyzer binary (default `analyzer/build/reachability-analyzer`). |\n| `CLANG` / `CLANGXX` / `LLVM_LINK` / `OPT` | Override individual tool paths (otherwise resolved by major from the analyzer's LLVM). |\n| `PATH` | Must contain `gclang` / `gclang++` / `get-bc` (gllvm) for C/C++/mixed targets. |\n\n## Output\n\n`reachability run` writes three files:\n\n- **`\u003cout\u003e.json`** — `summary` counts, a `reachable` array (mangled and demangled\n  name, source file/line when debug info is present, `via` =\n  `direct`/`indirect`/`both`, and an `indirect_only` flag), and an\n  `unreachable_defined` array. 
With `--dot FILE`, also the reachable subgraph.\n- **`reached.txt`** — a SanitizerCoverage **allowlist** of reachable functions.\n- **`not_reached.txt`** — a SanitizerCoverage **ignorelist** of unreachable\n  functions.\n\nBoth lists use each function's **mangled** (LLVM symbol) name — what clang and\nAFL++ match `fun:` against — so they cover C, C++, and Rust. Feed either to clang\nor AFL++ to instrument only reachable code:\n\n```bash\n# instrument ONLY reachable functions:\nclang -fsanitize-coverage=trace-pc-guard -fsanitize-coverage-allowlist=reached.txt ...\n# OR: instrument everything EXCEPT unreachable functions:\nclang -fsanitize-coverage=trace-pc-guard -fsanitize-coverage-ignorelist=not_reached.txt ...\n```\n\n\u003e A coverage allowlist instruments a function only when both a `src:` and a\n\u003e `fun:` entry match, so `reached.txt` opens with a `src:*` line. An ignorelist\n\u003e has no such requirement, so `not_reached.txt` is pure `fun:` lines. (Verified\n\u003e against clang in `driver/tests/test_covlists.py`.)\n\n\u003e **Rust mangling disambiguator.** A Rust generic instance is mangled with a\n\u003e trailing `17h\u003chash\u003e` disambiguator whose value depends on the build (opt level,\n\u003e codegen units, instantiating crate). The exact value differs between this\n\u003e bitcode snapshot and the instrumented fuzz binary, so an exact-name entry would\n\u003e miss. Each `fun:` entry therefore replaces that disambiguator with a `*` glob,\n\u003e which both clang sancov and AFL++ honour, so an entry matches the instance in\n\u003e any build. 
An ignorelist pattern that would also match a *reachable* instance\n\u003e is dropped, so excluding unreachable code never excludes reachable code.\n\u003e The `*` only tolerates the disambiguator, not a different function *set*, so\n\u003e still build the snapshot like the fuzz binary — automatic for\n\u003e `libfuzzer`/`ziggy`/`afl`, or matched via `--profile`/`--codegen-units` for\n\u003e plain `--lang rust` (see\n\u003e [Matching the fuzz binary's build](#matching-the-fuzz-binarys-build)).\n\n## Indirect-call resolution\n\nIndirect calls (C function pointers, C++ virtual dispatch, Rust `dyn`/`fn`\npointers) are resolved by the **type-based** resolver: an indirect call of\nfunction type `T` may reach any address-taken function whose LLVM function type\nis `T`. It is language-agnostic, always available, and sound — a deliberate\nover-approximation. The `--indirect-any` debug flag widens this further, to any\naddress-taken function regardless of type.\n\nSee [`docs/llvm-support.md`](docs/llvm-support.md) for the LLVM compatibility\nmatrix.\n\n## Historical note: the removed SVF backend\n\nEarlier versions shipped an optional second backend, `--backend=svf`, built on\n[SVF](https://github.com/SVF-tools/SVF)'s Andersen points-to analysis, meant to\nnarrow the type-based over-approximation per call site. It was removed: in\npractice it produced essentially the same reachable sets as the default\ntype-based backend while costing far more. It built only against a pinned LLVM\n21 (it failed on 22/23), required a separately vendored SVF + Z3 build with a\nlocal source patch, ran substantially slower, and was more fragile to operate —\nso it offered no practical benefit over the type-based backend, which is sound,\nlanguage-agnostic, and works on every supported LLVM. 
The `--backend` flag is\nretained only for backward compatibility: it is accepted but ignored (with a\nwarning).\n\n## Testing\n\n```bash\nmake test       # full pytest suite (analyzer .ll goldens + per-language soundness)\nmake matrix     # LLVM version-compatibility matrix (catches future-LLVM breakage)\n```\n\nEach fixture in `fixtures/` carries a `must_reach` / `must_not_reach` set; the\ntype-based backend must satisfy the soundness invariant on each.\n\n## Project layout\n\n```\nanalyzer/   C++ analyzer + Makefile (src/, built via llvm-config)\ndriver/     Python driver (toolchain, acquire_*, link, analyze, cli)\nfixtures/   per-language test targets with expected reachable sets\nexamples/   worked examples (cpp_cmake/)\nscripts/    setup.sh, test_matrix.sh, select_llvm.sh\ndocs/       worked examples (EXAMPLES.md), LLVM support\n```\n\n## Limitations\n\nThis is a static over-approximation, not dynamic coverage. Its precision is\nbounded by indirect-call resolution and by any missing bitcode — precompiled\nlibraries, or the Rust standard library without `--build-std`.\n\n**Callbacks that escape to external code.** A function pointer is treated as\nreachable when it is handed to an external/indirect call as an argument, or\nreturned (qsort/bsearch comparators, atexit/pthread/`std::call_once` callbacks,\netc.). The escape analysis follows local loads/stores, globals, aggregates,\nselect/PHI values, defined wrapper arguments, and defined functions that return\ncallbacks.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faflplusplus%2Ffuzz-reachability","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Faflplusplus%2Ffuzz-reachability","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faflplusplus%2Ffuzz-reachability/lists"}