{"id":49530899,"url":"https://github.com/macabeus/mizuchi","last_synced_at":"2026-05-02T07:33:36.816Z","repository":{"id":336750093,"uuid":"1133185034","full_name":"macabeus/mizuchi","owner":"macabeus","description":"🐉 Forge C from the ashes of assembly","archived":false,"fork":false,"pushed_at":"2026-03-15T00:50:16.000Z","size":12556,"stargazers_count":42,"open_issues_count":9,"forks_count":3,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-03-15T09:39:52.234Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/macabeus.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-01-13T02:06:25.000Z","updated_at":"2026-03-15T00:50:20.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/macabeus/mizuchi","commit_stats":null,"previous_names":["macabeus/mizuchi"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/macabeus/mizuchi","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/macabeus%2Fmizuchi","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/macabeus%2Fmizuchi/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/macabeus%2Fmizuchi/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/macabeus%2Fmizuchi/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/
GitHub/owners/macabeus","download_url":"https://codeload.github.com/macabeus/mizuchi/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/macabeus%2Fmizuchi/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32527138,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-02T01:12:54.858Z","status":"online","status_checked_at":"2026-05-02T02:00:05.923Z","response_time":132,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-05-02T07:33:36.319Z","updated_at":"2026-05-02T07:33:36.809Z","avatar_url":"https://github.com/macabeus.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Mizuchi\n\n\u003cimg src=\"./media/branding/logo.png\" align=\"right\" height=\"130px\" /\u003e\n\n\u003e 🐉 Forge C from the ashes of assembly. 
What the compiler consumed, the dragon returns.\n\nMizuchi automates the cycle of writing C code, compiling, and comparing against a target binary, towards the goal of fully automatic **matching decompilation**.\n\nIt orchestrates a plugin-based pipeline that can leverage programmatic and AI-powered tools to automatically decompile assembly functions to C source code that produces byte-for-byte identical machine code when compiled.\n\n- ✨ Automatic retries with detailed context on compilation or match failures\n- 🐍 Integration with Claude, [m2c](https://github.com/matt-kempster/m2c), [decomp-permuter](https://github.com/simonlindholm/decomp-permuter), and [objdiff](https://github.com/encounter/objdiff/).\n- 🗺️ Decomp Atlas, a powerful webapp to browse functions and generate rich prompts in one click\n- 📊 Beautiful Report UI to visualize the pipeline result\n\n\u003e [📚 Learn about this project and its benchmarks on this post](https://gambiconf.substack.com/p/can-llms-really-do-matching-decompilation)\n\n\u003cimg width=\"1143\" height=\"1057\" alt=\"image\" src=\"https://github.com/user-attachments/assets/3e078ff7-723f-4e4c-bfd1-bcb5d3f0d3fb\" /\u003e\n\n\u003ctable align=\"center\"\u003e\n    \u003ctr\u003e\n      \u003ctd align=\"center\" width=\"50%\"\u003e\n        \u003ckbd\u003e\u003cimg width=\"1015\" height=\"839\" alt=\"image\" src=\"https://github.com/user-attachments/assets/04818b83-34e5-4c55-bf6d-ec9ec24dee33\" /\u003e\u003c/kbd\u003e\u003cbr /\u003e\n        \u003ci\u003eAchieve fully matching code automatically\u003c/i\u003e\n      \u003c/td\u003e\n      \u003ctd align=\"center\" width=\"50%\"\u003e\n        \u003ckbd\u003e\u003cimg width=\"962\" height=\"558\" alt=\"image\" src=\"https://github.com/user-attachments/assets/2afc7ca2-d9c1-44d8-b32f-103ba0661b4d\" /\u003e\u003c/kbd\u003e\u003cbr /\u003e\n        \u003ci\u003eEven partial matches provide a good start\u003c/i\u003e\n      \u003c/td\u003e\n    
\u003c/tr\u003e\n\u003c/table\u003e\n\n\u003ctable align=\"center\"\u003e\n    \u003ctr\u003e\n      \u003ctd align=\"center\" width=\"33%\"\u003e\n        \u003ckbd\u003e\u003cimg width=\"1413\" height=\"1136\" alt=\"image\" src=\"https://github.com/user-attachments/assets/036b5d07-af78-466b-ab31-b487d39ac151\" /\u003e\u003c/kbd\u003e\u003cbr /\u003e\n        \u003ci\u003eExplore the function cloud by similarity\u003c/i\u003e\n      \u003c/td\u003e\n      \u003ctd align=\"center\" width=\"33%\"\u003e\n        \u003ckbd\u003e\u003cimg width=\"1413\" height=\"1136\" alt=\"image\" src=\"https://github.com/user-attachments/assets/3e19248f-dc26-4222-834d-b18468943dfa\" /\u003e\u003c/kbd\u003e\u003cbr /\u003e\n        \u003ci\u003ePick your next function to decompile based on scoring\u003c/i\u003e\n      \u003c/td\u003e\n      \u003ctd align=\"center\" width=\"33%\"\u003e\n        \u003ckbd\u003e\u003cimg width=\"1413\" height=\"1136\" alt=\"image\" src=\"https://github.com/user-attachments/assets/a6c46e82-9b58-449b-846f-c12180a84a29\" /\u003e\u003c/kbd\u003e\u003cbr /\u003e\n        \u003ci\u003eBuild rich prompts to decompile a function in a single click\u003c/i\u003e\n      \u003c/td\u003e\n    \u003c/tr\u003e\n\u003c/table\u003e\n\n\u003e ⚙️ **What is Matching Decompilation?**\n\u003e\n\u003e Matching decompilation is the art of converting assembly back into C source code that, when compiled, produces byte-for-byte identical machine code. It’s popular in the retro gaming community for recreating the source code of classic games. 
For example, [Super Mario 64](https://github.com/n64decomp/sm64) and [The Legend of Zelda: Ocarina of Time](https://github.com/zeldaret/oot) have been fully match-decompiled.\n\u003e\n\u003e [Learn more by watching my talk.](https://www.youtube.com/watch?v=sF_Yk0udbZw)\n\n## Installation\n\n```bash\nnpm install\nnpm run build \u0026\u0026 npm run build:ui\n```\n\n### m2c Setup (Optional)\n\nTo enable the m2c programmatic phase:\n\n```bash\ngit submodule update --init vendor/m2c\n./scripts/setup-m2c.sh\n```\n\n### decomp-permuter Setup (Optional)\n\nTo enable decomp-permuter (brute-force mutation matching), which works both in the programmatic phase and as a background task during the AI-powered phase:\n\n```bash\ngit submodule update --init vendor/decomp-permuter\n./scripts/setup-decomp-permuter.sh\n```\n\n### Requirements\n\n- `ANTHROPIC_API_KEY` environment variable set, or a Claude Code login that caches credentials locally\n\n## Quick Start\n\n1. **Create a configuration file**: Copy the example config and customize it for your project.\n\n```bash\ncp mizuchi.example.yaml /path/to/your/decomp/project/mizuchi.yaml\n```\n\n2. **Index your codebase**:\n\n```bash\nnpm start -- index-codebase --config /path/to/your/decomp/project/mizuchi.yaml\n```\n\n3. **Start the Decomp Atlas server**:\n\n```bash\nnpm start -- atlas --config /path/to/your/decomp/project/mizuchi.yaml\n```\n\n4. **Generate prompts**: Open Decomp Atlas at [`http://localhost:3000/`](http://localhost:3000/), browse the functions, and generate the prompts.\n\n5. 
**Run the pipelines**:\n\n```bash\nnpm start -- run --config /path/to/your/decomp/project/mizuchi.yaml\n```\n\n## Pipeline Overview\n\nMizuchi executes a pipeline of plugins:\n\n![Pipeline Diagram](./media/docs/pipeline-flow.png)\n\n\u003e 📌 **Roadmap**: See the [issues tab](https://github.com/macabeus/mizuchi/issues) for planned features.\n\n## Output\n\nMizuchi generates three output files:\n\n| File                           | Description                                                                          |\n| ------------------------------ | ------------------------------------------------------------------------------------ |\n| `run-results-{timestamp}.json` | Complete execution data including plugin results, timing, and success/failure status |\n| `run-report-{timestamp}.html`  | Visual report with success rates, metrics, and per-prompt breakdown                  |\n| `claude-cache.json`            | Cached Claude API responses keyed by prompt content hash                             |\n\n### Built-in Plugins\n\n| Plugin              | Description                                                                                                                             |\n| ------------------- | --------------------------------------------------------------------------------------------------------------------------------------- |\n| **m2c**             | Optional: generates an initial C decompilation using [m2c](https://github.com/matt-kempster/m2c)                                        |\n| **decomp-permuter** | Optional: brute-forces code mutations using [decomp-permuter](https://github.com/simonlindholm/decomp-permuter) to improve match scores |\n| **Claude Runner**   | Sends prompts to Claude and processes responses                                                                                         |\n| **Compiler**        | Compiles generated C code using a configurable shell script template                                                     
               |\n| **Objdiff**         | Compares compiled object files against targets using [objdiff](https://github.com/encounter/objdiff)                                    |\n| **Integrator**      | Optional post-match: integrates matched C code into the decomp project ([docs](docs/integrator-plugin.md))                              |\n\n## Decomp Atlas\n\nDecomp Atlas is a web UI for exploring your decompilation project and targeting the next functions to decompile. It includes a **prompt builder** that generates rich decompilation prompts.\n\n### Starting the server\n\n```bash\n# Build the CLI and UI\nnpm run build \u0026\u0026 npm run build:decomp-atlas\n\n# Start the Decomp Atlas server\nnpm start -- atlas --config mizuchi.yaml\n```\n\nThe server reads your `mizuchi.yaml` config and serves the Decomp Atlas UI at `http://localhost:3000`.\n\n\u003e Note: Your project must have a `mizuchi-db.json` file in the root directory for Decomp Atlas to work. Generate it with `mizuchi index-codebase` (see below).\n\n### Indexing Your Codebase\n\nThe `index-codebase` command scans your decompilation project and generates a `mizuchi-db.json` file containing all discovered functions, their assembly, C source (if decompiled), call graphs, and vector embeddings.\n\n**1. Configure your `mizuchi.yaml`:**\n\nAdd `nonMatchingAsmFolders` to the `global` section, listing the directories that contain non-matching assembly files (relative to `projectPath`):\n\n```yaml\nglobal:\n  projectPath: /path/to/decomp/project\n  mapFilePath: /path/to/project.map\n  target: gba # or n64, ps1, etc.\n  nonMatchingAsmFolders:\n    - asm/non_matching\n    - asm\n```\n\n**2. Run the indexer:**\n\n```bash\n# Build first (if not already done)\nnpm run build\n\n# Index the codebase\nnpm start -- index-codebase --config mizuchi.yaml\n\n# Or in development mode\nnpm run dev -- index-codebase --config mizuchi.yaml\n```\n\nThe indexer performs three phases:\n\n1. 
**Scan matched functions** — finds C function definitions via ast-grep, resolves each to its compiled `.o` file using the map file, and extracts assembly via objdiff\n2. **Scan unmatched functions** — reads `.s`/`.S`/`.asm` files from `nonMatchingAsmFolders` and parses function boundaries\n3. **Compute embeddings** — generates vector embeddings using [jina-embeddings-v2-base-code](https://huggingface.co/jinaai/jina-embeddings-v2-base-code) via a Python subprocess with MPS GPU acceleration (Apple Silicon) or CPU fallback\n\n**Options:**\n\n| Flag                    | Description                                              |\n| ----------------------- | -------------------------------------------------------- |\n| `-c, --config`          | Path to `mizuchi.yaml` (defaults to `./mizuchi.yaml`)    |\n| `-s, --skip-embeddings` | Skip embedding generation (useful for quick re-indexing) |\n\n**Incremental indexing:** Re-running the command only recomputes embeddings for new or changed functions. Unchanged functions preserve their existing embeddings.\n\n**Python requirements for embeddings:** Python 3.10+ is required. On first run, the indexer automatically creates a virtual environment at `~/.cache/mizuchi/python-venv/` and installs `torch` and `transformers` (~2-3 GB). The model weights are cached at `~/.cache/huggingface/`. Use `--skip-embeddings` to skip this entirely.\n\n## Development\n\nSee [DEVELOPMENT.md](DEVELOPMENT.md) for development setup, commands, and notes.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmacabeus%2Fmizuchi","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmacabeus%2Fmizuchi","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmacabeus%2Fmizuchi/lists"}