{"id":21077547,"url":"https://github.com/simonlindholm/decomp-permuter","last_synced_at":"2025-04-04T09:08:15.400Z","repository":{"id":36958185,"uuid":"172801212","full_name":"simonlindholm/decomp-permuter","owner":"simonlindholm","description":"Randomly permute C files to better match a target binary","archived":false,"fork":false,"pushed_at":"2025-03-24T22:34:01.000Z","size":3023,"stargazers_count":139,"open_issues_count":48,"forks_count":48,"subscribers_count":6,"default_branch":"main","last_synced_at":"2025-03-28T08:06:43.693Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/simonlindholm.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-02-26T22:33:20.000Z","updated_at":"2025-03-24T22:34:05.000Z","dependencies_parsed_at":"2024-03-29T10:34:56.541Z","dependency_job_id":"7d213a68-8999-433c-9f17-a5b24deecab9","html_url":"https://github.com/simonlindholm/decomp-permuter","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/simonlindholm%2Fdecomp-permuter","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/simonlindholm%2Fdecomp-permuter/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/simonlindholm%2Fdecomp-permuter/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/simonlindholm%2Fdecomp-permuter/manifests","owner_url":"https://repos.ecosy
ste.ms/api/v1/hosts/GitHub/owners/simonlindholm","download_url":"https://codeload.github.com/simonlindholm/decomp-permuter/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247149501,"owners_count":20891954,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-19T19:37:03.869Z","updated_at":"2025-04-04T09:08:15.380Z","avatar_url":"https://github.com/simonlindholm.png","language":"Python","funding_links":[],"categories":["Reverse Engineering","🛠️ General Tools"],"sub_categories":["Tools and Disassemblers","🔬 Format Analysis \u0026 Reverse Engineering"],"readme":"# Decomp permuter\n\nAutomatically permutes C files to better match a target binary. The permuter has two modes of operation:\n- Random: purely at random, introduce temporary variables for values, change types, put statements on the same line...\n- Manual: test all combinations of user-specified variations, using macros like `PERM_GENERAL(a = b ? c : d;, if (b) a = c; else a = d;)` to try both specified alternatives.\n\nThe modes can also be combined, by using the `PERM_RANDOMIZE` macro.\n\n[\u003cimg src=\"https://asciinema.org/a/232846.svg\" height=\"300\"\u003e](https://asciinema.org/a/232846)\n\nThis tool supports MIPS (compiled by IDO, possibly GCC), PowerPC, and ARM32 assembly.\n\n## Usage\n\n`./permuter.py directory/` runs the permuter; see below for the meaning of the directory.\nPass `-h` to see possible flags. 
`-j` is suggested (enables multi-threaded mode).\n\nYou'll first need to install a couple of prerequisites: `python3 -m pip install pycparser pynacl toml Levenshtein` (also `dataclasses` if on Python 3.6 or below).\n`pynacl` is optional and only necessary for the \"permuter@home\" networking feature.\n`Levenshtein` is optional and only necessary for using the Levenshtein diff algorithm (difflib is used by default).\n\nThe permuter expects as input one or more directories, each containing:\n  - a .c file with a single function,\n  - a .o file to match,\n  - a .sh file that compiles the .c file,\n  - a .toml file specifying settings.\n\nFor projects with a properly configured makefile, you should be able to set these up by running\n```\n./import.py \u003cpath/to/file.c\u003e \u003cpath/to/file.s\u003e\n```\nwhere file.c contains the function to be permuted, and file.s is its assembly in a self-contained file.\nOtherwise, see USAGE.md for more details.\n\nFor projects using Ninja instead of Make, add a `permuter_settings.toml` in the root or `tools/` directory of the project:\n```toml\nbuild_system = \"ninja\"\n```\nThen `import.py` should work as expected if `build.ninja` is at the root of the project.\n\nAll of the possible randomizations are assigned a weight value that affects the frequency with which the randomization is chosen.\nThe default set of weights is specified in `default_weights.toml` and varies based on the targeted compiler.\nThese weights can be overridden by modifying `settings.toml` in the input directory.\n\nThe .c file may be modified with any of the following macros, which affect manual permutation:\n\n- `PERM_GENERAL(a, b, ...)` expands to any of `a`, `b`, ...\n- `PERM_VAR(a, b)` sets the meta-variable `a` to `b`; `PERM_VAR(a)` expands to the meta-variable `a`.\n- `PERM_RANDOMIZE(code)` expands to `code`, but allows randomization within that region. Multiple regions may be specified. 
A `PERM_RANDOMIZE` block is automatically added when there are no PERM macros.\n- `PERM_FORCE_SAMELINE(code)` expands to `code`, but joined to a single line after round-tripping through the C parser library (which normally puts statements on separate lines). Can be useful for IDO where same-lineness affects codegen.\n- `PERM_LINESWAP(lines)` expands to a permutation of the ordered set of non-whitespace lines (split by `\\n`). Each line must contain zero or more complete C statements. (For incomplete statements use `PERM_LINESWAP_TEXT`, which is slower because it has to repeatedly parse C code.)\n- `PERM_INT(lo, hi)` expands to an integer between `lo` and `hi` (which must be constants).\n- `PERM_IGNORE(code)` expands to `code`, without passing it through the C parser library (pycparser)/randomizer. This can be used to avoid parse errors for non-standard C, e.g. `asm` blocks.\n- `PERM_PRETEND(code)` expands to `code` for the purpose of the C parser/randomizer, but gets removed afterwards. This can be used together with `PERM_IGNORE` to enable the permuter to deal with input it isn't designed for (e.g. inline functions, C++, non-code).\n- `PERM_ONCE([key,] code)` expands to either `code` or to nothing, such that each unique key gets expanded exactly once. `key` defaults to `code`. For example, `PERM_ONCE(a;) b; PERM_ONCE(a;)` expands to either `a; b;` or `b; a;`.\n\nArguments are split by commas, excluding commas inside parentheses. `(,)` is a special escape sequence that resolves to `,`. 
\n\nNested macros are allowed, so e.g.\n```\nPERM_VAR(delayed, )\nPERM_GENERAL(stmt;, PERM_VAR(delayed, stmt;))\n...\nPERM_VAR(delayed)\n```\nis an alternative way of writing `PERM_ONCE`.\n\nIf any multi-choice PERM macros are provided, automatic randomization will be disabled; to enable it you need to surround the function (or the relevant parts of it) with `PERM_RANDOMIZE`.\n\n## permuter@home\n\nThe permuter supports a distributed mode, where people can donate processor power to your permuter runs to speed them up.\nTo use this, pass `-J` when running `permuter.py` and follow the instructions.\n(This can be combined with regular `-j` flags.)\nYou will need to be granted access by someone who is already connected to a permuter network.\n\npermuter@home is only available for a limited number of compilers\n(see [the list](https://github.com/decompals/pah-docker) for the main permuter network),\nand currently does not work on native Windows (but WSL does work).\n\nTo allow others to use your computer for permuter runs, do the following:\n\n- install Docker (used for sandboxing and to ensure a consistent environment)\n- if on Linux, add yourself to the Docker group: `sudo usermod -aG docker $USER`\n  or set up [rootless Docker](https://docs.docker.com/engine/security/rootless/)\n- install required packages: `python3 -m pip install docker`\n- open a terminal, and run `./pah.py run-server` to start the worker server.\n  There are a few required arguments (e.g. how many cores to use), see `--help` for more details.\n\nAnyone who is granted access to permuter@home can run a worker.\n\nTo set up a new permuter network, see [src/net/controller/README.md](./src/net/controller/README.md).\n\n## FAQ\n\n**What do the scores mean?** The scores are computed by taking diffs of objdump'd .o\nfiles, and giving different penalties for lines that are the same/use the same\ninstruction/are reordered/don't match at all. 
0 means the function matches fully.\nStack positions are ignored unless --stack-diffs is passed (but beware that the\npermuter is currently quite bad at resolving stack differences). For more details,\nsee scorer.py. It's far from a perfect system, and should probably be tweaked to\nlook at e.g. the register diff graph.\n\n**What sort of non-matchings is the permuter good at?** It's generally best towards\nthe end, when mostly regalloc changes remain. If there are reorderings or functional\nchanges, it's often easy to resolve those by hand, and neither the scorer nor the\nrandomizer tends to play well with them.\n\n**Should I use this instead of trying to match code by hand?** No, but it can be a good\ncomplement. PERM macros can be used to quickly test lots of variations of a function at\nonce, in cases where there are interactions between several parts of a function.\nThe randomization mode often finds lots of nonsensical changes that improve regalloc\n\"by accident\"; it's up to you to pick out the ones that look sensible. If none do,\nit can still be useful to know which parts of the function need to be changed to get the\ncode nearer to matching. Having made one of the improvements, the function can then be\npermuted again to find further possible improvements.\n\n## Helping out\n\nThere's tons of room for helping out with the permuter!\nMany more randomization passes could be added, the scoring function is far from optimal,\nthe permuter could be made easier to use, etc. The GitHub Issues list has some ideas.\n\nIdeally, `mypy permuter.py` and `./run-tests.sh` should succeed with no errors, and files\nshould be formatted with `black`. 
To set up a pre-commit hook for black, run:\n```\npip install pre-commit black\npre-commit install\n```\nPRs that skip this are still welcome, however.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsimonlindholm%2Fdecomp-permuter","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsimonlindholm%2Fdecomp-permuter","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsimonlindholm%2Fdecomp-permuter/lists"}