{"id":16821317,"url":"https://github.com/mahaloz/sailr-eval","last_synced_at":"2025-09-12T02:09:55.738Z","repository":{"id":200066148,"uuid":"704749501","full_name":"mahaloz/sailr-eval","owner":"mahaloz","description":"The SAILR paper's evaluation pipline for measuring the quality of decompilation","archived":false,"fork":false,"pushed_at":"2024-11-26T05:52:04.000Z","size":2009,"stargazers_count":113,"open_issues_count":3,"forks_count":7,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-08-10T19:37:21.505Z","etag":null,"topics":["angr","decompilation"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-2-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mahaloz.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2023-10-14T01:37:32.000Z","updated_at":"2025-07-04T03:04:29.000Z","dependencies_parsed_at":null,"dependency_job_id":"cbcae567-87d9-4cef-a0d4-13789e03cbd8","html_url":"https://github.com/mahaloz/sailr-eval","commit_stats":null,"previous_names":["mahaloz/sailr-eval"],"tags_count":10,"template":false,"template_full_name":null,"purl":"pkg:github/mahaloz/sailr-eval","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mahaloz%2Fsailr-eval","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mahaloz%2Fsailr-eval/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mahaloz%2Fsailr-eval/releases","manifests_url":"https://repos.ecosyste.
ms/api/v1/hosts/GitHub/repositories/mahaloz%2Fsailr-eval/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mahaloz","download_url":"https://codeload.github.com/mahaloz/sailr-eval/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mahaloz%2Fsailr-eval/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":274742855,"owners_count":25341132,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-09-12T02:00:09.324Z","response_time":60,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["angr","decompilation"],"created_at":"2024-10-13T10:59:47.451Z","updated_at":"2025-09-12T02:09:55.682Z","avatar_url":"https://github.com/mahaloz.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# SAILR Evaluation Pipeline\n\n\u003cp align=\"center\"\u003e\n   \u003cimg src=\"https://i.imgur.com/VUnGRHU.png\" style=\"width: 30%;\" alt=\"angr-sailr Logo\"/\u003e\n\u003c/p\u003e\n\nThe SAILR evaluation pipeline, `sailreval`, is a tool for measuring various aspects of decompilation quality.\nThis evaluation pipeline was originally developed for the USENIX 2024 paper [\"Ahoy SAILR! There is No Need to DREAM of C:\nA Compiler-Aware Structuring Algorithm for Binary Decompilation\"](https://www.zionbasque.com/files/publications/sailr_usenix24.pdf). 
It supports 26 different C packages from Debian,\nfor compiling, decompiling, and measuring. Currently, angr, Hex-Rays (IDA Pro), and Ghidra are supported as decompilers.\n\nIf you are only looking to use the SAILR version of angr, simply use angr! The latest version of angr now uses SAILR!\nIf you are looking to reproduce the exact results of the SAILR paper, then jump to [this README](./misc/reproducing_sailr_paper/README.md) for a submission version. \n\n## Table of Contents\n- [Overview](#overview)\n- [Installation](#installation)\n- [Usage](#usage)\n  - [Compilation](#compiling)\n  - [Decompilation](#decompiling)\n  - [Measuring](#measuring)\n  - [Aggregation](#aggregating)\n- [Example Run](#example-run)\n- [Miscellaneous](#miscellaneous)\n  - [Compiling Windows Targets](#compiling-windows-targets)\n- [Citation](#citation)\n\n\n## Overview\nThis repo contains the `sailreval` Python package and information about the SAILR paper artifacts.\n`sailreval` is the Python package that contains all the code for running the evaluation pipeline.\n`sailreval` evaluates the quality of decompilation by comparing it to the original source code.\nThis evaluation is done in four phases:\n1. [Compilation](#compiling): a project described in the [targets](./targets) directory is downloaded, preprocessed, and compiled into object files.\n2. [Decompilation](#decompiling): decompilers [supported in sailreval](./sailreval/decompilers) are used to decompile the object files into C source files.\n3. [Measurement](#measuring): the preprocessed source and decompiled source are compared using [metrics](./sailreval/metrics) in `sailreval`.\n4. [Aggregation](#aggregating): the results from the measurement are normalized for functions that had a metric on all decompilers.\n\nEach phase requires the phase directly before it to have run; however, you can skip stages if you manually provide the\nrequired files. 
For example, you can skip the compilation phase if you already have the object files and preprocessed source.\n\n## Installation\nThe `sailreval` package can be used in two ways: locally or in a Docker container.\nIf you plan on reproducing the results of the SAILR paper or using some pre-packaged decompiler like Ghidra, then you\nwill need both. Below are two methods for installing: one is heavy (Docker and local), and one is light (only local).\nMake sure you have Docker installed on your system. \n\n### Install Script (Recommended)\nOn Linux and macOS:\n```bash\n./setup.sh\n```\n\nThis will build the Docker container, install system dependencies, and install the Python package locally.\n\n### Only Python Package\nIf you want to use only local decompilers and you have the build dependencies installed for your compiled project, you\ncan install the Python package without the Docker container. For an example of this use case, see \nour [CI runner](./.github/workflows/python-app.yml).\n```bash\npip3 install -e .\n```\n\nNote: you will need to install the system dependencies for the Python project yourself, listed [here](./.github/workflows/python-app.yml).\nThe package is also available on PyPI, so remote installation works as well. \n\n### Install Verification\nVerify the installation by running:\n```bash\n./scripts/verify_pipeline.sh\n```\n\nThis will use both the Docker container and your local install to run the pipeline. \nIf you installed it correctly, you should see some final output like:\n```md\n# Evaluation Data\n## Stats\nLayout: ('sum', 'mean', 'median')\n### O2\nMetric     | source      | angr_sailr  | angr_dream\n---------- | ----------- | ----------- | -----------\ngotos      | 1/0.12/0.0  | 1/0.12/0.0     | 0/0/0.0\n...\n```\n\n## Usage\nAfter installation, if you used the install script (i.e. the Docker install), then you can use the `docker-eval.sh` script,\nwhich is a proxy to the `eval.py` script, but inside the container. 
\nAs an example, you can use:\n```bash\n./docker-eval.sh --help\n./eval.py --help\n```\n\nThey should both produce the same result.\n\nUsing the steps below, you can run the entire pipeline stage-by-stage. For each evaluated target in `targets`, you will\nfind a `sailr_compiled`, `sailr_decompiled`, and `sailr_measured` folder in the package folder. \nEach folder will contain the results of the respective stage. All targets are placed in the `results` directory under\ntheir respective optimization level. \nFor coreutils compiled with O2, you'll see `results/O2/coreutils`.\n\n### Compiling\nTo compile a package, it must be described in the `targets` folder by a `target.toml`. Here is coreutils:\n```toml\npackage_name = \"coreutils\"\nsource_remote = \"git://git.sv.gnu.org/coreutils.git\"\nremote_type = \"git\"\ndownload = true\npost_download_cmds = [\"./bootstrap\"]\nversion = \"v9.1\"\npackage_dir = \"coreutils\"\npre_make_cmds = [\"./configure --quiet\"]\nmake_cmd = \"make\"\npost_make_cmds = []\nsource_dir = \"src\"\n```\n\nThere are many flags that you can set, which are defined in the [sailr_target](./sailreval/utils/sailr_target.py) class.\n\nWe compile just coreutils using the Docker wrapper:\n```sh \n./docker-eval.sh --compile coreutils --cores 8 --opt-levels O2\n```\n\nAfter compiling is done, you can find the results in the `results/O2/coreutils` directory. \nIn the `sailr_compiled` folder located in `coreutils` you will find all the object files, preprocessed source, and normal source.\nThe next phase will destroy the normal source and replace it with the preprocessed source. \nIt's critical that you do not edit the preprocessed source in any way.\n\n### Decompiling\nThe target must contain the `sailr_compiled` folder with `.o` files in it. In the case of coreutils that would be:\n`./results/O2/coreutils/sailr_compiled/`. The source must also be present in that folder. 
\n\nThe very first time you decompile a target, you must \"decompile\" the source, which creates normalized preprocessed source.\nDo it like so:\n```\n./eval.py --decompile coreutils --use-dec source --cores 20 --opt-levels O2\n```\n\nWe highly recommend running this step locally for speed. \nAfter this is done, you don't need to do it again even if you re-decompile for other decompilers.\n\nNext, decompile with all the decompilers you want:\n```\n./docker-eval.sh --decompile coreutils --use-dec ghidra angr_sailr angr_phoenix --cores 20 --opt-levels O2\n```\n\nAll the decompilation files, including the preprocessed source, will be found inside the `sailr_decompiled` folder.\nFor coreutils that would be: `./results/O2/coreutils/sailr_decompiled/`.\nYou will find the preprocessed source as `source_*.c` and the decompilation as `\u003cdecompiler\u003e_*.c`.\nYou will also notice files like `angr_sailr_mv.linemaps`, `angr_sailr_mv.toml`, and `mv.dwarf.linemaps`.\nThese files contain the line mappings from decompiled source to original source and pre-computed metrics like `goto` counts.\n\nIf you plan on using IDA Pro, you must mount it into the container. \nPlease mount the `idat64` binary directly into the container at `/tools/`.\nTo do that, add `-v /path/to/idat64_folder/:/tools/` to the `docker run` command in the [docker-eval.sh](./scripts/docker-eval.sh) script.\n\n\n### Measuring\nLike the decompilation phase, this phase requires the `sailr_decompiled` folder to exist with `.o` and `.c` files in it.\nIf you plan on using `cfged`, then the `sailr_decompiled` folder must contain the linemaps, toml, and dwarf files for each targeted object file.\n\nIf you ran the decompilation step above, you should automatically have that. Measure with: \n```sh \n./eval.py --measure coreutils --use-metric gotos cfged --use-dec source angr_sailr --cores 15\n```\n**NOTE**: you must put `source` as one of the targeted decompilers if you are using `cfged`. 
\n\nAfter running, you will find files in the `sailr_measured` folder. \nFor coreutils that would be: `./results/O2/coreutils/sailr_measured/`.\nIn the folder you will find various `toml` files that look like the following:\n\n```toml\nbinary = \"mv\"\ntotal_time = 231.44658088684082\ntimeout = false\n\n[source.cfged]\nmain = \"0.0\"\n\n[source.gotos]\nmain = 0 \n\n[angr_sailr.cfged]\nmain = \"309.0\"\n\n[angr_sailr.gotos]\nmain = 11\n# ...\n```\n\nYou can use the `toml` library in Python to load these files into a dictionary.\nThe dictionary is keyed by `[decompiler][metric][function]` and the value is the metric value.\n\n### Aggregating\n\nAfter measuring, you can aggregate the results like so:\n```sh\n./eval.py --summarize-targets coreutils --use-dec source angr_sailr --use-metric gotos cfged \n```\n\nThe results will look something like the following, where every value is a sum:\n```markdown\n# Evaluation Data\n## Stats\n\n### O2\nDecompiler | gotos_sum | cfged_sum\n---------- | --------- | ---------\nsource     | 46        | 0\nangr_sailr | 668       | 39701\n\n\n## Metadata\n\ntotal_unique_functions_in_src | total_unique_functions_in_all_metrics\n----------------------------- | -------------------------------------\n1152                          | 918\n```\n\nOnly the last printed table matters. Tables printed before that are intermediate results.\nYou can also show `Sum/Average/Median` by using the `--show-stats` arg. \n\nThe summary above shows the normalized results, where each count is based on functions that were successfully decompiled and measured\non all decompilers specified in the command. You can also do multiple targets at once:\n```sh\n./eval.py --summarize-targets coreutils diffutils ...\n```\n\nThere is a special case summarization for projects that are the same but may have different names.\nThis happens in the case of `coreutils` and `coreutils_gcc5`. 
Both are Coreutils compiled with different compilers.\nYou can normalize across both projects, using only the binaries and functions that exist in both projects, with:\n```sh\n./eval.py --merge-results ./results/O2/coreutils*/sailr_measured --use-dec source angr_sailr --use-metric gotos cfged\n```\n\n## Example Run\nHere is an example run of the pipeline:\n```sh\n./docker-eval.sh --compile coreutils --cores 20 \u0026\u0026 \\\n./eval.py --decompile coreutils --use-dec source --cores 20 \u0026\u0026 \\\n./docker-eval.sh --decompile coreutils --use-dec ghidra angr_sailr angr_phoenix angr_dream angr_comb --cores 20 \u0026\u0026 \\\n./eval.py --measure coreutils --use-metric gotos cfged bools func_calls --use-dec source ghidra angr_sailr angr_phoenix angr_dream angr_comb --cores 20 \u0026\u0026 \\\n./eval.py --summarize-targets coreutils --use-dec source ghidra angr_sailr angr_phoenix angr_dream angr_comb --use-metric gotos cfged bools func_calls --show-stats\n```\n\n\n## Miscellaneous\n### Compiling Windows Targets\nWindows targets, like `libz_windows`, will not be compiled by this pipeline, so you must compile them yourself. \nFollow these steps to compile a Windows target:\n1. Download the source code for the target specified in the target's toml file\n2. Make a new configuration in MSVC `Project-\u003eProperties-\u003eConfiguration Manager-\u003eActive Solution Configuration-\u003eNew`\n3. Name it `SAILR`\n4. Go to `Project-\u003eProperties-\u003eC/C++-\u003ePreprocessor` and enable Preprocessor Definitions to File \n5. Compile with the SAILR config, then copy all `*.i`, `*.c`, and `*.obj` files into a `src` folder that you need to create\n6. Rename the `*.obj` to `*.o`\n7. If step `5` failed, then just remove the preprocessor option after running once\n\nTo run the full pipeline for Windows targets, you must have [llvm-pdbutil](https://github.com/shaharv/llvm-pdbutil-builds)\ninstalled on the system. 
\n\n## Citation\nIf you use this tool in your research, please cite our paper:\n```bib\n@inproceedings{basque2024ahoy,\n  title={Ahoy sailr! there is no need to dream of c: A compiler-aware structuring algorithm for binary decompilation},\n  author={Basque, Zion Leonahenahe and Bajaj, Ati Priya and Gibbs, Wil and O’Kain, Jude and Miao, Derron and Bao, Tiffany and Doup{\\'e}, Adam and Shoshitaishvili, Yan and Wang, Ruoyu},\n  booktitle={Proceedings of the USENIX Security Symposium},\n  year={2024}\n}\n```","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmahaloz%2Fsailr-eval","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmahaloz%2Fsailr-eval","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmahaloz%2Fsailr-eval/lists"}