{"id":14638093,"url":"https://github.com/fuzz4all/fuzz4all","last_synced_at":"2025-09-07T06:32:28.068Z","repository":{"id":220028321,"uuid":"750549861","full_name":"fuzz4all/fuzz4all","owner":"fuzz4all","description":"🌌️Fuzz4All: Universal Fuzzing with Large Language Models","archived":false,"fork":false,"pushed_at":"2025-08-11T20:13:17.000Z","size":10317,"stargazers_count":275,"open_issues_count":2,"forks_count":45,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-08-11T22:05:34.888Z","etag":null,"topics":["fuzzing","large-language-models","llm","program-synthesis","testing"],"latest_commit_sha":null,"homepage":"https://fuzz4all.github.io/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"cc-by-4.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/fuzz4all.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-01-30T21:10:54.000Z","updated_at":"2025-08-11T20:13:20.000Z","dependencies_parsed_at":null,"dependency_job_id":"c5ab146a-c175-4788-a322-32411cd5f682","html_url":"https://github.com/fuzz4all/fuzz4all","commit_stats":null,"previous_names":["fuzz4all/fuzz4all"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/fuzz4all/fuzz4all","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fuzz4all%2Ffuzz4all","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fuzz4all%2Ffuzz4all/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fuzz4all%2Ffuzz4all/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fuzz4all%2Ffuzz4all/m
anifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/fuzz4all","download_url":"https://codeload.github.com/fuzz4all/fuzz4all/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/fuzz4all%2Ffuzz4all/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":274005341,"owners_count":25205934,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-09-07T02:00:09.463Z","response_time":67,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["fuzzing","large-language-models","llm","program-synthesis","testing"],"created_at":"2024-09-10T02:01:42.765Z","updated_at":"2025-09-07T06:32:28.056Z","avatar_url":"https://github.com/fuzz4all.png","language":"Python","readme":"# \u003cp style=\"text-align: center;\"\u003e  🌌️Fuzz4All: Universal Fuzzing with LLMs \u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n    \u003ca href=\"https://arxiv.org/abs/2308.04748\"\u003e\u003cimg src=\"https://img.shields.io/badge/arXiv-2308.04748-b31b1b.svg?style=for-the-badge\"\u003e\n    \u003ca href=\"https://doi.org/10.5281/zenodo.10456883\"\u003e\u003cimg src=\"https://img.shields.io/badge/DOI-10456883-blue?style=for-the-badge\"\u003e\n    \u003ca href=\"https://hub.docker.com/r/stevenxia/fuzz4all/tags\"\u003e\u003cimg 
src=\"https://img.shields.io/badge/docker-fuzz4all-%230db7ed.svg?style=for-the-badge\u0026logo=docker\u0026logoColor=blue\"\u003e\n    \u003ca href=\"https://github.com/fuzz4all/fuzz4all/blob/master/LICENSE\"\u003e\u003cimg src=\"https://forthebadge.com/images/badges/cc-by.svg\" style=\"height: 28px\"\u003e\u003c/a\u003e\n\u003c/p\u003e\n\nThis repository contains the source code for our ICSE'24 paper \u003ci\u003e \"Fuzz4All: Universal Fuzzing with Large Language Models\" \u003c/i\u003e\n\n## 🌌️ About\n\n`Fuzz4All` is the first fuzzer that can universally target many input languages and many features of these languages.\n\u003e The key idea behind `Fuzz4All` is to leverage large language models (LLMs) as an input generation and mutation engine, which enables the \n\u003e approach to produce diverse and realistic inputs for any practically relevant language. \n\nTo realize this potential, we present a novel **autoprompting technique**, which creates LLM prompts \nthat are well-suited for fuzzing, and a novel **LLM-powered fuzzing loop**, which iteratively updates \nthe prompt to create new fuzzing inputs.\n\n![](./resources/overview.gif)\n\n## ⚡ Quick Start\n\n\u003e [!Important]\n\u003e We highly recommend running `Fuzz4All` in a sandbox environment/machine such as docker. \n\u003e Since LLMs may generate potentially harmful code on your machine, please proceed with caution.\n\u003e We have provided a complete docker image in our artifact here: https://doi.org/10.5281/zenodo.10456883\n\n### Setup\n\nFirst, create the corresponding environment and install the required packages:\n\n```bash\nconda create -n fuzz4all python=3.10\nconda activate fuzz4all\n\npip install -r requirements.txt\npip install -e .\n```\n\nNext, we need to quickly configure the environment variables. 
Here are the default parameters:\n\n```bash\nexport FUZZING_BATCH_SIZE=30\nexport FUZZING_MODEL=\"bigcode/starcoderbase\"\nexport FUZZING_DEVICE=\"gpu\"\n```\n\nOr, if you want to run `Fuzz4All` in local ollama mode:\n\n```bash\nexport FUZZING_MODEL=\"ollama/starcoder\"\n```\n\nIf you want to use a model other than starcoder, change the model name after the `ollama/` prefix. \nMake sure the model has been pulled locally.\nThe exact parameters will depend on the machine you are running `Fuzz4All` on.\n\n\u003e [!Note]\n\u003e Currently `Fuzz4All` only supports the starcoderbase and starcoderbase-1b models. However, one can easily modify \n\u003e the source code to include and use other models. See `model.py` for more detail.\n\nTo use the autoprompting mechanism of `Fuzz4All` via GPT-4, please also export your OpenAI API key:\n\n```\nexport OPENAI_API_KEY={key_here}\n```\n\n### Fuzzing\n\nNow you are ready to run `Fuzz4All` on all targets (with arbitrary inputs through autoprompting)!\n\n`Fuzz4All` is easily configured through config files. The ones used for our experiments are stored in `configs/`. \nThe config file controls various aspects of `Fuzz4All`, including the fuzzing language, time, autoprompting strategy, etc.\nPlease see any example config file in `configs/` for more detail. 
\n\nIn general, you can run `Fuzz4All` with the following command:\n\n```bash\npython Fuzz4All/fuzz.py --config {config_file.yaml} main_with_config \\ \n                        --folder outputs/fuzzing_outputs \\\n                        --batch_size {batch_size} \\\n                        --model_name {model_name} \\\n                        --target {target_name}\n```\n\nwhere `{config_file.yaml}` is the config file you want to use, `{batch_size}` is the batch size you want to use, \n`{model_name}` is the model name you want to use, and `{target_name}` is the target binary you want to fuzz.\n\n\u003e [!Note]\n\u003e you will neede to build/download your own binary ({target_name}) for fuzzing\n\nFor targeted fuzzing (i.e., fuzzing a specific API or library of a language), you can modify the config file to point to the \nspecific API/library documentation you want the model to generate prompts for. Please see `configs/targeted` for examples of such configs.\n\n\u003cdetails\u003e\u003csummary\u003eYou should see similar outputs to the following: \u003c/summary\u003e \n\n```\nBATCH_SIZE: 30\nMODEL_NAME: bigcode/starcoderbase\nDEVICE: gpu\n...\n=== Target Config ===\nlanguage: smt2\nfolder: outputs/full_run/cvc5/\n...\n====================\n[INFO] Initializing ... this may take a while ...\n[INFO] Loading model ...\n=== Model Config ===\nmodel_name: bigcode/starcoderbase\n...\n====================\n[INFO] Model Loaded\n[INFO] Use auto-prompting prompt ...\nGenerating prompts... ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 0:07:30\n[INFO] Done\n (resuming from 0)\n[VERBOSE] ; SMT2 is an input language commonly used by SMT solvers, with its syntax based on S-expressions. The multi-sorted logic accommodates a simple type system to confirm that terms from contrasting sorts\naren't the equal. Uninterpreted functions can be declared, with the function symbol being an uninterpreted one. 
SMT2 supports various theories, including integer and real arithmetic, with basic logical\nconnectives, quantifiers, and attribute annotations. An SMT2 theory includes sort and function symbol declarations and assertions of facts about them. Terms can be checked against these theories to determine their\nvalidity, with successful queries returning \"unsat\".\n; Please create a short program which uses complex SMT2 logic for an SMT solver\n(set-logic ALL)\n...\n(set-logic ALL)\n(assert (forall ((n Int)) (=\u003e (\u003e n 0) (= n (* 2 n)))))\n(check-sat)\n(exit)\n; Please create a short program which uses complex SMT2 logic for an SMT solver\n(set-logic ALL)\n\nFuzzing •   0% ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━     30/100000 • 0:02:26\n```\n\u003c/details\u003e\n\nAfter fuzzing, you can find the generated fuzzing programs in `outputs/full_run/{target}/`. \n\n\u003cdetails\u003e\n\u003csummary\u003eHere is the structure of the output directory: \u003c/summary\u003e\n\n```\n- outputs/full_run/{target}/\n    - prompts \n        - best_prompt.txt: the best prompt found by `Fuzz4All` for the target.\n        - greedy_prompt.txt\n        - prompt_0.txt\n        - prompt_1.txt\n        - prompt_2.txt\n        - scores.txt: keeps track of the score of each prompt (used to select the best prompt).\n    - 0.fuzz\n    - 1.fuzz\n    ...\n    - log.txt\n    - log_generation.txt\n    - log_validation.txt\n```\n\u003c/details\u003e\n\nMost notably, we log the generation and validation processes in `log_generation.txt` and `log_validation.txt`, respectively. Furthermore, `log.txt` provides an overview of the fuzzing process (including any potential bugs found by `Fuzz4All`).\n\nPotential bugs will look like this in `log.txt`:\n\n```\n[VERBOSE] 2345.fuzz has potential error! 
# this indicates that file 2345.fuzz may have a potential bug\n```\n\n## ⚙️ Artifact\n\nPlease see [`README_artifact.md`](https://github.com/fuzz4all/fuzz4all/blob/master/README_artifact.md) and the [Zenodo link](https://zenodo.org/records/10456883) for a more detailed explanation of Fuzz4All, \nas well as how to reproduce the complete results from our paper.\n\n## 🐛 Bugs Found\n\nWe have included a complete list of bugs found by `Fuzz4All` under the `bugs/` folder.\n\n## 📝 Citation\n\n```bibtex\n@inproceedings{fuzz4all,\n  title = {Fuzz4All: Universal Fuzzing with Large Language Models},\n  author = {Xia, Chunqiu Steven and Paltenghi, Matteo and Tian, Jia Le and Pradel, Michael and Zhang, Lingming},\n  booktitle = {Proceedings of the 46th International Conference on Software Engineering},\n  series = {ICSE '24},\n  year = {2024},\n}\n```\n\n","funding_links":[],"categories":["Python"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffuzz4all%2Ffuzz4all","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffuzz4all%2Ffuzz4all","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffuzz4all%2Ffuzz4all/lists"}