https://github.com/NASP-THU/ProphetFuzz
[CCS'24] An LLM-based, fully automated fuzzing tool for option combination testing.
# ProphetFuzz
The implementation of the paper titled **"ProphetFuzz: Fully Automated Prediction and Fuzzing of High-Risk Option Combinations with Only Documentation via Large Language Model"**
ProphetFuzz is an LLM-based, fully automated fuzzing tool for option combination testing. It predicts high-risk option combinations and fuzzes them using only documentation, and the entire process runs without manual intervention.
For more details, please refer to [our paper](https://dl.acm.org/doi/10.1145/3658644.3690231) from ACM CCS'24.
Due to page limitations, the Appendix of the paper could not be included within the main text. Please refer to [Appendix](Appendix.md).
## Structure
```
.
├── Dockerfile
├── README.md
├── assets
│   ├── dataset
│   │   ├── groundtruth_for_20_programs.json
│   │   └── precision.json
│   └── images
├── fuzzing_handler
│   ├── cmd_fixer.py
│   ├── code_checker.py
│   ├── config.json
│   ├── run_cmin.py
│   ├── run_fuzzing.sh
│   └── utils
│       ├── analysis_util.py
│       ├── code_utils.py
│       └── execution_util.py
├── llm_interface
│   ├── assemble.py
│   ├── config
│   │   └── .env
│   ├── constraint.py
│   ├── few-shot
│   │   ├── manpage_htmldoc.json
│   │   ├── manpage_jbig2.json
│   │   ├── manpage_jhead.json
│   │   ├── manpage_makeswf.json
│   │   ├── manpage_mp4box.json
│   │   ├── manpage_opj_compress.json
│   │   ├── manpage_pdf2swf.json
│   │   └── manpage_yasm.json
│   ├── few-shot_generate.py
│   ├── input
│   ├── output
│   ├── predict.py
│   ├── restruct_manpage.py
│   └── utils
│       ├── gpt_utils.py
│       └── opt_utils.py
├── manpage_parser
│   ├── input
│   ├── output
│   ├── parser.py
│   └── utils
│       └── groff_utils.py
└── run_all_in_one.sh
```

1. manpage_parser: Scripts for parsing documentation.
2. llm_interface: Scripts for extracting constraints, predicting high-risk option combinations, and assembling commands.
3. fuzzing_handler: Scripts for preparing and conducting fuzzing.
4. assets/dataset: Dataset for evaluating the constraint extraction module.
5. run_all_in_one.sh: A single script that runs the entire pipeline.
6. Dockerfile: Builds our experiment environment (tested on Ubuntu 20.04).

The implementations of the various components of ProphetFuzz can be found in the following functions:

| Section | Component | File | Function |
|----|----|----|----|
| 3.2 | Constraint Extraction | [llm_interface/constraint.py](llm_interface/constraint.py) | extractRelationships |
| 3.2 | Self Check | [llm_interface/constraint.py](llm_interface/constraint.py) | checkRelationships |
| 3.3 | AutoCoT | [llm_interface/few-shot_generate.py](llm_interface/few-shot_generate.py) | generatePrompt |
| 3.3 | High-Risk Combination Prediction | [llm_interface/predict.py](llm_interface/predict.py)| predictCombinations |
| 3.4 | Command Assembly | [llm_interface/assemble.py](llm_interface/assemble.py) | generateCommands |
| 3.5 | File Generation | [fuzzing_handler/generate_combination.py](fuzzing_handler/generate_combination.py) | main |
| 3.5 | Corpus Minimization | [fuzzing_handler/run_cmin.py](fuzzing_handler/run_cmin.py) | runCMinCommands |
| 3.5 | Fuzzing | [fuzzing_handler/run_fuzzing.sh](fuzzing_handler/run_fuzzing.sh) | runFuzzing |

## Usage Example
1. **Using Docker to Configure the Running Environment**
- If you only want to complete the part that interacts with the LLM, you can directly use our pre-installed image (4GB):
```
docker run -it 4ugustus/prophetfuzz_base bash
```
- If you want to complete the entire process, including seed generation, command repair, and fuzzing, build the full image on top of the pre-installed image:
```
docker build -t prophetfuzz:latest .
docker run -it --privileged=true prophetfuzz bash
# 'privileged' is used for setting up the fuzzing environment
```
2. **Set Up Your API Key**:
Set your OpenAI API key in the `llm_interface/config/.env` file:
```bash
OPENAI_API_KEY="[Input Your API Key Here]"
```
3. **Run the Script**:
Execute the script to start the automated fuzzing process:
```bash
bash run_all_in_one.sh bison
```

**Note**: If you are not within our Docker environment, you might need to manually install dependencies and adjust the `fuzzing_handler/config.json` file to specify the path to the program under test.
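
Before launching the pipeline, it can help to confirm that the API key from step 2 is actually configured. The following is a minimal sketch and not part of ProphetFuzz: the helper names `parse_env` and `has_real_key` are hypothetical, and the parser assumes the `.env` file contains only simple `KEY="value"` lines like the one shown in step 2. It treats the template placeholder as "not configured".

```python
from pathlib import Path
from typing import Dict

# Path and variable name are taken from this README; adjust the path if you
# run the check from somewhere other than the repository root.
ENV_PATH = Path("llm_interface/config/.env")
PLACEHOLDER = "[Input Your API Key Here]"


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Remove optional surrounding single or double quotes.
        values[key.strip()] = value.strip().strip("\"'")
    return values


def has_real_key(env: Dict[str, str]) -> bool:
    """True if OPENAI_API_KEY is present, non-empty, and not the placeholder."""
    key = env.get("OPENAI_API_KEY", "")
    return bool(key) and key != PLACEHOLDER


if __name__ == "__main__":
    if not ENV_PATH.exists():
        print(f"{ENV_PATH} not found")
    elif has_real_key(parse_env(ENV_PATH.read_text())):
        print("OPENAI_API_KEY is set")
    else:
        print("OPENAI_API_KEY is missing or still the placeholder")
```

Running this from the repository root before `bash run_all_in_one.sh bison` gives a quick signal that the key was filled in. It only checks that a value is present, not that the key is valid for the OpenAI API.
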
If you prefer to start fuzzing manually, use the following command:
```bash
fuzzer/afl-fuzz -i fuzzing_handler/input/bison -o fuzzing_handler/output/bison_prophet_1 -m none -K fuzzing_handler/argvs/argvs_bison.txt -- path/to/bison/bin/bison @@
```

## CVEs Assigned ##
We employ ProphetFuzz to perform persistent fuzzing on the latest versions of the programs in our dataset. To date, ProphetFuzz has uncovered 140 zero-day or half-day vulnerabilities, 93 of which have been confirmed by the developers, earning 21 CVE numbers.
| CVE | Program | Type |
| -------------- | --------- | ------------------------ |
| CVE-2024-3248 | xpdf | stack-buffer-overflow |
| CVE-2024-4853 | editcap | heap-buffer-overflow |
| CVE-2024-4855 | editcap | bad free |
| CVE-2024-31744 | jasper | assertion failure |
| CVE-2024-31745 | dwarfdump | use-after-free |
| CVE-2024-31746 | objdump | heap-buffer-overflow |
| CVE-2024-32154 | ffmpeg | segmentation violation |
| CVE-2024-32157 | mupdf | segmentation violation |
| CVE-2024-32158 | mupdf | negative-size-param |
| CVE-2024-34960 | ffmpeg | floating point exception |
| CVE-2024-34961 | pspp | segmentation violation |
| CVE-2024-34962 | pspp | segmentation violation |
| CVE-2024-34963 | pspp | assertion failure |
| CVE-2024-34965 | pspp | assertion failure |
| CVE-2024-34966 | pspp | assertion failure |
| CVE-2024-34967 | pspp | assertion failure |
| CVE-2024-34968 | pspp | assertion failure |
| CVE-2024-34969 | pspp | segmentation violation |
| CVE-2024-34971 | pspp | segmentation violation |
| CVE-2024-34972 | pspp | assertion failure |
| CVE-2024-35316 | ffmpeg | segmentation violation |

## Credit ##
Thanks to Dawei Wang ([@4ugustus](https://github.com/waugustus)) and Geng Zhou ([@Arbusz](https://github.com/Arbusz)) for their valuable contributions to this project.
## Citing this paper ##
If you would like to cite ProphetFuzz, you may use the following BibTeX entry:
```
@inproceedings{wang2024prophet,
  title = {ProphetFuzz: Fully Automated Prediction and Fuzzing of High-Risk Option Combinations with Only Documentation via Large Language Model},
  author = {Wang, Dawei and Zhou, Geng and Chen, Li and Li, Dan and Miao, Yukai},
  booktitle = {Proceedings of the 2024 ACM SIGSAC Conference on Computer and Communications Security},
  publisher = {Association for Computing Machinery},
  address = {Salt Lake City, UT, USA},
  pages = {735--749},
  year = {2024}
}
```