https://github.com/ise-uiuc/kernelgpt

KernelGPT: Enhanced Kernel Fuzzing via Large Language Models (ASPLOS 2025)

Host: GitHub
URL: https://github.com/ise-uiuc/kernelgpt
Owner: ise-uiuc
Created: 2024-03-09T15:01:28.000Z (about 1 year ago)
Default Branch: main
Last Pushed: 2025-02-07T21:41:14.000Z (4 months ago)
Last Synced: 2025-03-31T11:01:38.554Z (2 months ago)
Topics: linux, syzkaller, testing
Language: C++
Homepage:
Size: 683 KB
Stars: 88
Watchers: 5
Forks: 14
Open Issues: 3
Metadata Files:
- Readme: README.md


# KernelGPT: Enhanced Kernel Fuzzing via Large Language Models

> [!IMPORTANT]
> We are continuing to improve the documentation and add more implementation details. See [README-DEV.md](README-DEV.md) for more information.

**Contact:** [Chenyuan Yang](https://yangchenyuan.github.io/), [Zijie Zhao](https://zijie.cs.illinois.edu/), [Lingming Zhang](https://lingming.cs.illinois.edu).

## About

* **KernelGPT** is a novel approach to automatically inferring Syzkaller specifications via Large Language Models (LLMs) for enhanced kernel fuzzing.
* KernelGPT leverages an iterative approach to automatically infer all the necessary specification components, and further leverages the validation feedback to repair/refine the initial specifications.
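
The validate-and-repair idea in the second bullet can be pictured as a small loop. This is only an illustrative sketch, not KernelGPT's actual implementation: `infer_spec` and its `generate`, `validate`, and `repair` callables are hypothetical names introduced here, and the round limit is an assumption.

```python
def infer_spec(generate, validate, repair, max_rounds=3):
    """Generate a spec, then repair it using validation feedback.

    All three callables are hypothetical stand-ins:
      generate() -> spec
      validate(spec) -> list of error messages (empty when the spec is valid)
      repair(spec, errors) -> new spec
    Returns (spec, is_valid).
    """
    spec = generate()
    for _ in range(max_rounds):
        errors = validate(spec)
        if not errors:
            return spec, True
        # Feed the validation errors back so the next attempt can fix them.
        spec = repair(spec, errors)
    # Out of rounds: report whether the last repair happened to succeed.
    return spec, not validate(spec)
```

The loop stops as soon as validation reports no errors, so a specification that is already correct costs one validation call.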

> [!IMPORTANT]
> * KernelGPT has detected **24** new bugs 🐛 in the Linux kernel; **11 have been assigned CVEs**❗ and 12 have been fixed.
> * A number of specifications generated by KernelGPT have already been merged into Syzkaller.

## 🔨 Installation

To install the required packages, run the following command:

```bash
pip install -r requirements.txt
```

### Linux & Syzkaller
You need to clone the Linux and Syzkaller repositories to run the code. You can do this by initializing the submodules:

```bash
git submodule update --init --recursive
```

Please refer to the [Syzkaller documentation](https://github.com/google/syzkaller/blob/master/docs/linux/setup.md) for setup instructions.

### Image

```bash
cd image && bash create-image.sh
```

## 🔍 Usage

### Parsing

You first need to compile the kernel with Clang and record the compile commands (the `bear` invocation below writes them to `compile_commands.json`). To do this, run the following commands:

```bash
cd linux
make CC=clang HOSTCC=clang allyesconfig
bear -- make CC=clang HOSTCC=clang -j$(nproc)
```
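
Before moving on, it can help to confirm that the recorded compilation database is usable. The sketch below assumes the standard Clang JSON compilation database layout (a JSON array whose entries have a `file` key and either `command` or `arguments`); `summarize_compile_db` is a hypothetical helper, not part of KernelGPT.

```python
import json
from pathlib import Path


def summarize_compile_db(path):
    """Return (number of entries, number of .c files) for a compilation database."""
    entries = json.loads(Path(path).read_text())
    if not isinstance(entries, list):
        raise ValueError("compile_commands.json should contain a JSON array")
    for entry in entries:
        # Each entry names a source file and carries either a "command"
        # string or an "arguments" list.
        if "file" not in entry or not ("command" in entry or "arguments" in entry):
            raise ValueError(f"malformed entry: {entry!r}")
    c_files = sum(1 for entry in entries if entry["file"].endswith(".c"))
    return len(entries), c_files
```

For example, `summarize_compile_db("linux/compile_commands.json")` gives a quick sanity check that the build was actually traced; an empty database usually means `bear` wrapped a build that had nothing left to compile.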

To build the analyzer that parses the Linux repository, run the following commands:

```bash
cd spec-gen/analyzer
make all
```

This creates two executables, `analyze` and `usage`, in the `spec-gen/analyzer` directory.

⚠️ Possible issues
Compiling the `analyze` and `usage` executables requires `clang` and `libclang-dev`, specifically Clang version 14. You can install them by running the following command:

```bash
sudo apt-get install clang-14 libclang-dev
```

Please refer to the [analyzer README](spec-gen/analyzer/README.md) for more information.

Next, run the `analyze` executable on the recorded compile commands:

```bash
./analyze -p /path/to/linux/compile_commands.json
```

Then run the `process_output.py` script:

```bash
python process_output.py --linux-path /path/to/linux
```

Then collect the usage information:

```bash
./usage -p /path/to/linux/compile_commands.json
```

Finally, run the `process_output.py` script again, this time with the `--usage` flag:

```bash
python process_output.py --linux-path /path/to/linux --usage
```
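
The four steps above depend on each other, so the order matters. As a rough sketch, they could be chained from Python as follows. `analysis_commands` and `run_all` are hypothetical helpers, and running everything from `spec-gen/analyzer` is an assumption based on where the executables are built.

```python
import subprocess


def analysis_commands(linux_path):
    """Return the four analysis steps, in order, as argv lists."""
    compile_db = f"{linux_path}/compile_commands.json"
    return [
        ["./analyze", "-p", compile_db],
        ["python", "process_output.py", "--linux-path", linux_path],
        ["./usage", "-p", compile_db],
        ["python", "process_output.py", "--linux-path", linux_path, "--usage"],
    ]


def run_all(linux_path, cwd="spec-gen/analyzer"):
    """Run each step in order; check=True stops at the first failing step."""
    for argv in analysis_commands(linux_path):
        subprocess.run(argv, cwd=cwd, check=True)
```

Stopping at the first failure avoids post-processing stale output from an earlier run.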

After that, you will get the following files under the `spec-gen/analyzer` directory:
```
processed_enum.json
processed_enum-typedef.json
processed_func.json
processed_handlers.debug.json
processed_handlers.json
processed_ioctl_filtered.json
processed_ioctl.json
processed_struct.json
processed_struct-typedef.json
processed_usage.json
```
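
A quick way to confirm that the analysis finished is to check for each of these files. The sketch below uses only the file names listed above; `missing_outputs` is a hypothetical helper.

```python
from pathlib import Path

# File names taken from the list above.
EXPECTED_OUTPUTS = [
    "processed_enum.json",
    "processed_enum-typedef.json",
    "processed_func.json",
    "processed_handlers.debug.json",
    "processed_handlers.json",
    "processed_ioctl_filtered.json",
    "processed_ioctl.json",
    "processed_struct.json",
    "processed_struct-typedef.json",
    "processed_usage.json",
]


def missing_outputs(analyzer_dir="spec-gen/analyzer"):
    """Return the expected output files that are not present in analyzer_dir."""
    base = Path(analyzer_dir)
    return [name for name in EXPECTED_OUTPUTS if not (base / name).is_file()]
```

An empty list means all ten files are present. If only `processed_usage.json` is missing, the `usage` step or the second `process_output.py` run was probably skipped.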

### Specification Generation

To generate the specification, first put your OpenAI API key in the `openai_key` file under the `spec-gen` directory. Then run the following command from the `spec-gen` directory:

```bash
python gen_spec.py -d analyzer/processed_handlers.json -o spec-output -n 1
```

This will generate one specification file in the `spec-output` directory.
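
A missing or empty key file is an easy mistake to make, so a pre-flight check can save a failed run. This sketch assumes the key is stored as plain text in `spec-gen/openai_key`; how `gen_spec.py` itself reads the file is not documented here, and `read_api_key` is a hypothetical helper.

```python
from pathlib import Path


def read_api_key(spec_gen_dir="spec-gen"):
    """Return the API key stored in <spec_gen_dir>/openai_key, or raise."""
    key_file = Path(spec_gen_dir) / "openai_key"
    if not key_file.is_file():
        raise FileNotFoundError(f"expected an API key file at {key_file}")
    # Strip the trailing newline that most editors add.
    key = key_file.read_text().strip()
    if not key:
        raise ValueError(f"{key_file} is empty")
    return key
```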

Then you can validate and repair the specification by running the following command:

```bash
python eval_spec.py -u -s spec-output/_generated --output-name debug -o eval-output
```

This validates the specification and writes the repaired specification to the `eval-output` directory.
Internally, it invokes `spec-eval/run-specs.py`.

### Reuse the Generated Specifications

If you want to reuse our generated specifications for drivers (or sockets), you can use `eval_spec.py`:

```bash
# Under the directory `spec-gen`
python eval_spec.py -u -s ../generated-specs/specs-6.7/correct-driver-spec --output-name debug -o eval-output --merge
```
This command translates all specifications written in JSON into the Syzkaller format and then runs Syzkaller.
The log for this process is written to `spec-eval/debug/merged.log`.

All the textual specifications are then placed under the `spec-eval/debug/default-tmp/syzkaller/sys/linux` directory, with `gpt4_` as the prefix.
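
To see which specifications were produced, you can list the files carrying that prefix. This is a minimal sketch that relies only on the directory and prefix stated above; `generated_spec_files` is a hypothetical helper, and the default path assumes you run it from the repository root.

```python
from pathlib import Path


def generated_spec_files(
    sys_linux_dir="spec-eval/debug/default-tmp/syzkaller/sys/linux",
):
    """Return the sorted names of files in sys_linux_dir that start with gpt4_."""
    return sorted(p.name for p in Path(sys_linux_dir).glob("gpt4_*"))
```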
