Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/ise-uiuc/kernelgpt

KernelGPT: Enhanced Kernel Fuzzing via Large Language Models
https://github.com/ise-uiuc/kernelgpt

linux syzkaller testing

Last synced: 4 days ago
JSON representation

KernelGPT: Enhanced Kernel Fuzzing via Large Language Models

Awesome Lists containing this project

README

        

# KernelGPT: Enhanced Kernel Fuzzing via Large Language Models



> [!IMPORTANT]
> We are keeping improving the documents and adding more implementation details. Please stay tuned at [README-DEV.md](README-DEV.md) for more information.

**Contact:** [Chenyuan Yang](https://yangchenyuan.github.io/), [Zijie Zhao](https://zijie.cs.illinois.edu/), [Lingming Zhang](https://lingming.cs.illinois.edu).

## About

* **KernelGPT** is a novel approach to automatically inferring Syzkaller specifications via Large Language Models (LLMs) for enhanced kernel fuzzing
* KernelGPT leverages an iterative approach to automatically infer all the necessary specification components, and further leverages the validation feedback to repair/refine the initial specifications.

> [!IMPORTANT]
> * KernelGPT has detected **24** new bugs 🐛 in the Linux kernel, with **11 assigned with CVEs**❗, and 12 of them are fixed.
> * A number of specifications generated by KernelGPT have already been merged into Syzkaller.

## 🔨 Installation

To install the required packages, run the following command:

```bash
pip install -r requirements.txt
```

### Linux & Syzkaller
You need to clone the linux and syzkaller repository to run the code. You can do this by running the following command:

```bash
git submodule update --init --recursive
```

Please refer to the [Sykaller documentation](https://github.com/google/syzkaller/blob/master/docs/linux/setup.md) for setup instructions.

### Image

```bash
cd image && bash create-image.sh
```

## 🔍 Usage

### Parsing

You need to first compile the kernel with Clang and trace the compile commands. To do this, run the following command:

```bash
cd linux
make CC=clang HOSTCC=clang allyesconfig
bear -- make CC=clang HOSTCC=clang -j$(nproc)
```

To parse the Linux repository, run the following command:

```bash
cd spec-gen/analyzer
make all
```

This will create one `analyze` and one `usage` executable in the `spec-gen/analyzer` directory.

⚠️ Possible issues
You need to install `clang` and `libclang-dev` to compile the `analyze` and `usage` executables. More specifically, we need the Clang with version 14. You can install it by running the following command:

```bash
sudo apt-get install clang-14 libclang-dev
```
Please refer to the [analyzer README](spec-gen/analyzer/README.md) for more information.

```bash
./analyze -p /path/to/linux/compile_commands.json
```

Run the `process_output.py` script

```bash
python process_output.py --linux-path /path/to/linux
```

Then collect the usage information

```bash
./usage -p /path/to/linux/compile_commands.json
```

And run the process_output.py script again

```bash
python process_output.py --linux-path /path/to/linux --usage
```

After that, you will get the following files under the `spec-gen/analyzer` directory:
```
processed_enum.json
processed_enum-typedef.json
processed_func.json
processed_handlers.debug.json
processed_handlers.json
processed_ioctl_filtered.json
processed_ioctl.json
processed_struct.json
processed_struct-typedef.json
processed_usage.json
```

### Specification Generation

To generate the specification, first put your OpenAI API key in the `openai_key` file under the `spec-gen` directory. Then run the following command:

```bash
python gen_spec.py -d analyzer/processed_handlers.json -o spec-output -n 1
```

This will generate one specification file in the `spec-output` directory.

Then you can validate and repair the specification by running the following command:

```bash
python eval_spec.py -u -s spec-output/_generated --output-name debug -o eval-output
```

This will validate the specification and generate the repaired specification in the `eval-output` directory.
It will invoke the `spec-eval/run-specs.py`.