https://github.com/henryhchchc/lsp-fuzz
A grey-box hybrid fuzzer that generates test cases for language servers
fuzzing language-server-protocol libafl lsp
- Host: GitHub
- URL: https://github.com/henryhchchc/lsp-fuzz
- Owner: henryhchchc
- License: mit
- Created: 2024-10-07T04:14:39.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-10-07T05:58:39.000Z (8 months ago)
- Last Synced: 2025-10-07T07:25:50.020Z (8 months ago)
- Topics: fuzzing, language-server-protocol, libafl, lsp
- Language: Rust
- Homepage: https://aka.henryhc.net/lspfuzz
- Size: 3.28 MB
- Stars: 4
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# LSPFuzz: Hunting Bugs in Language Servers
LSPFuzz is a grey-box hybrid fuzzer that generates test cases for [Language Servers](https://microsoft.github.io/language-server-protocol/).
It is implemented based on [LibAFL](https://github.com/AFLplusplus/LibAFL).
## Technical Details
LSPFuzz is equipped with a two-stage mutation pipeline that produces valid yet diverse inputs to trigger various analysis routines in LSP servers.
To learn more about how it works, please check out the following research paper:
Hengcheng Zhu, Songqiang Chen, Valerio Terragni, Lili Wei, Jiarong Wu, Yepang Liu, and Shing-Chi Cheung.
**LSPFuzz: Hunting Bugs in Language Servers.**
In _Proceedings of the 40th IEEE/ACM International Conference on Automated Software Engineering._ Seoul, South Korea. November 2025.
[Conference](https://conf.researchr.org/details/ase-2025/ase-2025-papers/203/LSPFuzz-Hunting-Bugs-in-Language-Servers)
| [Preprint](https://scholar.henryhc.net/files/publications/2025/ASE2025-LSPFuzz.pdf)
| [Artifacts](https://doi.org/10.5281/zenodo.17052142)
If you use LSPFuzz for academic purposes, please cite the above paper.
## Usage
### Preparation
1. Prepare a fuzz target compatible with [AFL++](https://github.com/AFLplusplus/AFLplusplus).
It is highly recommended to use the [LTO mode](https://github.com/AFLplusplus/AFLplusplus/blob/stable/instrumentation/README.lto.md) and [persistent mode](https://github.com/AFLplusplus/AFLplusplus/blob/stable/instrumentation/README.persistent_mode.md).
The following is an annotated template for a fuzz target.
```c++
#include "your_header_file.h"

#include <cstdint>      // uint8_t
#include <sys/types.h>  // ssize_t
#include <unistd.h>     // read(), sync()

#ifndef __AFL_FUZZ_TESTCASE_LEN
// The following definitions allow compilation without the AFL++ compiler.
ssize_t fuzz_len;
#define __AFL_FUZZ_TESTCASE_LEN fuzz_len
uint8_t fuzz_buf[1024000];  // Must be writable: read() fills it in the fallback loop.
#define __AFL_FUZZ_TESTCASE_BUF fuzz_buf
#define __AFL_FUZZ_INIT() void sync(void);
#define __AFL_LOOP(x) ((fuzz_len = read(0, fuzz_buf, sizeof(fuzz_buf))) > 0 ? 1 : 0)
#define __AFL_INIT() sync()
#endif

__AFL_FUZZ_INIT();

int main(int argc, const char* argv[]) {
#ifdef __AFL_HAVE_MANUAL_CONTROL
    __AFL_INIT();
#endif
    // [Initialization]
    // Perform some one-time initialization for the target LSP server.
    // Or call `LLVMFuzzerInitialize(argc, argv)` here.
    const uint8_t *buf = __AFL_FUZZ_TESTCASE_BUF;
    while (__AFL_LOOP(10000)) {
        ssize_t len = __AFL_FUZZ_TESTCASE_LEN;
        // [Input Processing]
        // Process an input here:
        // 1. Read `len` bytes from `buf` for LSP inputs, as if they were read from `stdin`.
        // 2. Process the LSP inputs. Note that the input contains the `Content-Length` headers.
        // 3. Release resources and reset states.
        // Or call `LLVMFuzzerTestOneInput(buf, len)` here.
    }
    return 0;
}
```
> [!NOTE]
> Persistent mode can significantly improve fuzzing efficiency, but you must make sure that resources are properly released and state is reset on every iteration of the fuzzing loop.
2. Obtain the coverage map size.
```bash
AFL_DUMP_MAP_SIZE=1 ./fuzz-target
```
3. Mine code fragments for code generation.
```bash
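# Placeholder variables (set them for your setup):
#   SEARCH_DIR   a directory containing code files of the target language of the LSP servers
#   OUTPUT_FILE  the file to store the mined code fragments
lsp-fuzz-cli mine-code-fragments \
    --search-directory "$SEARCH_DIR" \
    --output "$OUTPUT_FILE"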
```
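The `[Input Processing]` step in the fuzz target template above notes that each input contains `Content-Length` headers and that state must be reset after every input. The following is a minimal sketch of how a target might split one input buffer into message bodies, assuming the standard LSP base-protocol framing (`Content-Length: N`, a blank line, then `N` bytes of body). The names `split_lsp_messages`, `ServerSession`, and `process_one_input` are hypothetical and not part of LSPFuzz or AFL++. The sketch also simplifies by matching the header name case-sensitively and ignoring any other header fields.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical helper: splits a buffer of framed LSP messages into bodies.
// Parsing stops at the first malformed or truncated frame.
std::vector<std::string> split_lsp_messages(const uint8_t *buf, size_t len) {
    std::string_view data(reinterpret_cast<const char *>(buf), len);
    constexpr std::string_view kField = "Content-Length:";
    constexpr std::string_view kHeaderEnd = "\r\n\r\n";
    std::vector<std::string> bodies;

    while (!data.empty()) {
        // The header section ends at the first blank line.
        size_t header_end = data.find(kHeaderEnd);
        if (header_end == std::string_view::npos) break;
        std::string_view headers = data.substr(0, header_end);

        size_t pos = headers.find(kField);
        if (pos == std::string_view::npos) break;
        pos += kField.size();
        while (pos < headers.size() && headers[pos] == ' ') ++pos;

        // Parse the decimal length by hand so fuzzed input cannot throw.
        size_t body_len = 0;
        bool has_digit = false;
        while (pos < headers.size() && headers[pos] >= '0' && headers[pos] <= '9') {
            body_len = body_len * 10 + static_cast<size_t>(headers[pos] - '0');
            has_digit = true;
            ++pos;
            if (body_len > data.size()) break;  // Already longer than the input.
        }
        if (!has_digit) break;

        size_t body_start = header_end + kHeaderEnd.size();
        if (body_len > data.size() - body_start) break;  // Truncated body.
        bodies.emplace_back(data.substr(body_start, body_len));
        data.remove_prefix(body_start + body_len);
    }
    return bodies;
}

// Hypothetical per-input state. Creating it inside the function means it is
// destroyed, and its resources released, before the next input is processed.
struct ServerSession {
    std::vector<std::string> handled;
};

// Processes one fuzz input and returns the number of messages handled.
size_t process_one_input(const uint8_t *buf, size_t len) {
    ServerSession session;  // Fresh state for every input.
    for (std::string &body : split_lsp_messages(buf, len)) {
        session.handled.push_back(std::move(body));  // Stand-in for real dispatch.
    }
    return session.handled.size();
}
```

In the template, a call such as `process_one_input(buf, len)` would sit inside the `__AFL_LOOP` body. A real target would dispatch each body to the server's message handler instead of storing it.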
### Start Fuzzing
```bash
# Placeholder variables (set them for your setup):
#   STATE_DIR       directory to store the fuzzing state (e.g., generated inputs, found crashes)
#   LSP_EXECUTABLE  executable file of the LSP server to fuzz (the fuzz target)
#   MAP_SIZE        size of the coverage map to use for coverage-guided fuzzing
# --language-fragments: comma-separated Language=file pairs naming the mined code fragments
# --time-budget: the time budget for fuzzing, in hours
lsp-fuzz-cli fuzz \
    --state "$STATE_DIR" \
    --lsp-executable "$LSP_EXECUTABLE" \
    --language-fragments C=c.frag,CPlusPlus=cpp.frag \
    --coverage-map-size "$MAP_SIZE" \
    --time-budget 24
```
To learn more about the options, run `lsp-fuzz-cli fuzz --help`.
### Reproduce Detected Crashes
1. Export the generated crash-inducing inputs.
```bash
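# Placeholder variables (set them for your setup):
#   SOLUTIONS_DIR  the `solutions` directory containing the generated crash-inducing inputs
#   EXPORT_DIR     the directory to store the exported crash-inducing inputs
lsp-fuzz-cli export \
    --input "$SOLUTIONS_DIR" \
    --output "$EXPORT_DIR"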
```
The contents of the output directory will be organized as follows:
```
├──
│   ├── workspace
│   │   ├── file1.txt
│   │   └── file2.txt
│   └── requests
│       ├── message_0001
│       └── message_0002
├──
│   ├── workspace
│   │   ├── file1.txt
│   │   └── file2.txt
│   └── requests
│       ├── message_0001
│       └── message_0002
└── ...
```
Each top-level directory represents a unique input generated by LSPFuzz.
Within each of these directories, there are two subdirectories: `workspace` and `requests`.
The `workspace` directory contains the code files, and the `requests` directory contains the LSP requests that were sent to the LSP server during the fuzzing process.
> [!NOTE]
> Do not move the exported test cases. The LSP requests are encoded with _absolute paths_, so moving them will invalidate the requests.
2. Feed the exported input to the LSP server
To reproduce a crash, `cd` into the directory of one exported input (the one that contains `requests`).
```bash
cat requests/* | ./target-lsp-server
```
Note that `target-lsp-server` is the actual LSP server under test, not the fuzz target.
Make sure it reads requests from `stdin`.
To reproduce bugs caught by sanitizers, `target-lsp-server` should be compiled with sanitizers enabled.
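Because the exported requests embed absolute paths, it can help to check that the referenced files still exist before replaying an input. The sketch below assumes that the requests refer to workspace files through `file://` URIs inside JSON strings, as is conventional in LSP. This is an assumption about the exported messages, not something the tool documents. Both function names are hypothetical, and percent-encoded characters in URIs are not decoded.

```cpp
#include <cstddef>
#include <filesystem>
#include <string>
#include <string_view>
#include <system_error>
#include <vector>

// Hypothetical helper: collects the path part of every `file://` URI found in
// a request. A URI is assumed to end at the closing double quote of its JSON string.
std::vector<std::string> extract_file_uri_paths(std::string_view message) {
    constexpr std::string_view kScheme = "file://";
    std::vector<std::string> paths;
    size_t pos = 0;
    while ((pos = message.find(kScheme, pos)) != std::string_view::npos) {
        size_t start = pos + kScheme.size();
        size_t end = message.find('"', start);
        if (end == std::string_view::npos) end = message.size();
        paths.emplace_back(message.substr(start, end - start));
        pos = end;
    }
    return paths;
}

// Hypothetical helper: returns the referenced paths that do not exist on disk,
// which suggests that the exported test case has been moved.
std::vector<std::string> missing_paths(std::string_view message) {
    std::vector<std::string> missing;
    for (const std::string &path : extract_file_uri_paths(message)) {
        std::error_code ec;  // Use the non-throwing overload of exists().
        if (!std::filesystem::exists(path, ec)) missing.push_back(path);
    }
    return missing;
}
```

Running `missing_paths` over the contents of each file in `requests` and getting an empty result for all of them is a reasonable sign that the test case is still at its exported location.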