Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/gamozolabs/fzero_fuzzer

A fast Rust-based safe and thead-friendly grammar-based fuzz generator
https://github.com/gamozolabs/fzero_fuzzer

Last synced: 2 months ago
JSON representation

A fast Rust-based safe and thead-friendly grammar-based fuzz generator

Awesome Lists containing this project

README

        

# Intro

`fzero` is a grammar-based fuzzer that generates a Rust application inspired
by the paper "Building Fast Fuzzers" by Rahul Gopinath and Andreas Zeller.
https://arxiv.org/pdf/1911.07707.pdf

You can find the F1 fuzzer here:

https://github.com/vrthra/F1

# Usage

Currently this only generates an application that does benchmarking, but with
some quick hacks you could easily get the input out and feed it to an
application.

## Example usage

```
D:\dev\fzero_fuzz>cargo run --release html.json test.rs test.exe 8
Finished release [optimized] target(s) in 0.02s
Running `target\release\fzero.exe html.json test.rs test.exe 8`
Loaded grammar json
Converted grammar to binary format
Optimized grammar
Generated Rust source file
Created Rust binary!

D:\dev\fzero_fuzz>test.exe
MiB/sec: 1773.3719
MiB/sec: 1763.8357
MiB/sec: 1756.8917
MiB/sec: 1757.1934
MiB/sec: 1758.9417
MiB/sec: 1758.9122
MiB/sec: 1758.7352
```

# Concept

This program takes in an input grammar specified by a JSON file. This JSON
grammar representation is converted to a binary-style grammar that is intended
for interpretation and optimization. A Rust application (source file) is
produced by the shape of the input grammar. This then is compiled using `rustc`
to an application for the local machine.

This doesn't have any constraints on the random number generation as it uses an
infinite supply of random numbers. There is no limitation on the output size
and the buffer will dynamically grow as the input is created.

# Benchmarks

All tests on a single core of a `Intel(R) Core(TM) i7-8700 CPU @ 3.20GHz` with a turbo clock rate of 4.3 GHz

All numbers in `MiB/second`.

| Benchmark | fzero fuzzer | F1 fuzzer | Speedup |
|--------------------|--------------|-----------|---------|
| html.json depth=4 | 5330 | 1295 | 4.11x |
| html.json depth=8 | 1760 | 348 | 5.05x |
| html.json depth=16 | 338 | 195 | 1.73x |
| html.json depth=32 | 218 | 175 | 1.25x |
| html.json depth=64 | 201 | 175 | 1.14x |
| json.json depth=4 | 97 | 97 | 1.00x |
| json.json depth=8 | 79 | 93 | 0.84x |
| json.json depth=16 | 83 | 89 | 0.93x |
| json.json depth=32 | 85 | 88 | 0.97x |
| json.json depth=64 | 85 | 90 | 0.94x |

# Unsafe code

This project uses a small amount of `unsafe` code to provide the same semantics
of `extend_from_slice` but in a much faster way (over 4x faster). Not quite
sure why it's much faster, but if you are uncomfortable with `unsafe` code,
feel free to set `SAFE_ONLY` to `true` at the top of `src/lib.rs`. This will
restrict this fuzzer to only generate safe code. I don't think this is
necessary but who knows :)

# Performance

The performance of this tool is separated into multiple categories. One is the
code generation side, how long it takes for the JSON to be compiled into a Rust
application. The other is the code execution speeds, which is how fast the
produced application can generate inputs.

## Code Generation

Code generation vastly outperforms the "Building Fast Fuzzers" paper. For
example when generating the code based on the `html.json` grammar, the F1
fuzzer took over 25 minutes to produce the code. This fuzzer is capable of
producing a Rust application in under 10 seconds.

## Code execution

This project is on some performance metrics about 20-30% slower than the F1
fuzzer, but these scenarios are rare. However, in most situations we've been
about to out-perform F1 by about 30-50%, and in extreme cases (html.json
depth=8) we've observed over a 4x speedup.

# Differences from the F1 fuzzer

The F1 fuzzer mentions a technique that will resolve to the nearest terminal
tokens when stack depth is exceeded. We haven't implemented this technique but
I don't think it's a huge impact on the generated inputs. This is something I
will look into in the future.

Due to not using globals this can easily be scaled out to multiple threads as
all random state and input generation are done in a structure.

There is no use of assembly in this project, and thus it can produce
highly-performant fuzzers for any architecture or environment that Rust can
compile against (pretty much identical to LLVM's target list).