https://github.com/Barro/java-afl

Binary rewriting approach with fork server support to fuzz Java applications with afl-fuzz.
Host: GitHub
URL: https://github.com/Barro/java-afl
Owner: Barro
License: apache-2.0
Created: 2018-03-10T23:53:42.000Z (over 7 years ago)
Default Branch: master
Last Pushed: 2018-05-03T22:24:27.000Z (over 7 years ago)
Last Synced: 2024-11-21T14:38:37.417Z (about 1 year ago)
Language: Java
Homepage:
Size: 173 KB
Stars: 88
Watchers: 4
Forks: 8
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: COPYING

This is a fork server based approach to fuzzing Java applications on the Java
virtual machine with
[american fuzzy lop](http://lcamtuf.coredump.cx/afl/). See the
[caveats section](#caveats) for the downsides of this approach.

## Usage

Fuzzing with american fuzzy lop works by instrumenting the compiled
Java bytecode with probabilistic instrumentation that reveals program
coverage. A program instrumented this way can be fuzzed with the
`afl-fuzz` command in several modes. The default
fork server mode does not need any modifications to the program source
code and works as is. The more efficient deferred fork
server and persistent modes let you skip some
initialization code and keep the JVM running for more than just one
input.
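
To illustrate what the probabilistic coverage instrumentation records, here
is a minimal sketch of the edge counting scheme described in american fuzzy
lop's technical documentation. It is not java-afl's actual code: the class
and method names are made up for illustration, and it assumes the default
map size of 2^16 entries.

```java
// Illustrative sketch only; these names are not part of java-afl.
final class CoverageMapSketch {
    // 2^16 entries, the default map size in afl-fuzz.
    static final int MAP_SIZE = 1 << 16;

    final byte[] map = new byte[MAP_SIZE];
    private int previousLocation = 0;

    // Called at every instrumented location with that location's random ID.
    void visit(int locationId) {
        int current = locationId & (MAP_SIZE - 1);
        // The map index identifies the edge previous -> current.
        map[current ^ previousLocation]++; // 8-bit hit counter, wraps around
        // Shifting makes A -> B and B -> A land in different entries.
        previousLocation = current >>> 1;
    }

    // Number of map entries that currently hold a non-zero hit count.
    int edgesSeen() {
        int count = 0;
        for (byte hits : map) {
            if (hits != 0) {
                count++;
            }
        }
        return count;
    }
}
```

Because location IDs are random and the map is small, two different edges
can collide in the same entry. That is why the coverage information is only
a probabilistic approximation of the execution path.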

### Ahead of time instrumentation

Ahead of time instrumentation works by instrumenting the specific .jar or
.class files that you want to run under `afl-fuzz`. This is done by
running the built `java-afl-instrument.jar`
on each jar or class file that you want to include in
your program. No source code modifications are necessary to get
started:

```bash
$ java -jar java-afl-instrument.jar instrumented/ ClassToTest.class
$ java -jar java-afl-instrument.jar instrumented/ jar-to-test.jar
```

As instrumentation injects native JNI code into the used files, you
can only run these files on systems that are similar enough to the one
that `java-afl-instrument.jar` was run on.

Then you are ready to fuzz your Java application with `afl-fuzz`. This
can be done with commands like these, using the provided
`java-afl-fuzz` wrapper script:

```bash
$ java-afl-fuzz -m 20000 -i in/ -o /dev/shm/fuzz-out/ -- java -cp instrumented/ ClassToTest
$ java-afl-fuzz -m 20000 -i in/ -o /dev/shm/fuzz-out/ -- java -jar instrumented/jar-to-test.jar
```

### Just in time instrumentation

Just in time instrumentation works by wrapping the main function of the
program that you want to run in a custom
[ClassLoader](https://docs.oracle.com/javase/7/docs/api/java/lang/ClassLoader.html)
that injects instrumentation into classes as they are loaded. This
way you get more thorough instrumentation than by just running ahead
of time instrumentation on your program, but at the same time the
instrumentation likely covers code that you are not interested in.

To use just in time instrumentation, add both `java-afl-run.jar`
and the target classes to CLASSPATH and run the `javafl.run`
class with the target class name as a parameter:

```bash
$ java-afl-fuzz -m 20000 -i in/ -o /dev/shm/fuzz-out/ \
      -- java -cp java-afl-run.jar:. javafl.run ClassToTest
$ java-afl-fuzz -m 20000 -i in/ -o /dev/shm/fuzz-out/ \
      -- java -cp java-afl-run.jar:jar-to-test.jar javafl.run ClassToTest
```

Notice that there is no need to first instrument the class files, as
it is done on the fly. This has the same platform specific limitations as
ahead of time instrumentation, as this instrumentation also injects native JNI
code into the used files. So you can only fuzz programs with
`java-afl-run.jar` on systems that are similar enough to the one that
`java-afl-run.jar` was built on.
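
The class loading idea can be sketched as follows. This is a hypothetical
illustration rather than java-afl's real loader:
`InstrumentingLoaderSketch`, its `prefix` filter and the `transform`
callback are assumptions, and the callback only stands in for the actual
bytecode rewriting.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Objects;
import java.util.function.UnaryOperator;

// Hypothetical sketch; not the loader that java-afl-run.jar ships.
final class InstrumentingLoaderSketch extends ClassLoader {
    private final String prefix;                     // only these classes get rewritten
    private final UnaryOperator<byte[]> transform;   // stand-in for the instrumenter

    InstrumentingLoaderSketch(ClassLoader parent, String prefix,
                              UnaryOperator<byte[]> transform) {
        super(Objects.requireNonNull(parent));
        this.prefix = prefix;
        this.transform = transform;
    }

    // Maps a binary class name to the resource that holds its bytecode.
    static String resourcePath(String className) {
        return className.replace('.', '/') + ".class";
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve)
            throws ClassNotFoundException {
        synchronized (getClassLoadingLock(name)) {
            Class<?> loaded = findLoadedClass(name);
            if (loaded == null && name.startsWith(prefix)) {
                // Define matching classes here instead of delegating first.
                loaded = defineTransformed(name);
            }
            if (loaded == null) {
                // Everything else takes the normal parent-first route.
                return super.loadClass(name, resolve);
            }
            if (resolve) {
                resolveClass(loaded);
            }
            return loaded;
        }
    }

    private Class<?> defineTransformed(String name) throws ClassNotFoundException {
        try (InputStream in = getParent().getResourceAsStream(resourcePath(name))) {
            if (in == null) {
                return null; // no bytecode found, fall back to delegation
            }
            ByteArrayOutputStream original = new ByteArrayOutputStream();
            byte[] chunk = new byte[4096];
            int n;
            while ((n = in.read(chunk)) != -1) {
                original.write(chunk, 0, n);
            }
            byte[] rewritten = transform.apply(original.toByteArray());
            return defineClass(name, rewritten, 0, rewritten.length);
        } catch (IOException e) {
            throw new ClassNotFoundException(name, e);
        }
    }
}
```

Passing `bytes -> bytes` as the transform loads the classes unchanged, which
is a handy way to check the loading path on its own. The prefix should name
only the packages you want instrumented, since core `java.*` classes can not
be defined by a custom class loader.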

### java-afl-fuzz parameters

The parameters to the `java-afl-fuzz` command have the following functions:

* `-i in/`: Input directory of initial data that then gets modified
  over the fuzzing process.
* `-o /dev/shm/fuzz-out/`: Output directory for fuzzing state
  data. This should always be on a shared memory drive and never in a
  directory pointing to a physical hard drive.
* `-m 20000`: Higher virtual memory limit that enables the JVM to run, as
  the default memory limit in `afl-fuzz` is 50 megabytes. The JVM can
  allocate around 10 gigabytes of virtual memory by default.

A more detailed description of the available options can be found in
[american fuzzy lop's README](http://lcamtuf.coredump.cx/afl/README.txt). You
may also want to set the maximum heap size with the
[`-Xmx`](https://docs.oracle.com/cd/E15523_01/web.1111/e13814/jvm_tuning.htm#PERFM164)
option to something smaller than the default if you fuzz with multiple JVM
instances on the same machine, to keep memory usage sane.

### Advanced usage

The more efficient deferred and persistent modes start each fuzzing
iteration later than at the beginning of the `main()` function. Using
deferred or persistent mode requires either a special annotation on
the `main()` function or the `--custom-init` flag to the instrument
program:

```java
public class ProgramCustom {
    @javafl.CustomInit
    public static void main(String[] args) {
        // ...
    }
}
```

Or you can instrument unmodified code, so that the annotation
does not need to be present on `main()`, by passing
`--custom-init` as the first parameter:

```bash
$ java -jar java-afl-instrument.jar --custom-init instrumented/ ClassToTest.class
$ java -jar java-afl-instrument.jar --custom-init instrumented/ jar-to-test.jar
```

To put the application into deferred mode, where all the initialization
code that comes before the `javafl.fuzz.init()` call is run only once,
structure the program in the following fashion:

```java
public class ProgramPersistent {
    @javafl.CustomInit
    public static void main(String[] args) throws java.io.IOException {
        // ...
        javafl.fuzz.init();
        // You need to read the actual input after initialization point.
        System.in.read(data_buffer);
        // ... do actual input processing ...
    }
}
```

To put the program into persistent mode, you need to wrap the part that
you want to execute in a `while (javafl.fuzz.loop())`
loop. If you read the input from `System.in`, you need to take care
to flush Java's buffering on it after you have read your data:

```java
public class ProgramPersistent {
    @javafl.CustomInit
    public static void main(String[] args) throws java.io.IOException {
        // ...
        byte[] data = new byte[128];
        int read = 128;
        while (javafl.fuzz.loop(100000)) {
            read = System.in.read(data, 0, data.length);
            // Throw away all buffering information from stdin for the
            // next iteration:
            System.in.skip(9999999);
            // ... do actual input processing ...
        }
        // ...
    }
}
```
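
`InputStream.read` may return fewer bytes than requested, so a harness that
wants as much of the input as fits into a fixed buffer has to call it in a
loop. A minimal sketch of such a helper, where `InputReaderSketch` and
`readInput` are hypothetical names and not part of java-afl:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical helper for fuzz harnesses; not part of java-afl.
final class InputReaderSketch {
    // Reads from the stream until the buffer is full or the stream ends.
    // Returns the number of bytes that were stored in the buffer.
    static int readInput(InputStream in, byte[] buffer) {
        int total = 0;
        try {
            while (total < buffer.length) {
                int n = in.read(buffer, total, buffer.length - total);
                if (n < 0) {
                    break; // end of input
                }
                total += n;
            }
        } catch (IOException e) {
            // Rethrown unchecked so that callers need no throws clause.
            throw new UncheckedIOException(e);
        }
        return total;
    }
}
```

Inside the loop body this could be called as
`int length = InputReaderSketch.readInput(System.in, data);`. Reusing one
preallocated buffer across iterations also keeps allocations low, which
matters because of the garbage collection caveat described in the
[caveats](#caveats) section.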

### Options controlling instrumentation

Command line switches to `java-afl-instrument.jar`:

* `--custom-init`: instrument the code for deferred or persistent mode
  without requiring the `@javafl.CustomInit` annotation on `main()`.
  See [advanced usage](#advanced-usage).
* `--deterministic`: by default java-afl produces randomized class files.
  Two differently instrumented builds of the same program can then
  probabilistically reach more coverage together than a single one
  would. This switch makes the instrumentation depend solely on the
  input data for each class, so different instrumentation runs always
  produce the same result. Just in time instrumentation
  is always deterministic.

Environment variables:

* `AFL_INST_RATIO`: by default 100% of the locations that alter program
  control flow are instrumented. This variable makes it possible to
  probabilistically select a smaller instrumentation ratio. Smaller
  instrumentation ratios are useful in big programs, where the resulting
  program execution path traces would otherwise fill the default 16
  bit (65536 entry) state map and increasing the map size would add an
  unneeded performance penalty.
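
How a ratio and a deterministic seed could interact is sketched below. This
is an assumption about the general technique, not java-afl's implementation:
all names are hypothetical, and seeding the random generator from a CRC32 of
the class bytes is just one way to make the choice depend solely on the
class data.

```java
import java.util.Random;
import java.util.zip.CRC32;

// Hypothetical sketch; java-afl's real selection logic may differ.
final class InstrumentationRatioSketch {
    // Parses an AFL_INST_RATIO style percentage; unset means 100.
    static int parseRatio(String value) {
        if (value == null || value.trim().isEmpty()) {
            return 100;
        }
        int ratio = Integer.parseInt(value.trim());
        if (ratio < 0 || ratio > 100) {
            throw new IllegalArgumentException("ratio must be between 0 and 100");
        }
        return ratio;
    }

    // Same class bytes give the same seed and therefore the same choices.
    static Random deterministicRandom(byte[] classBytes) {
        CRC32 crc = new CRC32();
        crc.update(classBytes, 0, classBytes.length);
        return new Random(crc.getValue());
    }

    // Decides for each candidate location whether it gets instrumented.
    static boolean[] selectLocations(int locationCount, int ratio, Random random) {
        boolean[] selected = new boolean[locationCount];
        for (int i = 0; i < locationCount; i++) {
            selected[i] = random.nextInt(100) < ratio;
        }
        return selected;
    }
}
```

A caller would typically start from
`InstrumentationRatioSketch.parseRatio(System.getenv("AFL_INST_RATIO"))`
and pass either a `new Random()` or the deterministic generator, depending
on which behavior is wanted.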

## Building

As there are tons of different tools to build Java programs with
automatic dependency fetching, java-afl supports more than one way to
build itself.

If you have american fuzzy lop's source code directory, the one that has
the `config.h` file in it, you can point the JNI compilation part at it
with the following C flags, where `-I` must be followed by the path to
that directory:

```
CFLAGS="-I -DHAVE_AFL_CONFIG_H"
```

This makes the compiled-in configuration match what afl-fuzz expects in
case afl-fuzz has been modified in any way. The build systems should also
try to deduce this during compilation from an existing `afl-showmap`
command, if there is one (TODO).

### Bazel

[Bazel](https://bazel.build/) is a build tool that can handle very large
programs with ease.

```bash
$ bazel build :java-afl-instrument_deploy.jar :java-afl-run_deploy.jar
# Stand-alone jars are under bazel-bin/ as java-afl-instrument_deploy.jar and java-afl-run_deploy.jar
```

### CMake

[CMake](https://cmake.org/) is the PHP of build systems. Widely
available and gets stuff done but becomes quite painful after a
while.

```bash
$ ( mkdir -p build-cmake && cd build-cmake && cmake .. -GNinja )
$ ninja -C build-cmake
# Stand-alone jars are under build-cmake/ as java-afl-instrument.jar and java-afl-run.jar
```

### Travis CI [![Build Status](https://travis-ci.org/Barro/java-afl.svg?branch=master)](https://travis-ci.org/Barro/java-afl)

Requires an Ubuntu 14.04 based system. You need to have
[ASM 6.1](http://asm.ow2.org/) as a build dependency in
addition to Java 8 and the afl build dependencies. Currently there is a
crude build script to build and test this implementation:

```bash
$ ./build.sh
```

Even though building requires Java 8, this should be able to
instrument programs that run only on some older versions of Java.

## Performance

Performance numbers on an Intel Core i7-3770K CPU @ 3.50GHz with OpenJDK
1.8.0_151 and afl 2.52b. These tests were done with the simple test
programs that are provided in the [test/](test/) directory.

* Fork server mode: around 750 executions/second for a program that
  does nothing, closer to 300 when there is actually something
  happening.
* Deferred mode: naturally lands somewhere between fork server mode
  and persistent mode. The gain depends on how heavy the initialization
  is, probably some tens of percent.
* Persistent mode: around 14000 executions/second. This highly depends on
  how much and how long the JVM is able to optimize before being
  killed. See the [caveats](#caveats) section about this. An empty while
  loop reaches around 31000 iterations/second, which is close to the
  maximum that native C code can handle with `afl-fuzz` in persistent
  mode.

## TODO

* Fix persistent mode loop dynamic instrumentation.
* Check if a dynamically instrumentable class is a file and load it or
  a full jar file instead.
* Support deferred init for arbitrary given method without source code
  modifications. Just prefer the loop syntax and non-forking mode
  instead of fork server one for more speed.
  * Remove the need for `@javafl.CustomInit`.
* Create a non-forking alternative mode.
* More ways to build this:
  * Ant
  * Maven
  * Gradle
* Alternative method implementations based on fuzzing mode (similar to
  C preprocessor's #ifdef/#ifndef). Probably somehow with annotations
  or `System.getProperty("FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION")`.
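
The last item is not implemented. As a rough idea of what a system property
based switch could look like in target code, here is a hypothetical sketch.
The class, its methods and the toy checksum are invented for illustration,
only the property name comes from the item above.

```java
// Hypothetical sketch of a fuzzing-only code path; nothing here exists in java-afl.
final class FuzzingModeSketch {
    static final String PROPERTY = "FUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION";

    // True when the JVM was started with -DFUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION=...
    static boolean fuzzingMode() {
        return System.getProperty(PROPERTY) != null;
    }

    // Toy 16-bit additive checksum over the payload.
    static int checksum(byte[] payload) {
        int sum = 0;
        for (byte b : payload) {
            sum = (sum + (b & 0xff)) & 0xffff;
        }
        return sum;
    }

    // Under fuzzing the check is skipped, so mutated inputs are not
    // rejected before they reach the code that is worth testing.
    static boolean checksumAccepted(byte[] payload, int expected) {
        if (fuzzingMode()) {
            return true;
        }
        return checksum(payload) == expected;
    }
}
```

The fuzzed JVM would then be started with something like
`java -DFUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION=1 ...`, while production
runs leave the property unset and keep the check.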

## Greetz

* Great thanks to [Michał Zalewski](http://lcamtuf.coredump.cx/) for
  [american fuzzy lop](http://lcamtuf.coredump.cx/afl), a crude but
  effective fuzzer. Especially the idea of using a bitmap and randomly
  generated program locations as a fast probabilistic memory bound
  approximation of the program execution path.
* Inspired by [python-afl](http://jwilk.net/software/python-afl) and
  [Kelinci](https://github.com/isstac/kelinci). Just in time
  instrumentation idea from [JQF](https://github.com/rohanpadhye/jqf).

## Dependencies

Mandatory dependencies to build this:

* GNU/Linux system with reasonably recent basic utilities.
* C compiler.
* Java 1.8 or newer to build and to instrument classes with afl-fuzz
  compatible instrumentation. The runtime Java version for instrumented
  classes can be anything that the original classes worked with.
* [ASM 6.1](http://asm.ow2.org/)

Optional dependencies for building include one of these:

* Bazel
* CMake

## Caveats

The Java virtual machine is a multi-threaded application and the
[fork()](http://man7.org/linux/man-pages/man2/fork.2.html) call only
preserves the thread that called it. This creates some issues from a
performance and stability point of view:

* There is no garbage collector running in the forked
  process (at least with OpenJDK). Therefore there is a limit on the
  objects that the fuzz target can allocate during its lifetime. This
  shouldn't be an issue with generally lightweight fuzz targets that
  can execute hundreds of times per second, but can become an issue
  with heavier ones.
* Performance will suffer, as the JVM will not be able to use knowledge
  about hotspots in often executed functions.
* Persistent mode has a limited number of cycles that it can run
  before it runs out of memory due to no garbage collector running.
  TODO create a non-forking alternative for persistent mode.
  * This will make afl-fuzz report a timeout every so often when
    the program runs out of some resource. If the timeout is manually
    set to be relatively long for an otherwise fast fuzz target, it
    will needlessly delay the recovery from a resource leaking situation.

## License

Copyright 2018 Jussi Judin

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.