https://github.com/eth-sri/ilf

AI based fuzzer based on imitation learning

Host: GitHub
URL: https://github.com/eth-sri/ilf
Owner: eth-sri
License: apache-2.0
Created: 2019-11-23T12:31:16.000Z (over 6 years ago)
Default Branch: master
Last Pushed: 2023-07-26T22:22:55.000Z (almost 3 years ago)
Last Synced: 2024-05-08T00:15:48.435Z (about 2 years ago)
Topics: blockchain, fuzzing, imitation-learning, machine-learning, smart-contracts, symbolic-execution, testing
Language: Python
Size: 4.73 MB
Stars: 144
Watchers: 11
Forks: 32
Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE


ILF: AI-based Fuzzer for Ethereum Smart Contracts
=================================================

ILF is an **I**mitation **L**earning based **F**uzzer for smart contracts. The fuzzing policy, which is used to generate transactions, is represented by an ensemble of neural networks and is learned from thousands of high-quality sequences of transactions generated using symbolic execution. ILF can be used to fuzz any Ethereum smart contract and outputs the coverage and a vulnerability report.

ILF is developed at [SRI Lab, Department of Computer Science, ETH Zurich](https://www.sri.inf.ethz.ch/) as part of the [Machine Learning for Programming](https://www.sri.inf.ethz.ch/research/plml) and [Blockchain Security](https://www.sri.inf.ethz.ch/research/blockchain-security) projects. For more details, please refer to the [ILF CCS'19 paper](https://files.sri.inf.ethz.ch/website/papers/ccs19-ilf.pdf) and [slides](https://files.sri.inf.ethz.ch/website/slides/ccs19-ilf-slides.pdf).

## Setup

We provide a Dockerfile, which we recommend as a starting point. To build and run:

```
$ docker build -t ilf .
$ docker run -it ilf
```

You can also follow the instructions in the Dockerfile to install ILF locally. If you experience build errors on Apple M-series chips, please refer to [#21](https://github.com/eth-sri/ilf/issues/21).

## Usage

### Fuzzing

To fuzz the example provided in the repo with ILF (the `imitation` fuzzing policy) using our pre-trained model in the `model` directory:

```
$ python3 -m ilf --proj ./example/crowdsale/ --contract Crowdsale --fuzzer imitation --model ./model/ --limit 2000
```

The value of the `--fuzzer` argument can be replaced by one of:

* `random`: a uniformly random fuzzing policy.

* `symbolic`: a symbolic execution fuzzing policy based on depth-first search of block states. This is used for generating training sequences.

* `sym_plus`: an augmentation of `symbolic` which can revisit encountered block states.

* `mix`: a fuzzing policy that randomly chooses `imitation` or `symbolic` for generating each transaction.
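To make the `mix` option more concrete, here is a minimal sketch of a policy that picks one of two sub-policies at random for every transaction. It is an illustration only, not ILF's implementation: the `Policy` type, `make_mix_policy`, and the two stub policies are hypothetical names, and the uniform 50/50 choice is an assumption (the description above only says the choice is random).

```python
import random
from typing import Callable, Sequence

# Hypothetical type: a policy maps the current fuzzing state to a transaction.
Policy = Callable[[dict], str]


def make_mix_policy(policies: Sequence[Policy], rng: random.Random) -> Policy:
    """Return a policy that delegates each call to a randomly chosen sub-policy."""

    def mix(state: dict) -> str:
        # Assumption: a fresh, uniform choice is made for every transaction.
        return rng.choice(policies)(state)

    return mix


# Stand-ins for the real `imitation` and `symbolic` policies.
def imitation_stub(state: dict) -> str:
    return "tx-from-imitation"


def symbolic_stub(state: dict) -> str:
    return "tx-from-symbolic"


mix_policy = make_mix_policy([imitation_stub, symbolic_stub], random.Random(0))
transactions = [mix_policy({}) for _ in range(10)]
```

Each entry in `transactions` comes from one of the two stubs, chosen independently per call.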

To fuzz new contracts, provide a Truffle project (formatted like the example in `example/crowdsale`). Then call the script `script/extract.py` to extract the deployment transactions of the contracts. For the example contract, the script runs as follows:

```
$ rm example/crowdsale/transactions.json
$ python3 script/extract.py --proj example/crowdsale/ --port 8545
```

Note that you need to kill any existing `ganache-cli` processes listening on the same port before calling this script.
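One way to check whether something is already listening on the port is a small probe like the sketch below. It uses only Python's standard `socket` module. The helper name `port_in_use` is hypothetical, and the default host `127.0.0.1` is an assumption about where `ganache-cli` listens.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0
```

For example, `port_in_use(8545)` returning `True` before running `script/extract.py --port 8545` suggests that an earlier process still holds the port.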

### Training

Training first requires running the `symbolic` fuzzer on a set of training contracts to produce a dataset in a training directory. Usually tens of thousands of contracts are used for training. For demonstration purposes, we show how to produce a small training dataset from our example contract in the `train_data` directory:

```
$ mkdir train_data
$ python3 -m ilf --proj ./example/crowdsale/ --contract Crowdsale --limit 2000 --fuzzer symbolic --dataset_dump_path ./train_data/crowdsale.data
```
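When many contracts are involved, the same command can be generated per contract. The sketch below only builds argument lists from the flags shown above and does not run anything. The helper `dataset_commands`, the second target (`./example/token/`, `Token`), and naming each dump file after the lowercased contract name are assumptions made for illustration.

```python
from typing import List, Sequence, Tuple


def dataset_commands(
    targets: Sequence[Tuple[str, str]],
    train_dir: str = "./train_data",
    limit: int = 2000,
) -> List[List[str]]:
    """Build one `symbolic` fuzzing command per (project dir, contract name)."""
    commands = []
    for proj, contract in targets:
        # Assumption: one dump file per contract, named after the contract.
        dump_path = f"{train_dir.rstrip('/')}/{contract.lower()}.data"
        commands.append([
            "python3", "-m", "ilf",
            "--proj", proj,
            "--contract", contract,
            "--limit", str(limit),
            "--fuzzer", "symbolic",
            "--dataset_dump_path", dump_path,
        ])
    return commands


commands = dataset_commands([
    ("./example/crowdsale/", "Crowdsale"),
    ("./example/token/", "Token"),  # hypothetical second project
])
```

Each list can be passed to `subprocess.run(cmd, check=True)` from the repository root. Contracts that share a name would overwrite each other's dump file under this naming scheme, so a real script would need unique file names.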

Run the scripts to select seed integer values and amount values from the training dataset, and put them into `ilf/fuzzers/imitation/int_values.py` and `ilf/fuzzers/imitation/amounts.py`, respectively:

```
$ python3 script/get_int_values.py --train_dir ./train_data
$ python3 script/get_amounts.py --train_dir ./train_data
```

Then the following command performs neural network training and outputs the trained networks in the `new_model` directory:

```
$ mkdir new_model
$ python3 -m ilf --fuzzer imitation --train_dir ./train_data --model ./new_model
```

### Automatically Constructing Truffle Projects

For evaluation and training purposes, one might want to automatically construct Truffle projects from a large set of contracts. To achieve this, one can write a script that produces the files required by a Truffle project, following the format in `example/crowdsale`. The compressed file `truffle_scripts.tar.gz` contains the scripts we used. Those scripts might not run directly, but they can give you a high-level idea of how things work.
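As a rough illustration of such a script, the sketch below writes a contract source file and a deployment migration into a conventional Truffle layout (`contracts/` and `migrations/`). It is not taken from `truffle_scripts.tar.gz`. The helper `scaffold_truffle_project` is hypothetical, the migration assumes a constructor without arguments, and the Truffle configuration file and any other files present in `example/crowdsale` are left out, so compare the result with that example before relying on it.

```python
from pathlib import Path

# Standard Truffle migration; assumes the constructor takes no arguments.
DEPLOY_TEMPLATE = """\
const {name} = artifacts.require("{name}");

module.exports = function (deployer) {{
  deployer.deploy({name});
}};
"""


def scaffold_truffle_project(root: str, contract_name: str, source: str) -> Path:
    """Write the contract source and a deployment migration under `root`."""
    project = Path(root)
    contracts_dir = project / "contracts"
    migrations_dir = project / "migrations"
    contracts_dir.mkdir(parents=True, exist_ok=True)
    migrations_dir.mkdir(parents=True, exist_ok=True)

    (contracts_dir / f"{contract_name}.sol").write_text(source)
    (migrations_dir / "2_deploy_contracts.js").write_text(
        DEPLOY_TEMPLATE.format(name=contract_name)
    )
    return project
```

Calling `scaffold_truffle_project("./projects/demo", "Demo", solidity_source)` for each contract in a corpus yields one project directory per contract, which can then be passed to `script/extract.py` via `--proj`.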

## Citing ILF

```
@inproceedings{He:2019:LFS:3319535.3363230,
 author = {He, Jingxuan and Balunovi\'{c}, Mislav and Ambroladze, Nodar and Tsankov, Petar and Vechev, Martin},
 title = {Learning to Fuzz from Symbolic Execution with Application to Smart Contracts},
 booktitle = {Proceedings of the 2019 ACM SIGSAC Conference on Computer and Communications Security},
 series = {CCS '19},
 year = {2019},
 isbn = {978-1-4503-6747-9},
 location = {London, United Kingdom},
 pages = {531--548},
 numpages = {18},
 url = {http://doi.acm.org/10.1145/3319535.3363230},
 doi = {10.1145/3319535.3363230},
 acmid = {3363230},
 publisher = {ACM},
 address = {New York, NY, USA},
 keywords = {fuzzing, imitation learning, smart contracts, symbolic execution},
}
```

## Contributors

* [Jingxuan He](https://www.sri.inf.ethz.ch/people/jingxuan)

* [Mislav Balunović](https://www.sri.inf.ethz.ch/people/mislav)

* Nodar Ambroladze

* [Petar Tsankov](https://www.sri.inf.ethz.ch/people/petar)

* [Martin Vechev](https://www.sri.inf.ethz.ch/people/martin)

* Anton Permenev

## License and Copyright

* Copyright (c) 2019 [Secure, Reliable, and Intelligent Systems Lab (SRI), ETH Zurich](https://www.sri.inf.ethz.ch/)

* Licensed under the [Apache 2.0 License](https://www.apache.org/licenses/LICENSE-2.0)
