https://github.com/sonicstark/sanitizersymbolizertool

__sanitizer::SymbolizerTool ecosystem as a standalone library

# SanitizerSymbolizerTool

## Introduction

Fuzzers from the AFL family try to avoid the *symbolization* step, which turns virtual addresses into file/line locations, when working with [Sanitizers](https://github.com/google/sanitizers). The reason is:

> Similarly, include symbolize=0, since without it, AFL++ may have difficulty telling crashes and hangs apart.

(according to [12) Third-party variables set by afl-fuzz & other tools](https://aflplus.plus/docs/env_variables/))

They do this by

> set abort_on_error and symbolize for all the four sanitizer flags

(see [#1618](https://github.com/AFLplusplus/AFLplusplus/discussions/1618) and [#1624](https://github.com/AFLplusplus/AFLplusplus/issues/1624))
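With symbolization disabled, frames in a sanitizer report carry only raw addresses, typically something like `#0 0x55d5c8a3e1b6  (/tmp/a.out+0x11b6)`. The following is a minimal sketch of how a fuzzer could pull the module path and offset out of such a line. The function name `parse_frame` and the exact frame layout are assumptions made for illustration; they are not part of this project.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parse one unsymbolized frame such as
 *   "    #0 0x55d5c8a3e1b6  (/tmp/a.out+0x11b6)"
 * into the module path and the offset inside that module.
 * Returns 0 on success, -1 if the line does not look like that. */
int parse_frame(const char *line, char *module, size_t module_size,
                unsigned long *offset)
{
    const char *open = strrchr(line, '(');
    const char *close = strrchr(line, ')');
    if (open == NULL || close == NULL || close < open)
        return -1;

    /* Search backwards for the last '+' between the parentheses,
     * so a '+' inside the module path itself is tolerated. */
    const char *plus = NULL;
    for (const char *p = close; p > open; --p) {
        if (*p == '+') {
            plus = p;
            break;
        }
    }
    if (plus == NULL)
        return -1;

    size_t len = (size_t)(plus - open - 1);
    if (len == 0 || len >= module_size)
        return -1;
    memcpy(module, open + 1, len);
    module[len] = '\0';

    /* Base 16 accepts the optional "0x" prefix; the number must end
     * exactly at the closing parenthesis. */
    char *end = NULL;
    unsigned long value = strtoul(plus + 1, &end, 16);
    if (end == plus + 1 || end != close)
        return -1;
    *offset = value;
    return 0;
}
```

A frame that is already symbolized (for example `#0 0x4011b6 in main /tmp/a.c:5:3`) makes the helper return -1, so the caller can tell the two cases apart.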

However, we may sometimes still want the symbol names in a sanitizer report while fuzzing, for example to use backtrace information as feedback.
One possible way to do this is to symbolize the address and offset data outside of the sanitizer runtime linked into the fuzz target,
i.e., to run the symbolizer in the fuzzer itself when necessary.

*SanitizerSymbolizerTool* helps to implement this.
We extract `__sanitizer::SymbolizerTool` and its related dependencies from [`compiler-rt`](https://github.com/llvm/llvm-project/tree/main/compiler-rt),
and wrap them as a standalone library. With it, the fuzzer can use **external**, **individual** "tools" that perform symbolization
by statically analysing the target binary (currently only *llvm-symbolizer* and *addr2line* are supported), in a style similar to the one implemented in the sanitizer runtime.
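To give a feel for what "using an external tool" means, here is a rough sketch that asks *addr2line* about a single address through `popen`. It is not how this library works internally and does not use its API; the function names are hypothetical, and spawning one process per query is a simplification chosen for brevity. The address is passed to the tool as-is, so it must already be an address as it appears in the binary file.

```c
#define _POSIX_C_SOURCE 200809L /* for popen/pclose */
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format an addr2line command line.
 * -f prints the function name, -C demangles, -e selects the binary.
 * Returns 0 on success, -1 if the path is unsafe or does not fit. */
int build_addr2line_command(char *cmd, size_t cmd_size,
                            const char *binary, unsigned long address)
{
    assert(cmd != NULL && binary != NULL);
    /* The path is wrapped in single quotes below, so reject paths
     * that contain one instead of trying to escape them. */
    if (strchr(binary, '\'') != NULL)
        return -1;
    int n = snprintf(cmd, cmd_size, "addr2line -f -C -e '%s' 0x%lx",
                     binary, address);
    return (n < 0 || (size_t)n >= cmd_size) ? -1 : 0;
}

/* Hypothetical helper: run addr2line once and read its two output
 * lines (function name, then file:line). Returns 0 on success. */
int symbolize_with_addr2line(const char *binary, unsigned long address,
                             char *function, size_t function_size,
                             char *location, size_t location_size)
{
    char cmd[4096];
    if (build_addr2line_command(cmd, sizeof cmd, binary, address) != 0)
        return -1;

    FILE *pipe = popen(cmd, "r");
    if (pipe == NULL)
        return -1;
    int ok = fgets(function, (int)function_size, pipe) != NULL &&
             fgets(location, (int)location_size, pipe) != NULL;
    pclose(pipe);
    if (!ok)
        return -1;

    /* Strip the trailing newlines left by fgets. */
    function[strcspn(function, "\n")] = '\0';
    location[strcspn(location, "\n")] = '\0';
    return 0;
}
```

When no debug information is available, *addr2line* reports `??` for the parts it cannot resolve, so a caller should be prepared for that.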

Currently the *Windows* platform is not supported. Fully migrating the cross-platform features of compiler-rt is planned for the future. *Fuchsia*, however, will never be supported due to a lack of relevant docs.

One more thing: it is NOT THREAD SAFE.

*SanitizerSymbolizerTool* is under the Apache License v2.0 with LLVM Exceptions (same as [llvm/llvm-project](https://github.com/llvm/llvm-project)).
See `LICENSE` for more details.

## Building

### Dependencies

- *cmake* (version 3.13.4 or newer)
- *clang* & *llvm* (at least 12.0.0)

The source code itself doesn't depend on any special libraries or features from [llvm/llvm-project](https://github.com/llvm/llvm-project).
But to preserve the *LLVM building style*, the build system of [compiler-rt](https://github.com/llvm/llvm-project/tree/main/compiler-rt) is migrated and reused.
I tried *16.0.3 Release* first, but its shared cmake modules are located not only in `llvm-project/compiler-rt/cmake` but also in `llvm-project/cmake`, which makes things more complicated. Then I happened to find a copy of *12.0.0 Release* on my laptop, whose compiler-rt has truly standalone cmake files. So I used that, which is why the source code is based on 16.0.3 and the build system on 12.0.0.

### Start to build

Run:
* `cd build`
* `cmake .. -DLLVM_CONFIG_PATH=/path/to/llvm-config`
* `make`

The library can then be installed on your system with `make install`.

Some common options when calling `cmake` (for more information see [Building LLVM with CMake](https://llvm.org/docs/CMake.html)):

* `-DCMAKE_INSTALL_PREFIX=directory` --- Specify for directory the full path name of where you want SanitizerSymbolizerTool library to be installed (default /usr/local).

* `-DCMAKE_BUILD_TYPE=type` --- Valid options for type are Debug, Release, RelWithDebInfo, and MinSizeRel. Default is Debug.

## Usage

### Quick start

Include *sanitizer_symbolizer_tool.h* in your project, use its APIs, and link against the library built above when compiling.
If you use the static library, your final executable also needs to link against the standard C++ library; otherwise you will get undefined references.

For *llvm-symbolizer*, you need a version that is not too old: based on some tests, at least *LLVM version 12.0.1*. Otherwise `SanSymTool_init` will fail because some command line options are not supported.
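If you want to check the version up front, the text printed by `llvm-symbolizer --version` normally contains a line such as `LLVM version 12.0.1`. The sketch below parses that line and compares it against a minimum; both function names are hypothetical helpers and not part of this library, and the exact output layout may differ between distributions.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: find "LLVM version X.Y.Z" in the given text
 * and store the three numbers. Returns 0 on success, -1 otherwise. */
int parse_llvm_version(const char *text, int version[3])
{
    const char *p = strstr(text, "LLVM version ");
    if (p == NULL)
        return -1;
    if (sscanf(p, "LLVM version %d.%d.%d",
               &version[0], &version[1], &version[2]) != 3)
        return -1;
    return 0;
}

/* Hypothetical helper: return 1 if `have` >= `want`, comparing
 * major, then minor, then patch; return 0 otherwise. */
int version_at_least(const int have[3], const int want[3])
{
    for (int i = 0; i < 3; ++i) {
        if (have[i] != want[i])
            return have[i] > want[i];
    }
    return 1; /* all three components are equal */
}
```

The version text itself could be captured with `popen` or passed in by the build scripts; that part is left out here.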

For *addr2line*, versions from *GNU Binutils 2.30* or newer are suggested. Older versions have not been tested.

### Learn more

There is some interesting material in `./demo` that can help you explore and learn more about this project.

- bug-san0-dbg0-64

`bug-san0-dbg0-64.bin` is one of the ELF binaries built from [microBug](https://github.com/SonicStark/microBug). `bug-san0-dbg0-64-visualize.html` and `-mem-map.svg` show the structure of this ELF file. The SVG file is generated by [drawio-desktop](https://github.com/jgraph/drawio-desktop) from `bug-san0-dbg0-64-mem-map.drawio`.

- big-symbol

`big-symbol.cpp` is used to challenge a symbolizer with some inline functions and large function names. The four binaries are built by
```bash
clang++-12 -fPIC -pie -O0 -g -o big-symbol-elf-dbg1-pie1.bin ./big-symbol.cpp;
clang++-12 -fPIC -pie -O0 -o big-symbol-elf-dbg0-pie1.bin ./big-symbol.cpp;
clang++-12 -O0 -g -o big-symbol-elf-dbg1-pie0.bin ./big-symbol.cpp;
clang++-12 -O0 -o big-symbol-elf-dbg0-pie0.bin ./big-symbol.cpp;
```
on Ubuntu 18.04.6 LTS (x86_64).

- checksum.txt

Gives the MD5 checksums of the five `*.bin` files, to detect unexpected data corruption when they are distributed over the internet.

- DispMemOffset.sh

Uses GDB to check the relocation when running an ELF binary built as a PIE (**P**osition-**I**ndependent **E**xecutable).
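The idea behind that check can be sketched in C as well: for a PIE, a runtime address minus the load base of the module gives the address as seen in the file, which is what a static symbolizer expects. The sketch below is not taken from `DispMemOffset.sh`. It assumes Linux and the usual `/proc/<pid>/maps` line layout, and both function names are hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: parse the leading fields of one line from
 * /proc/<pid>/maps, e.g.
 *   "555555554000-555555556000 r-xp 00000000 08:01 131 /tmp/a.out"
 * Returns 0 on success, -1 if the line does not match. */
int parse_maps_line(const char *line, unsigned long long *start,
                    unsigned long long *end,
                    unsigned long long *file_offset)
{
    char perms[5]; /* four permission characters plus the terminator */
    if (sscanf(line, "%llx-%llx %4s %llx",
               start, end, perms, file_offset) != 4)
        return -1;
    return 0;
}

/* Hypothetical helper: translate a runtime address back to the
 * address used inside the binary. Assumes load_base is the start of
 * the module's mapping with file offset 0 and that the PIE is linked
 * at base address 0. */
unsigned long long runtime_to_static(unsigned long long runtime_addr,
                                     unsigned long long load_base)
{
    return runtime_addr - load_base;
}
```

For a binary that is not built as a PIE, no such translation is needed because the runtime addresses already match the ones in the file.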

- simple_demo

`simple_demo.c` is an example of using *SanitizerSymbolizerTool* to check each address in *.text*, *.data* and *.bss* of `bug-san0-dbg0-64.bin`.
You can modify it to check the other four binaries. These macros may help you:
- for `big-symbol-elf-dbg*-pie1.bin`
```c
#define SEC_HEAD_TEXT 0x0007a0U
#define SEC_TAIL_TEXT 0x003072U
#define SEC_HEAD_DATA 0x206048U
#define SEC_TAIL_DATA 0x206060U
#define SEC_HEAD_BSS 0x206060U
#define SEC_TAIL_BSS 0x208780U
```
- for `big-symbol-elf-dbg*-pie0.bin`
```c
#define SEC_HEAD_TEXT 0x400680U
#define SEC_TAIL_TEXT 0x402f42U
#define SEC_HEAD_DATA 0x606050U
#define SEC_TAIL_DATA 0x606060U
#define SEC_HEAD_BSS 0x606060U
#define SEC_TAIL_BSS 0x608780U
```
`simple_demo_dummy_build.sh` roughly builds `simple_demo` from all available source code instead of linking against the pre-built library, which is useful for a quick debug check.
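As a small illustration of how such macros can be used, the sketch below maps an address to the section it falls into, using the values listed above for `big-symbol-elf-dbg*-pie0.bin`. It assumes that each `SEC_TAIL_*` value is exclusive (the *.data* tail equals the *.bss* head, which suggests half-open ranges); the helper `section_of` is hypothetical and is not taken from `simple_demo.c`.

```c
#include <stddef.h>

/* Values for big-symbol-elf-dbg*-pie0.bin, copied from the list above. */
#define SEC_HEAD_TEXT 0x400680U
#define SEC_TAIL_TEXT 0x402f42U
#define SEC_HEAD_DATA 0x606050U
#define SEC_TAIL_DATA 0x606060U
#define SEC_HEAD_BSS 0x606060U
#define SEC_TAIL_BSS 0x608780U

/* Hypothetical helper: name the section that contains addr, treating
 * every range as [head, tail). Returns NULL when addr lies outside
 * all three sections. */
const char *section_of(unsigned long addr)
{
    if (addr >= SEC_HEAD_TEXT && addr < SEC_TAIL_TEXT)
        return ".text";
    if (addr >= SEC_HEAD_DATA && addr < SEC_TAIL_DATA)
        return ".data";
    if (addr >= SEC_HEAD_BSS && addr < SEC_TAIL_BSS)
        return ".bss";
    return NULL;
}
```

A loop from `SEC_HEAD_TEXT` up to, but not including, `SEC_TAIL_TEXT` would then visit every address of *.text* once.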