https://github.com/trumank/binfold
Symbol porting
bindiff re reverse-engineering symbols
- Host: GitHub
- URL: https://github.com/trumank/binfold
- Owner: trumank
- License: MIT
- Created: 2025-07-26T17:11:09.000Z (8 months ago)
- Default Branch: master
- Last Pushed: 2025-09-11T23:07:38.000Z (7 months ago)
- Last Synced: 2025-09-12T01:45:51.872Z (7 months ago)
- Topics: bindiff, re, reverse-engineering, symbols
- Language: Rust
- Homepage:
- Size: 1.12 MB
- Stars: 40
- Watchers: 0
- Forks: 3
- Open Issues: 3
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# binfold
A utility for quickly and accurately porting huge numbers of symbols between similar binaries. It runs in 4 seconds on a 100 MB exe, and in one run matched 102520 out of 190396 symbols (more than 50%).
(Screenshot: Binary Ninja showing ported symbols)

## Building
```bash
cargo build --release
```
## Usage
### 1a. Download a pre-generated database
- [Unreal Engine](https://drive.google.com/file/d/18rWfF7MobqxTc8NQzZOoMzZxuxiTUHAv/view)
### 1b. Generate your own
Create a database from executables with PDBs:
```bash
# Single executable
cargo run --release gen-db -e /path/to/binary.exe -d db.fold
# Multiple executables
cargo run --release gen-db -e /path/to/binary1.exe -e /path/to/binary2.exe -d db.fold
# Recursively scan directories for EXE files with PDBs
cargo run --release gen-db -e /path/to/directory -d db.fold
```
### 2. Analyze an executable
Analyze functions in a binary and optionally match against a database:
```bash
# Generate a PDB file with matched function names
cargo run --release analyze --exe /path/to/binary.exe --database db.fold --generate-pdb
```
## Matching algorithm
The core matching algorithm is based on [WARP](https://github.com/vector35/warp), which essentially identifies a function by a hash of its body bytes. However, my implementation has deviated significantly when it comes to constraints. The current implementation populates and matches against the following types of constraints:
- called function bodies (hash of the called function's body)
- called function names (hash of the called function's symbol name)
- constant string references
Constraints can have an optional offset attached, but offsets are not currently used.
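
To make the idea concrete, here is a minimal sketch of hash-plus-constraint matching. It is not binfold's actual code: the `Constraint`, `Candidate`, `Db`, `best_match`, and `demo` names are hypothetical, FNV-1a is only a stand-in for the real hash, and the tie-breaking rule (refuse to name a function when two candidates score equally) is an assumption chosen for illustration.

```rust
use std::collections::{BTreeSet, HashMap};

/// FNV-1a 64-bit. A stand-in hash for this sketch only (assumption);
/// it is not the hash WARP or binfold use.
fn fnv1a(bytes: &[u8]) -> u64 {
    let mut h: u64 = 0xcbf29ce484222325;
    for &b in bytes {
        h ^= b as u64;
        h = h.wrapping_mul(0x100000001b3);
    }
    h
}

/// Hypothetical constraint kinds mirroring the three listed above.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
enum Constraint {
    CalleeBody(u64), // hash of a called function's body
    CalleeName(u64), // hash of a called function's symbol name
    StringRef(u64),  // hash of a referenced constant string
}

/// A named function from the database together with its constraints.
struct Candidate {
    name: String,
    constraints: BTreeSet<Constraint>,
}

/// Database keyed by body hash. Several names can share one hash
/// (for example, identical small functions).
type Db = HashMap<u64, Vec<Candidate>>;

/// Look up candidates by body hash, then rank them by how many
/// constraints they share with the unknown function. A tie for first
/// place is treated as "no match" (an assumption of this sketch).
fn best_match<'a>(
    db: &'a Db,
    body: &[u8],
    constraints: &BTreeSet<Constraint>,
) -> Option<&'a str> {
    let candidates = db.get(&fnv1a(body))?;
    let mut scored: Vec<(usize, &str)> = candidates
        .iter()
        .map(|c| (c.constraints.intersection(constraints).count(), c.name.as_str()))
        .collect();
    scored.sort_by(|a, b| b.0.cmp(&a.0)); // highest score first
    match scored.as_slice() {
        [first, second, ..] if first.0 == second.0 => None,
        [first, ..] => Some(first.1),
        [] => None,
    }
}

/// Two functions with identical bodies, told apart by a string reference.
fn demo() -> (Option<String>, Option<String>) {
    let body = [0x48u8, 0x89, 0x5c, 0x24, 0x08]; // made-up bytes
    let mut db = Db::new();
    db.entry(fnv1a(&body)).or_default().extend([
        Candidate {
            name: "Foo::Read".into(),
            constraints: BTreeSet::from([Constraint::StringRef(fnv1a(b"read failed"))]),
        },
        Candidate {
            name: "Foo::Write".into(),
            constraints: BTreeSet::from([Constraint::StringRef(fnv1a(b"write failed"))]),
        },
    ]);
    let seen = BTreeSet::from([Constraint::StringRef(fnv1a(b"write failed"))]);
    let resolved = best_match(&db, &body, &seen).map(str::to_owned);
    let ambiguous = best_match(&db, &body, &BTreeSet::new()).map(str::to_owned);
    (resolved, ambiguous)
}
```

In `demo`, both database entries share the same body hash, so the hash alone cannot choose between them. The string-reference constraint selects `Foo::Write`, while a lookup with no constraints is left unnamed rather than guessed.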
## Future
Some more ideas worth exploring:
- matching and naming global variables (tricky because there is nothing to hash like a function body)
- add constraints based on variable names
- add type information to symbols (like WARP does)
- improve function basic block analysis (notably jump tables)