Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/Colton1skees/Dna
LLVM based static binary analysis framework
analysis binary deobfuscation instruction-semantics lifter llvm llvm-ir program-analysis static-analysis triton x86 x86-64
- Host: GitHub
- URL: https://github.com/Colton1skees/Dna
- Owner: Colton1skees
- License: gpl-3.0
- Created: 2022-03-12T19:58:54.000Z (almost 3 years ago)
- Default Branch: master
- Last Pushed: 2024-10-07T22:00:50.000Z (4 months ago)
- Last Synced: 2025-01-28T20:44:30.793Z (6 days ago)
- Topics: analysis, binary, deobfuscation, instruction-semantics, lifter, llvm, llvm-ir, program-analysis, static-analysis, triton, x86, x86-64
- Language: C++
- Homepage:
- Size: 1020 KB
- Stars: 208
- Watchers: 5
- Forks: 19
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE.txt
README
# Dna
`Dna` is a static binary analysis framework built on top of LLVM. Notably, it is written almost entirely in C#, including managed bindings for LLVM, Remill, and Souper.
# Functionality
`Dna` implements iterative control flow graph reconstruction, heavily inspired by the [SATURN](https://arxiv.org/pdf/1909.01752) paper. It repeatedly applies recursive descent, lifting (using Remill), and path solving until the complete control flow graph is recovered. For jump tables, we use a recursive algorithm based on `Souper` and Z3 to solve the set of possible jump table targets. You can find the iterative exploration algorithm [here](https://github.com/Colton1skees/Dna/blob/e70b48b1da4c9b3666cc2a218138c050ab6f9d8b/Dna.BinaryTranslator/Unsafe/IterativeFunctionTranslator.cs#L48), and the jump table solving algorithm [here](https://github.com/Colton1skees/Dna/blob/master/Dna.BinaryTranslator/JmpTables/Precise/SouperJumpTableSolver.cs#L41).
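The explore-then-solve loop described above can be illustrated with a toy model. The sketch below is not `Dna`'s implementation or API: the instruction tuples, the `PROGRAM` table, and the names `recursive_descent`, `solve_bounded_table`, and `explore` are all hypothetical. It is written in Python rather than C# for brevity. The Remill lifting step is omitted, and the Souper/Z3 solver is replaced by a trivial stand-in that assumes the bound on the jump table index is already known.

```python
from typing import Callable, Dict, Set, Tuple

# Hypothetical toy "binary": address -> (mnemonic, operands...).
#   ("jmp", target)           unconditional direct jump
#   ("jcc", taken, fallthru)  conditional direct jump
#   ("ijmp", table, bound)    indirect jump through a table; only the
#                             first `bound` entries are reachable
#   ("ret",)                  function exit
PROGRAM: Dict[int, Tuple] = {
    0x10: ("jcc", 0x20, 0x30),
    0x20: ("ijmp", [0x40, 0x50, 0x60], 2),
    0x30: ("ret",),
    0x40: ("jmp", 0x30),
    0x50: ("ijmp", [0x70], 1),  # only discovered after 0x20 is solved
    0x60: ("ret",),             # outside the bound, never reached
    0x70: ("ret",),
}


def recursive_descent(program, entry, resolved):
    """Follow every edge that is currently known, starting at `entry`."""
    visited: Set[int] = set()
    edges: Set[Tuple[int, int]] = set()
    stack = [entry]
    while stack:
        addr = stack.pop()
        if addr in visited:
            continue
        visited.add(addr)
        op, *args = program[addr]
        if op == "ret":
            succs = []
        elif op == "ijmp":
            # Indirect jumps only have the targets solved so far.
            succs = sorted(resolved.get(addr, ()))
        else:
            # "jmp" carries one target, "jcc" carries two.
            succs = list(args)
        for succ in succs:
            edges.add((addr, succ))
            stack.append(succ)
    return visited, edges


def solve_bounded_table(program, addr) -> Set[int]:
    """Stand-in for a real solver: assumes the index bound is given."""
    _, table, bound = program[addr]
    return set(table[:bound])


def explore(program, entry, solve: Callable):
    """Alternate descent and solving until no new targets appear."""
    resolved: Dict[int, Set[int]] = {}
    rounds = 0
    while True:
        rounds += 1
        visited, edges = recursive_descent(program, entry, resolved)
        changed = False
        for addr in sorted(visited):
            if program[addr][0] != "ijmp":
                continue
            targets = solve(program, addr)
            known = resolved.setdefault(addr, set())
            if not targets <= known:
                known |= targets
                changed = True
        if not changed:
            return visited, edges, rounds


if __name__ == "__main__":
    blocks, edges, rounds = explore(PROGRAM, 0x10, solve_bounded_table)
    print(sorted(hex(b) for b in blocks), len(edges), rounds)
```

In this toy, the loop stops at a fixpoint: a round in which solving every reachable indirect jump yields no new targets. If a solver cannot bound a table, a loop of this shape has no sound way to finish.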
Once a control flow graph has been fully explored, it can then be recompiled to x86 and reinserted into the binary using the algorithms from [here](https://github.com/Colton1skees/Dna/blob/master/Dna.BinaryTranslator/Safe/SafeFunctionTranslator.cs#L46) and [here](https://github.com/Colton1skees/Dna/blob/master/Dna.BinaryTranslator/Safe/FunctionGroupCompiler.cs#L27). Though the compiled code is not pretty by *any* means, it should run so long as the recovered control flow graph is correct. That being said, it is still a research prototype - bugs and edge cases are expected. Control flow graph exploration may fail in the case of e.g. unbounded jump tables or unliftable instructions.
Some other notable features:
- Supports *most* jump tables, including MSVC's nested or so-called compressed jump tables.
- Supports lifting code with SEH to LLVM IR. When SEH is present, `try`/`catch` statements and `filter` intrinsics are inserted into the control flow graph. However, the recompiler does not (yet) support SEH (the SEH entries are not fixed up), so exceptions in recompiled code will cause crashes.
- Includes a strong API for writing LLVM passes natively in C#. We have bindings for e.g. `MemorySSA`, `LoopInfo`, dominator trees, pass pipeline management, etc.
- Graph visualization for LLVM IR and binary control flow graphs using Graphviz or, alternatively, a script generator for Binary Ninja.

Some caveats:
- Only x86_64 is supported
- Recompiled code is not CET compliant
# Dependencies
- LLVM/LLVMSharp
- Remill
- Souper
- AsmResolver
- Rivers

Note that `Dna` is currently based on LLVM 17.
# Building
`Dna` will not build out of the box. Custom patches to Remill and Souper were needed to get it building on Windows. If you would like to work on `Dna`, open an issue or email me at `[email protected]`. At some point I may publish proper build steps, but I make no guarantees.