{"id":16251529,"url":"https://github.com/pfalcon/scratchablock","last_synced_at":"2026-03-11T07:32:05.733Z","repository":{"id":33187784,"uuid":"36829682","full_name":"pfalcon/ScratchABlock","owner":"pfalcon","description":"Yet another crippled decompiler project","archived":false,"fork":false,"pushed_at":"2021-12-04T08:41:06.000Z","size":927,"stargazers_count":103,"open_issues_count":15,"forks_count":23,"subscribers_count":12,"default_branch":"master","last_synced_at":"2024-10-11T15:10:16.008Z","etag":null,"topics":["data-flow-analysis","decompiler","program-analysis","reverse-engineering"],"latest_commit_sha":null,"homepage":"https://github.com/EiNSTeiN-/decompiler/issues/9#issuecomment-103221200","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/pfalcon.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"COPYING","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2015-06-03T20:46:45.000Z","updated_at":"2024-09-20T13:27:21.000Z","dependencies_parsed_at":"2022-08-17T20:10:43.922Z","dependency_job_id":null,"html_url":"https://github.com/pfalcon/ScratchABlock","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pfalcon%2FScratchABlock","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pfalcon%2FScratchABlock/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pfalcon%2FScratchABlock/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pfalcon%2FScratchABlock/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/pfalcon","download_url":"https
://codeload.github.com/pfalcon/ScratchABlock/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":221663796,"owners_count":16859909,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-flow-analysis","decompiler","program-analysis","reverse-engineering"],"created_at":"2024-10-10T15:10:30.003Z","updated_at":"2026-03-11T07:32:05.683Z","avatar_url":"https://github.com/pfalcon.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"[![Build Status](https://github.com/pfalcon/ScratchABlock/actions/workflows/pycopy-test.yml/badge.svg)](https://github.com/pfalcon/ScratchABlock/actions/)\n\n**Q**: Why is there a need for yet another decompiler, especially a\ncrippled one?\n\n**A**: A sad truth is that most decompilers out there are crippled. Many\naren't able to decompile trivial constructs, others can't decompile more\nadvanced ones, and those which seemingly can deal with them are crippled by\nsupporting only the boring architectures and OSes. And almost every one is\nwritten in such a way that tweaking it or adding a new architecture is\ncomplicated. A decompiler is a tool for reverse engineering, but ironically,\nif you want to use a typical decompiler productively or make it suit your\nneeds, first you will need to reverse-engineer the decompiler itself, and\nthat can easily take months (or years).\n\nHow is ScratchABlock different?\n-------------------------------\n\nThe central part of a decompiler (and any program transformation framework)\nis the Intermediate Representation (IR). 
A decompiler should work on IR, and\nshould take it as an input, and conversion of a particular architecture's\nassembler to this IR should be well decoupled from a decompiler, or\notherwise it takes extraordinary effort to add support for another\narchitecture (which in turn limits the userbase of a decompiler).\n\nDecompilation is a complex task, so there should be easy insight into the\ndecompilation process. This means that the IR used by a decompiler should be\nhuman-friendly, for example use a syntax familiar to programmers, map as\ndirectly as possible to a typical machine assembler, etc.\n\nThe requirements above should be quite obvious on their own. If not, they\ncan be learnt from the books on the matter, e.g.:\n\n\u003e \"The compiler writer also needs mechanisms that let humans examine the IR\n\u003e program easily and directly. Self-interest should ensure that compiler\n\u003e writers pay heed to this last point.\"\n\u003e\n\u003e (Keith Cooper, Linda Torczon, \"Engineering a Compiler\")\n\nHowever, decompiler projects, including OpenSource ones, routinely violate\nthese requirements: they are tightly coupled with specific machine\narchitectures, don't allow IR to be fed in, and oftentimes don't expose or\ndocument it to the user at all.\n\nScratchABlock is an attempt to say \"no\" to such practices and develop a\ndecompilation framework based on the requirements above. Note that\nScratchABlock can be considered a learning/research project, and beyond\ngood intentions and criticism of other projects, may not offer too much\nto a casual user - currently, or possibly at all. It can certainly be\ncriticised in many aspects too.\n\n\nDown to Earth part\n------------------\n\nScratchABlock is released under the terms of the GNU General Public License v3\n(GPLv3).\n\nScratchABlock is written in the Python3 language and tested with version 3.3\nand up, though it may work with 3.2 or lower too (it won't work with legacy\nPython2 versions). 
There are a few dependencies:\n\n* PyYAML, https://pypi.python.org/pypi/PyYAML\n* nose-tests, https://pypi.python.org/pypi/nose (required only for unit\n  tests).\n\nOn Debian/Ubuntu Linux, these can be installed with\n`sudo apt-get install python3-yaml python3-nose`. Alternatively, you can\ninstall these via Python's own `pip` package manager (should work for\nany OS): `pip3 install -r requirements.txt`.\n\nScratchABlock uses the *PseudoC* assembler as its IR. It is an assembler\nlanguage expressed as much as possible using the familiar C language\nsyntax. The idea is that any C programmer would understand it intuitively\n([example](tests/ifelse2.lst)), but there is an ongoing effort to\n[document PseudoC more formally](docs/PseudoC-spec.md).\n\nNote that based on the requirements described in the previous section of\nthe document, and following the well-known \"Unix paradigm\", ScratchABlock\ndoes \"one thing\" - analyses and transformations on PseudoC programs,\nand is explicitly *not* concerned with converting machine instructions of\nparticular architectures into PseudoC (at least, for now). That means\nthat ScratchABlock doesn't force you to use any particular converter/\nlifter - you can use any you like. Caveat: you would need to have one\nto use it. See the end of the document for some hints in that regard.\n\nSource code and interfacing scripts are in the root of the repository.\nThe most important scripts are:\n\n* `apply_xform.py` - A central driver, which applies a sequence of\ntransformations (or in general, high-level analysis/transformation\nscripts) to a single file or a directory of files (\"project directory\").\n\n* `inter_dataflow.py` - Interprocedural (global) dataflow analysis driver\n  (WIP).\n\n* `script_*.py` - High-level analysis/transformation scripts for\n   `apply_xform.py` (`--script` switch).\n\n* `script_i_*.py` - Analysis scripts for `inter_dataflow.py`.\n\n* `run_tests` - The regression testsuite runner. 
The majority of the\ntestsuite is high-level, consisting of running apply_xform.py with\ndifferent passes on file(s) and checking the expected results.\n\nOther subdirectories of the repository:\n\n* `tests_unit` - Classical unit tests for Python modules, written in\nPython.\n\n* `tests` - The main testsuite. While integrational in nature, it\nusually tests one pass on one simple file, so follows the unit testing\nphilosophy. Tests are represented as PseudoC input files, while\nexpected results - as PseudoC with basic blocks annotation and (where\napplicable) CFG in .dot format. Looking at these testcases, trying\nto modify them and seeing the outcome is the best way to learn how\nScratchABlock works.\n\n* `docs` - A growing collection of documentation. For example, there's a\n[specification](docs/PseudoC-spec.md) of the PseudoC assembler language\nserving as the intermediate representation (IR) for ScratchABlock and\na [survey](docs/ir-why-not.md) why another existing IR was not selected.\n\nThe current approach of ScratchABlock is to grow a collection of\nrelatively loosely-coupled algorithms (\"passes\") for program analysis\nand transformation, have them covered with tests, and allow easy user\naccess to them. The magic of decompilation consists of applying these\nalgorithms in the right order and the right number of times. Then, to\nimprove the performance of decompilation, these passes usually require\ntighter coupling. 
Exploring those directions is the next\npriority after implementing the inventory of passes as described\nabove.\n\nAlgorithms and transformations implemented by ScratchABlock:\n\n* Graph algorithms:\n  * Construction and querying (predecessors, successors, etc.)\n  * Traversal (depth first search (DFS), postorder)\n  * Dominator tree/dominance frontiers\n  * Node splitting\n\n* Static Single Assignment form (SSA):\n  * Construction\n\n* Data flow analysis:\n  * Generic iterative dataflow algorithm framework\n  * Dominator tree\n  * Reaching definitions\n  * Live variables\n  * Building def-use chains\n\n* Propagation:\n  * Constant\n  * Copy\n  * Memory references\n  * Expressions\n\n* Dead code elimination (DCE)\n\n* Rewriting:\n  * Of stack variables\n  * Of structure fields (TODO)\n  * Devirtualization (TODO)\n\n* Control flow structuring:\n  * Removal of jumps-over-jumps\n  * Single exit\n  * Loop single landing site\n  * if/if-else/if-elif-else ladders\n  * Control-flow \"and\" (if (a \u0026\u0026 b))\n  * Abnormal selection via node splitting\n  * while/do-while/infinite loops\n  * Generic loop structuring (TODO)\n  * Unreachable basic blocks elimination (TODO)\n\n* Output formats:\n  * PseudoC\n  * PseudoC with annotated basic blocks\n  * C\n  * .dot (for control flow (CFGs) and other graphs)\n  * YAML (for function properties database)\n\nScratchABlock's partner tool is [ScratchABit](https://github.com/pfalcon/ScratchABit),\nwhich is an interactive disassembler intended to perform the lowest-level\ntasks of the decompilation process, like separation of code from data, and\nidentifying function boundaries. 
ScratchABit usually works with a native\narchitecture assembler syntax, but for some architectures (usually, faithful\nRISCs), if a suitable plugin is available, it can output PseudoC syntax,\nwhich can serve as input to ScratchABlock.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpfalcon%2Fscratchablock","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpfalcon%2Fscratchablock","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpfalcon%2Fscratchablock/lists"}