{"id":20329289,"url":"https://github.com/bdcht/ccrawl","last_synced_at":"2025-04-04T21:09:17.116Z","repository":{"id":57417010,"uuid":"169878825","full_name":"bdcht/ccrawl","owner":"bdcht","description":"clang-based search engine for C/C++ data structures, classes, prototypes \u0026 macros","archived":false,"fork":false,"pushed_at":"2024-11-14T10:14:14.000Z","size":1556,"stargazers_count":101,"open_issues_count":0,"forks_count":10,"subscribers_count":11,"default_branch":"release","last_synced_at":"2025-03-28T20:09:20.420Z","etag":null,"topics":["clang","database","reverse-engineering","structures"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/bdcht.png","metadata":{"files":{"readme":"README.rst","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-02-09T15:08:25.000Z","updated_at":"2024-05-18T06:07:40.000Z","dependencies_parsed_at":"2024-11-14T11:20:33.060Z","dependency_job_id":"fd345c3c-d0f4-4d13-9579-081f11dba2a4","html_url":"https://github.com/bdcht/ccrawl","commit_stats":{"total_commits":275,"total_committers":1,"mean_commits":275.0,"dds":0.0,"last_synced_commit":"22673f31a52e8c4f9264e03b2f380f3a629e202c"},"previous_names":[],"tags_count":11,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bdcht%2Fccrawl","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bdcht%2Fccrawl/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bdcht%2Fccrawl/releases","manifests_url":"https://repos.ecosyste.ms/api
/v1/hosts/GitHub/repositories/bdcht%2Fccrawl/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/bdcht","download_url":"https://codeload.github.com/bdcht/ccrawl/tar.gz/refs/heads/release","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247249529,"owners_count":20908212,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["clang","database","reverse-engineering","structures"],"created_at":"2024-11-14T20:09:58.241Z","updated_at":"2025-04-04T21:09:17.086Z","avatar_url":"https://github.com/bdcht.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"======\nCcrawl\n======\n\n.. image:: http://readthedocs.org/projects/ccrawl/badge/?version=latest\n    :target: http://ccrawl.readthedocs.io/en/latest/?badge=latest\n    :alt: Documentation Status\n\n.. 
image:: https://badge.fury.io/py/ccrawl.svg\n    :target: https://badge.fury.io/py/ccrawl\n\n\n+-----------+--------------------------------------------------+\n| Status:   | Under Development                                |\n+-----------+--------------------------------------------------+\n| Location: | https://github.com/bdcht/ccrawl                  |\n+-----------+--------------------------------------------------+\n| Version:  | 1.x                                              |\n+-----------+--------------------------------------------------+\n|  Doc:     | http://ccrawl.readthedocs.io/en/latest/index.html|\n+-----------+--------------------------------------------------+\n\nDescription\n===========\n\nCcrawl uses clang_ to build a database of C/C++ data structures\n(struct, union, class, enum, typedef, prototypes and macros). Data types and\nconstants/macros can then be identified by querying this database for specific properties,\nincluding properties related to the struct/class memory layout.\n\nFor example, it can answer queries such as:\n\n- **\"find all structures that have a pointer to char at offset 8 and an unsigned integer at offset 56\"**\n- **\"find types with a total size of 96 bytes\"**\n- **\"find every macro that defines the value 0x1234\"**\n- **\"find the mask of values from enum X that correspond to 0xabcd\"**\n- **\"find all functions that return 'size_t' and have 'struct X' as first argument\"**\n\nCcrawl can then output the matching structures in several formats: C/C++ of course,\nbut also ctypes_ or amoco_. The ctypes_ output of a C++ class corresponds to\nan instance (object) layout in memory, including all virtual table pointers (or VTT)\nthat result from possibly multiple parent (possibly virtual) classes.\n\nFinally, Ccrawl can compute various statistics about a library API, as well as\nthe dependency graph of any given type, for example (see tests/samples/xxx/graph.h):\n\n.. 
image:: https://github.com/bdcht/ccrawl/blob/release/doc/g.png\n   :width: 800\n\nUser documentation and the API reference can be found at\n`http://ccrawl.readthedocs.io/en/latest/index.html`\n\nExamples\n========\n\nConsider the following C struct from file *samples/simple.h* ::\n\n  struct S {\n    char c;\n    int  n;\n    union {\n      unsigned char x[2];\n      unsigned short s;\n    } u;\n    char (*PtrCharArrayOf3[2])[3];\n    void (*pfunc)(int, int);\n  };\n\nFirst, collect the structure definition in a local database::\n\n  $ ccrawl -l test.db -g 'test0' collect samples/simple.h\n  [100%] simple.h                                                [  2]\n  --------------------------------------------------------------------\n  saving database...                                            [   2]\n\nThen it is possible to translate the full structure into ctypes_ ::\n\n  $ ccrawl -l test.db show -r -f ctypes 'struct S'\n  struct_S = type('struct_S',(Structure,),{})\n  union_b0eccf67 = type('union_b0eccf67',(Union,),{})\n  union_b0eccf67._fields_ = [(\"x\", c_ubyte*2),\n                             (\"s\", c_ushort)]\n\n  struct_S._anonymous_ = (\"u\",)\n  struct_S._fields_ = [(\"c\", c_byte),\n                       (\"n\", c_int),\n                       (\"u\", union_b0eccf67),\n                       (\"PtrCharArrayOf3\", POINTER(c_byte*3)*2),\n                       (\"pfunc\", POINTER(CFUNCTYPE(None, c_int, c_int)))]\n\nOr simply to compute the field offsets ::\n\n  $ ccrawl -l test.db info 'struct S'\n  identifier: struct S\n  class     : cStruct\n  source    : simple.h\n  tag       : test0\n  size      : 40\n  offsets   : [(0, 1), (4, 4), (8, 2), (16, 16), (32, 8)]\n\nNow let's deal with a trickier C++ example::\n\n  $ ccrawl -l test.db collect -a --cxx samples/shahar.cpp\n  [100%] shahar.cpp                                              [ 18]\n  --------------------------------------------------------------------\n  saving database...                                  
          [  18]\n\nWe can show a *full* (recursive) definition of a class::\n\n  $ ccrawl -l test.db show -r 'class Child'\n  class Grandparent {\n    public:\n      virtual void grandparent_foo();\n      int grandparent_data;\n  };\n  \n  class Parent1 : virtual public Grandparent {\n    public:\n      virtual void parent1_foo();\n      int parent1_data;\n  };\n  class Parent2 : virtual public Grandparent {\n    public:\n      virtual void parent2_foo();\n      int parent2_data;\n  };\n\n  class Child : public Parent1, public Parent2 {\n    public:\n      virtual void child_foo();\n      int child_data;\n  };\n\nAnd its ctypes_ memory layout::\n\n  $ ccrawl -l test.db show -f ctypes 'class Child'\n  struct___layout$Child = type('struct___layout$Child',(Structure,),{})\n  \n  struct___layout$Child._fields_ = [(\"__vptr$Parent1\", c_void_p),\n                                    (\"parent1_data\", c_int),\n                                    (\"__vptr$Parent2\", c_void_p),\n                                    (\"parent2_data\", c_int),\n                                    (\"child_data\", c_int),\n                                    (\"__vptr$Grandparent\", c_void_p),\n                                    (\"grandparent_data\", c_int)]\n\nSee the documentation for more examples.\n\nTodo\n====\n\n- improve C++ support (other layouts)\n- add web frontend\n- plugin for IDA Pro\n\nChangelog\n=========\n\n- `v1.10`_\n\n  * add convert command to translate some C input (from stdin) to other supported formats\n  * update external ghidra interface for use with Ghidra 10.4 (DEV)\n  * add function to colorize the Ghidra listing from tracefile generated by a gdbserver.\n  * allow collecting more file extensions (aka .C, .H, .i)\n\n- `v1.9`_\n\n  * add major preprocessing feature for improving the collect command\n  * add export command to send type definition in Ghidra\n  * update and improve documentation with FreeRTOS example\n  * add 'find_function_with_type' in ghidra 
extension module\n\n- `v1.8`_\n\n  * add graph command to output (in dot format) the dependency graph for a given root structure\n  * add --structs option to stats command which tries to build structures and report missing refs\n  * add find_calls_to method in mongodb proxy class to report collected \"calls\" from function's body\n  * add amoco.system.structs to ccrawl.core converter\n  * fix \"struct volatile\" case (libclang-14)\n  * fix support for bitfield structure with unnamed field in ext.ghidra\n\n- `v1.7`_\n\n  * optionally parse functions' bodies and update 'cFunc' descriptions with parsed infos\n  * add sync command to update mongodb remote database from a rebuilt local database\n  * improve Ghidra's interface to detect structures\n  * add pointer size option to compute structures' fields offsets\n  * fix: adjust enum size to its minimal needed size\n  * fix: apply global tag filter to all queries to the ProxyDB\n  * update to libclang-14\n\n- `v1.6`_\n\n  * add external interface to export types into Ghidra's data type manager\n  * add find_matching_types to replicate the Ghidra's \"auto_struct\" command\n  * add database(s) cleanup methods\n\n- `v1.5`_\n\n  * update code for libclang-12 (using python3-clang)\n  * update to tinydb v4.x\n\n- `v1.4`_\n\n  * update code for libclang-10 (using python3-clang)\n  * improve bitfield support\n\n- `v1.3`_\n\n  * add Flask-based REST API and server command\n  * support for mongodb database backend\n  * support for local tinydb databases\n  * c_type and cxx_type parsers for C/C++ types\n  * support anonymous types in C structs/unions\n  * support C++ multiple inheritance, including virtual parents\n  * basic support for C++ class \u0026 function templates\n  * support bitfield structures\n  * support user-defined alignment policies\n\n.. _clang: https://pypi.org/project/clang/\n.. _ctypes: https://docs.python.org/3.7/library/ctypes.html\n.. _amoco: https://github.com/bdcht/amoco\n.. 
_v1.10: https://github.com/bdcht/ccrawl/releases/tag/v1.10\n.. _v1.9: https://github.com/bdcht/ccrawl/releases/tag/v1.9\n.. _v1.8: https://github.com/bdcht/ccrawl/releases/tag/v1.8\n.. _v1.7: https://github.com/bdcht/ccrawl/releases/tag/v1.7\n.. _v1.6: https://github.com/bdcht/ccrawl/releases/tag/v1.6\n.. _v1.5: https://github.com/bdcht/ccrawl/releases/tag/v1.5\n.. _v1.4: https://github.com/bdcht/ccrawl/releases/tag/v1.4\n.. _v1.3: https://github.com/bdcht/ccrawl/releases/tag/v1.3\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbdcht%2Fccrawl","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbdcht%2Fccrawl","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbdcht%2Fccrawl/lists"}