{"id":20683227,"url":"https://github.com/tudasc/metacg","last_synced_at":"2025-04-22T12:23:45.739Z","repository":{"id":110365013,"uuid":"286386642","full_name":"tudasc/MetaCG","owner":"tudasc","description":"MetaCG offers an annotated whole program call-graph tool for Clang/LLVM.","archived":false,"fork":false,"pushed_at":"2024-11-14T14:28:30.000Z","size":10365,"stargazers_count":36,"open_issues_count":2,"forks_count":5,"subscribers_count":6,"default_branch":"master","last_synced_at":"2024-11-14T15:31:10.373Z","etag":null,"topics":["call-graph","clang","llvm","whole-program-analysis"],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/tudasc.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":"AUTHORS","dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-08-10T05:43:50.000Z","updated_at":"2024-10-23T11:05:24.000Z","dependencies_parsed_at":null,"dependency_job_id":"e0733e9c-139c-41a9-b64f-89a181d1be67","html_url":"https://github.com/tudasc/MetaCG","commit_stats":null,"previous_names":[],"tags_count":10,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tudasc%2FMetaCG","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tudasc%2FMetaCG/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tudasc%2FMetaCG/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tudasc%2FMetaCG/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/tudasc",
"download_url":"https://codeload.github.com/tudasc/MetaCG/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":224974661,"owners_count":17401108,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["call-graph","clang","llvm","whole-program-analysis"],"created_at":"2024-11-16T22:15:58.068Z","updated_at":"2024-11-16T22:15:58.548Z","avatar_url":"https://github.com/tudasc.png","language":"C++","funding_links":[],"categories":[],"sub_categories":[],"readme":"[![License](https://img.shields.io/badge/License-BSD%203--Clause-blue.svg)](https://opensource.org/licenses/BSD-3-Clause)\n\n# MetaCG\n\nMetaCG provides a common whole-program call-graph representation to exchange information between different tools built on top of LLVM/Clang.\nIt uses the json file format and separates structure from information, i.e., caller/callee relation and *meta* information.\n\nThe package contains an experimental Clang-based tool for call-graph construction, and a converter for the output files of [Phasar](https://github.com/secure-software-engineering/phasar).\nAlso, the package contains the PGIS analysis tool, which is used as the analysis backend in [PIRA](https://github.com/tudasc/pira).\n\nOnce constructed, the graph can be serialized into JSON.\n\nThe JSON structure follows either the version two (MetaCGV2) or version three (MetaCGV3) specification.  \nMetaCGV3 is more suitable for a wider range of applications due to having less necessary attributes. 
\nFor any given function it requires the name and origin of the function and whether its definition is available.\nIt is also usually more space efficient than MetaCGV2 due to hashed function names, and it allows metadata to be attached not only to nodes but also to edges.  \nThe MetaCGV3 specification also includes a more human-readable representation for debugging purposes.  \nIt forgoes the hashing of names and explicitly lists callers and callees in exchange for increased file size.  \n\u003ctable\u003e\n\u003ctr\u003e\n\u003ctd\u003e\nMetaCGV3 format\n\u003c/td\u003e\n\u003ctd\u003e\nMetaCGV3 debug format\n\u003c/td\u003e\n\u003c/tr\u003e\n\u003ctr\u003e\n\u003ctd\u003e\n\u003cpre\u003e\n{\n  \"_MetaCG\":{\n    \"version\":\"3.0\",\n    \"generator\":{\n      \"name\":\"ToolName\",\n      \"sha\":\"GitSHA\",\n      \"version\":\"ToolVersion\"\n    }\n  },\n  \"_CG\":{\n    \"edges\":[\n      [\n        [11868120863286193964,9631199822919835226],\n        {\"EdgeMetadata\":{\"isHotEdge\":true}}\n      ]\n    ],\n    \"nodes\":[\n      [ 9631199822919835226,{\n        \u003cbr\u003e\n        \"functionName\":\" foo\",\n        \"hasBody\":true,\n        \"meta\":{\n          \"NumInstructionsMetadata\":{\"instructions\":5}\n        },\n        \"origin\":\"main.cpp\"\n        }\n      ],\n      [11868120863286193964,{\n        \u003cbr\u003e\n        \"functionName\":\"main\",\n        \"hasBody\":true,\n        \"meta\":{\n          \"NumInstructionsMetadata\":{\"instructions\":8}\n        },\n        \"origin\":\"main.cpp\"\n        }\n      ]\n    ]\n  }\n}\n\u003c/pre\u003e\n\u003c/td\u003e\n\u003ctd\u003e\n\u003cpre\u003e\n{\n  \"_MetaCG\":{\n    \"version\":\"3.0\",\n    \"generator\":{\n      \"name\":\"ToolName\",\n      \"sha\":\"GitSHA\",\n      \"version\":\"ToolVersion\"\n    }\n  },\n  \"_CG\":{\n    \"edges\":[\n      [\n        [11868120863286193964,9631199822919835226],\n        {\"EdgeMetadata\":{\"isHotEdge\":true}}\n      ]\n    ],\n    \"nodes\":[\n      [ 
\"foo\",{\n        \"callees\":[],\n        \"callers\":[\"main\"],\n        \"functionName\":\" foo\",\n        \"hasBody\":true,\n        \"meta\":{\n          \"NumInstructionsMetadata\":{\"instructions\":5}\n        },\n        \"origin\":\"main.cpp\"\n        }\n      ],\n      [ \"main\",{\n        \"callees\":[\"foo\"],\n        \"callers\":[],\n        \"functionName\":\"main\",\n        \"hasBody\":true,\n        \"meta\":{\n          \"NumInstructionsMetadata\":{\"instructions\":8}\n        },\n        \"origin\":\"main.cpp\"\n        }\n      ]\n    ]\n  }\n}\n\u003c/pre\u003e\n\u003c/td\u003e\n\u003c/tr\u003e\n\n\u003c/table\u003e\n\nThe version-two specification (MetaCGV2) contains the following information:\n```{.json}\n{\n \"_MetaCG\": {\n   \"version\": \"2.0\",\n   \"generator\": {\n    \"name\": \"ToolName\",\n    \"version\": \"ToolVersion\"\n    }\n  },\n  \"_CG\": {\n   \"main\": {\n    \"callers\": [],\n    \"callees\": [\"foo\"],\n    \"hasBody\": true,\n    \"isVirtual\": false,\n    \"doesOverride\": false,\n    \"overrides\": [],\n    \"overriddenBy\": [],\n    \"meta\": {\n     \"MetaTool\": {}\n    }\n   }\n  }\n}\n```\n\n- *_MetaCG* contains information about the file itself, such as the file format version.\n- *_CG* contains the actual serialized call graph. 
For each function, it lists\n  - *callers*: A set of functions that this function may be called from.\n  - *callees*: A set of functions that this function may call.\n  - *hasBody*: Whether a definition of the function was found in the processed source files.\n  - *isVirtual*: Whether the function is marked as virtual.\n  - *doesOverride*: Whether this function overrides another function.\n  - *overrides*: A set of functions that this function overrides.\n  - *overriddenBy*: A set of functions that this function is overridden by.\n  - *meta*: A special field into which a tool can export its (intermediate) results.\n\nThe version two specification is mainly used by the PIRA profiler to export additional information about the program's functions into the MetaCG file.\nIt uses the meta field to export, e.g., empirically determined performance models, runtime measurements, or loop nesting depth per function.\n\n## Requirements and Building\n\nMetaCG consists of the graph library, a CG construction tool, and an example analysis tool.\nThe graph library is always built, while the CGCollector and the PGIS tool are optional and can be enabled at configure time.\n\nWe test MetaCG internally using GCC 10 with Clang 10 and 14, and GCC 11 with Clang 13 and 14.\nOther version combinations *may* work.\n\n**Build Requirements (for graph lib)**\n- nlohmann/json library [github](https://github.com/nlohmann/json)\n- spdlog [github](https://github.com/gabime/spdlog)\n\n**Additional Build Requirements (for full build)**\n- Clang/LLVM version 10 (and above)\n- Cube 4.5 [scalasca.org](https://www.scalasca.org/software/cube-4.x/download.html)\n- Extra-P 3.0 [.tar.gz](http://apps.fz-juelich.de/scalasca/releases/extra-p/extrap-3.0.tar.gz)\n- cxxopts [github](https://github.com/jarro2783/cxxopts)\n- PyQt5\n- matplotlib\n\n### Building\n\n#### Graph Library Only\n\nThe default is to build only the graph library.\nThe build requirements are downloaded at configure 
time.\nWhile CMake looks for `nlohmann-json` version 3.10, MetaCG should work starting from version 3.9.2.\nFor spdlog, we rely on version 1.8.2 -- other versions *may* work.\nIf you do not want to download at configure time, please use the respective CMake options listed below.\n\n```{.sh}\n# To build the MetaCG library w/o PGIS or CGCollector\n$\u003e cmake -S . -B build -G Ninja -DCMAKE_INSTALL_PREFIX=/where/to/install\n$\u003e cmake --build build --parallel\n$\u003e cmake --install build\n```\n\n#### Full Package Build\n\nYou can configure MetaCG to also build CGCollector and PGIS.\nThis requires additional dependencies.\nClang/LLVM (in a supported version) and the Cube library are assumed to be available on the system.\nExtra-P can be built using the `build_submodules.sh` script provided in the repository, though the script is not tested outside of our CI system.\nIt builds and installs Extra-P into `./deps/src` and `./deps/install`, respectively.\n\n```{.sh}\n$\u003e ./build_submodules.sh\n```\n\nThereafter, the package can be configured and built from the top-level CMake.\nChange the `CMAKE_INSTALL_PREFIX` to where you want your MetaCG installation to live.\n\n```{.sh}\n# To build the MetaCG library w/ PGIS and CGCollector\n$\u003e cmake -S . 
-B build -DCMAKE_INSTALL_PREFIX=/where/to/install -DCUBE_LIB=$(dirname $(which cube_info))/../lib -DCUBE_INCLUDE=$(dirname $(which cube_info))/../include/cubelib -DEXTRAP_INCLUDE=./extern/src/extrap/extrap-3.0/include -DEXTRAP_LIB=./extern/install/extrap/lib\n$\u003e cmake --build build --parallel\n# Installation installs CGCollector, CGMerge, CGValidate, PGIS\n$\u003e cmake --install build\n```\n\n#### General CMake Options\n\nThese options are common for the MetaCG package.\n\n- Bool `METACG_BUILD_CGCOLLECTOR`: Whether to build the call-graph construction tool \u003cdefault=OFF\u003e\n- Bool `METACG_BUILD_PGIS`: Whether to build the demo-analysis tool \u003cdefault=OFF\u003e\n- Bool `METACG_USE_EXTERNAL_JSON`: Search for an installed version of nlohmann-json \u003cdefault=OFF\u003e\n\n#### PGIS CMake Options\n\nThese options are required when building with `METACG_BUILD_PGIS=ON`.\n\n- Path `CUBE_LIB`: Path to the libcube library directory\n- Path `CUBE_INCLUDE`: Path to the libcube include directory\n- Path `EXTRAP_LIB`: Path to the Extra-P library directory\n- Path `EXTRAP_INCLUDE`: Path to the Extra-P include directory\n\n## Usage\n\n### Graph Library\n\nProvides the basic data structures and the means to read and write the call graph with its metadata to a JSON file.\nTo include MetaCG as a library in your project, install it, find the package, and link the library.\nThis pulls in all required dependencies and compile flags.\n\n```\n# In your project's CMake\nfind_package(MetaCG \u003cVERSION_STR\u003e REQUIRED)\n# Assuming you have a target MainApp\ntarget_link_libraries(MainApp metacg::metacg)\n```\n\n### CGCollector\nClang-based call-graph generation tool for MetaCG.\nIt has the components CGCollector, CGMerge and CGValidate to construct the partial MCG per translation unit, merge the partial MCGs into the final whole-program MCG and validate edges against a full Score-P profile, respectively.\n\n\n#### Using CGCollector\n\nIt is easiest to apply CGCollector 
when a compilation database (`compile_commands.json`) is present.\nThen, CGCollector can be applied to a single source file using\n\n```{.sh}\n$\u003e cgc target.cpp\n```\n\n`cgc` is a wrapper script that tries to determine the paths to the Clang standard includes.\n\nSubsequently, the resulting partial MCGs are merged using `CGMerge` to create the final, whole-program call-graph of the application.\n\n```{.sh}\n$\u003e echo \"null\" \u003e $IPCG_FILENAME\n$\u003e find ./src -name \"*.mcg\" -exec cgmerge $IPCG_FILENAME $IPCG_FILENAME {} +\n```\n\n##### CGCollector / CGMerge on Multi-File Projects\n\nThe easiest approach to apply the CGCollector / CGMerge toolchain to a multi-file project is using the `TargetCollector.py` tool.\nIt is a convenience tool around CMake's file API that lets you configure the target project and apply CGCollector / CGMerge to only the source files required for a given CMake target.\nCheck out the `graph/test/integration/TargetCollector/TestRunner.sh` script for an example invocation.\n\nIn case you want to apply the CGCollector / CGMerge toolchain to a non-CMake project, you need to resort to manually finding the files that need to be processed and merged for the given use case.\n\n#### Validation of Generated Callgraph\n\nOptionally, you can test the call graph for missing edges by providing an *unfiltered* application profile that was recorded using [Score-P](https://www.vi-hps.org/projects/score-p) in the [Cube](https://www.scalasca.org/scalasca/software/cube-4.x/download.html) library format.\nThis is done using the CGValidate tool, which can also patch all missing edges and nodes.\n\n\n### PGIS (The PIRA Analyzer)\n\nThis tool is part of the [PIRA](https://github.com/tudasc/pira) toolchain.\nIt is used as the analysis engine and constructs instrumentation selections guided by both static and dynamic information from a target application.\n\n#### Using PGIS\n\nThe PGIS tool offers a variety of modes to operate.\nA list of 
all modes and options can be found with `pgis_pira --help`.\nCurrently, the user needs to provide the `parameter-file` required by the performance model guided instrumentation or the load imbalance detection.\nExamples of such files can be found in the respective heuristic's integration test directory.\n\n\n1. Construct overview instrumentation configurations for Score-P from a MetaCG file.\n\n```{.sh}\n$\u003e pgis_pira --metacg-format 2 --static mcg-file\n```\n\n2. Construct hot-spot instrumentation using raw runtime values.\n\n```{.sh}\n$\u003e pgis_pira --metacg-format 2 --cube cube-file mcg-file\n```\n\n3. Construct performance model guided instrumentation configurations for Score-P using Extra-P.\nThe Extra-P configuration lists where to find the experiment series.\nIts content follows what is expected by Extra-P.\n\n```{.json}\n{\n \"dir\": \"./002\",\n \"prefix\": \"t\",\n \"postfix\": \"postfix\",\n \"reps\": 5,\n \"iter\": 1,\n \"params\" : {\n  \"X\": [\"3\", \"5\", \"7\", \"9\", \"11\"]\n }\n}\n```\n\n```{.sh}\n$\u003e pgis_pira --metacg-format 2 --parameter-file \u003cparameter-file\u003e --extrap extrap-file mcg-file\n```\n\n4. Evaluate and construct load-imbalance instrumentation configuration.\n\n```{.sh}\n$\u003e pgis_pira --metacg-format 2 --parameter-file \u003cparameter-file\u003e --lide 1 --cube cube-file mcg-file\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftudasc%2Fmetacg","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftudasc%2Fmetacg","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftudasc%2Fmetacg/lists"}