{"id":13667495,"url":"https://github.com/SRI-CSL/gllvm","last_synced_at":"2025-04-26T15:33:02.654Z","repository":{"id":22200652,"uuid":"95579616","full_name":"SRI-CSL/gllvm","owner":"SRI-CSL","description":"Whole Program LLVM: wllvm ported to go","archived":false,"fork":false,"pushed_at":"2024-04-28T19:23:43.000Z","size":986,"stargazers_count":315,"open_issues_count":13,"forks_count":34,"subscribers_count":31,"default_branch":"master","last_synced_at":"2025-04-25T00:52:37.830Z","etag":null,"topics":["bitcode","bitcode-files","bitcode-generation","clang","compilers","klee","llvm"],"latest_commit_sha":null,"homepage":"","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/SRI-CSL.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-06-27T16:37:23.000Z","updated_at":"2025-04-23T07:46:23.000Z","dependencies_parsed_at":"2023-01-11T21:31:23.532Z","dependency_job_id":"afd69f8b-bad3-4f85-9c66-06dcd79748ba","html_url":"https://github.com/SRI-CSL/gllvm","commit_stats":null,"previous_names":[],"tags_count":13,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SRI-CSL%2Fgllvm","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SRI-CSL%2Fgllvm/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SRI-CSL%2Fgllvm/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SRI-CSL%2Fgllvm/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/SRI-CSL","download_url"
:"https://codeload.github.com/SRI-CSL/gllvm/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":251008928,"owners_count":21522199,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bitcode","bitcode-files","bitcode-generation","clang","compilers","klee","llvm"],"created_at":"2024-08-02T07:00:38.362Z","updated_at":"2025-04-26T15:32:57.644Z","avatar_url":"https://github.com/SRI-CSL.png","language":"Go","readme":"\u003cp align=\"center\"\u003e\n\u003cimg align=\"center\" src=\"data/dragon128x128.png?raw_true\"\u003e\n\u003c/p\u003e\n\n# Whole Program LLVM in Go\n\n[![License](https://img.shields.io/badge/License-BSD%203--Clause-blue.svg)](https://opensource.org/licenses/BSD-3-Clause)\n[![Build Status](https://travis-ci.org/SRI-CSL/gllvm.svg?branch=master)](https://travis-ci.org/SRI-CSL/gllvm)\n[![Go Report Card](https://goreportcard.com/badge/github.com/SRI-CSL/gllvm)](https://goreportcard.com/report/github.com/SRI-CSL/gllvm)\n\n**TL; DR:**  A drop-in replacement for [wllvm](https://github.com/SRI-CSL/whole-program-llvm), that builds the\nbitcode in parallel, and is faster. 
A comparison between the two tools can be gleaned from building the [Linux kernel.](https://github.com/SRI-CSL/gllvm/tree/master/examples/linux-kernel)\n\n## Quick Start Comparison Table\n\n| wllvm command/env variable  | gllvm command/env variable  |\n|-----------------------------|-----------------------------|\n|  wllvm                      | gclang                      |\n|  wllvm++                    | gclang++                    |\n|  wfortran                   | gflang                      |\n|  extract-bc                 | get-bc                      |\n|  wllvm-sanity-checker       | gsanity-check               |\n|  LLVM_COMPILER_PATH         | LLVM_COMPILER_PATH          |\n|  LLVM_CC_NAME      ...      | LLVM_CC_NAME          ...   |\n|                             | LLVM_F_NAME                 |\n|  WLLVM_CONFIGURE_ONLY       | WLLVM_CONFIGURE_ONLY        |\n|  WLLVM_OUTPUT_LEVEL         | WLLVM_OUTPUT_LEVEL          |\n|  WLLVM_OUTPUT_FILE          | WLLVM_OUTPUT_FILE           |\n|  LLVM_COMPILER              | *not supported* (clang only)|\n|  LLVM_GCC_PREFIX            | *not supported* (clang only)|\n|  LLVM_DRAGONEGG_PLUGIN      | *not supported* (clang only)|\n|  LLVM_LINK_FLAGS            | LLVM_LINK_FLAGS             |\n\n\nThis project, `gllvm`, provides tools for building whole-program (or\nwhole-library) LLVM bitcode files from an unmodified C or C++\nsource package. It currently runs on `*nix` platforms such as Linux,\nFreeBSD, and Mac OS X. It is a Go port of [wllvm](https://github.com/SRI-CSL/whole-program-llvm).\n\n`gllvm` provides compiler wrappers that work in two\nphases. The wrappers first invoke the compiler as normal. Then, for\neach object file, they call a bitcode compiler to produce LLVM\nbitcode. The wrappers then store the location of the generated bitcode\nfile in a dedicated section of the object file.  
When object files are\nlinked together, the contents of the dedicated sections are\nconcatenated (so we don't lose the locations of any of the constituent\nbitcode files). After the build completes, one can use a `gllvm`\nutility to read the contents of the dedicated section and link all of\nthe bitcode into a single whole-program bitcode file. This utility\nworks for both executables and native libraries.\n\nFor more details see [wllvm](https://github.com/SRI-CSL/whole-program-llvm).\n\n## Prerequisites\n\nTo install `gllvm` you need the Go language [tool](https://golang.org/doc/install).\n\nTo use `gllvm` you need clang/clang++/flang and the LLVM tools `llvm-link` and `llvm-ar`.\n`gllvm` is agnostic to the actual LLVM version. `gllvm` also relies on standard build\ntools such as `objcopy` and `ld`.\n\n\n## Installation\n\nTo install, run the following (making sure to include the `...`):\n```\ngo install github.com/SRI-CSL/gllvm/cmd/...@latest\n```\nThis should install six binaries: `gclang`, `gclang++`, `gflang`, `get-bc`, `gparse`, and `gsanity-check`\nin the `$GOPATH/bin` directory. \n\n## Usage\n\n`gclang` and\n`gclang++` are the wrappers used to compile C and C++.  \n`gflang` is the wrapper used to compile Fortran.\n`get-bc` is used for\nextracting the bitcode from a build product (either an object file, executable, library\nor archive). `gsanity-check` can be used for detecting configuration errors. `gparse` can be used to examine how `gllvm` parses compiler/linker lines.\n\nHere is a simple example. Assuming that clang is in your `PATH`, you can build\nbitcode for `pkg-config` as follows:\n\n```\ntar xf pkg-config-0.26.tar.gz\ncd pkg-config-0.26\nCC=gclang ./configure\nmake\n```\n\nThis should produce the executable `pkg-config`. To extract the bitcode:\n```\nget-bc pkg-config\n```\n\nwhich will produce the bitcode module `pkg-config.bc`. 
For more on this example\nsee [here](https://github.com/SRI-CSL/gllvm/tree/master/examples/pkg-config).\n\n## Advanced Configuration\n\nIf clang and the llvm tools are not in your `PATH`, you will need to set some\nenvironment variables.\n\n\n * `LLVM_COMPILER_PATH` can be set to the absolute path of the directory that\n   contains the compiler and the other LLVM tools to be used.\n\n * `LLVM_CC_NAME` can be set if your clang compiler is not called `clang` but\n    something like `clang-3.7`. Similarly `LLVM_CXX_NAME` and `LLVM_F_NAME` can be used to\n    describe what the C++ and Fortran compilers are called, respectively. We also pay attention to the\n    environment variables `LLVM_LINK_NAME` and `LLVM_AR_NAME` in an\n    analogous way.\n\nAnother useful, and sometimes necessary, environment variable is `WLLVM_CONFIGURE_ONLY`.\n\n* `WLLVM_CONFIGURE_ONLY` can be set to anything. If it is set, `gclang`\n   and `gclang++` behave like a normal C or C++ compiler. They do not\n   produce bitcode.  Setting `WLLVM_CONFIGURE_ONLY` may prevent\n   configuration errors caused by the unexpected production of hidden\n   bitcode files. It is sometimes required when configuring a build.\n   For example:\n   ```\n   WLLVM_CONFIGURE_ONLY=1 CC=gclang ./configure\n   make\n   ```\n\n## Extracting the Bitcode\n\nThe `get-bc` tool is used to extract the bitcode from a build artifact, such as an executable, object file, thin archive, archive, or library. In the simplest use case, as seen above,\none simply does:\n\n```\nget-bc -o \u003cname of bitcode file\u003e \u003cpath to executable\u003e\n```\nThis will produce the desired bitcode file. The situation is similar for an object file.\nFor an archive or library, there is a choice as to whether you produce a bitcode module\nor a bitcode archive. 
This choice is made by using the `-b` switch.\n\nAnother useful switch is the `-m` switch which, in addition to producing the\nbitcode, also produces a manifest of the bitcode files\nthat made up the final product. As is typical\n\n```\nget-bc -h\n```\nwill list all the command-line switches. Since we use the Go `flag` package,\nthe switches must precede the artifact path.\n\n\n\n## Preserving bitcode files in a store\n\nSometimes, because of pathological build systems, it can be useful\nto preserve the bitcode files produced in a\nbuild, either to prevent deletion or to retrieve them later. If the\nenvironment variable `WLLVM_BC_STORE` is set to the absolute path of\nan existing directory,\nthen `gllvm` will copy each produced bitcode file into that directory.\nThe name of the copied bitcode file is the hash of the path to the\noriginal bitcode file.  For convenience, when using both the manifest\nfeature of `get-bc` and the store, the manifest will contain both\nthe original path and the store path.\n\n## Debugging\n\n\nThe `gllvm` tools can show various levels of output to aid with debugging.\nTo show this output, set the `WLLVM_OUTPUT_LEVEL` environment\nvariable to one of the following levels:\n\n * `ERROR`\n * `WARNING`\n * `AUDIT`\n * `INFO`\n * `DEBUG`\n\nFor example:\n```\n    export WLLVM_OUTPUT_LEVEL=DEBUG\n```\nThe `AUDIT` level, new in 2022, logs only the calls to the compiler, and indicates\nwhether each call is *compiling* or *linking*, the compiler used, and the arguments provided.\n\nOutput will be directed to the standard error stream, unless you specify the\npath of a logfile via the `WLLVM_OUTPUT_FILE` environment variable.\nFor example:\n```\n    export WLLVM_OUTPUT_FILE=/tmp/gllvm.log\n```\n\n## Dragons Begone\n\n`gllvm` does not support the dragonegg plugin.\n\n\n## Sanity Checking\n\nToo many environment variables? 
Try doing a sanity check:\n\n```\ngsanity-check\n```\nIt might point out what is wrong.\n\n\n\n## Under the hood\n\n\nBoth the `wllvm` and `gllvm` toolsets do much the same thing, but the way\nthey do it is slightly different. The `gllvm` toolset's code base is\nwritten in `golang`, and is largely derived from `wllvm`'s Python\ncodebase.\n\nBoth generate object files and bitcode files using the\ncompiler. `wllvm` can use `gcc` and `dragonegg`; `gllvm` can only use\n`clang`. The `gllvm` toolset does these two tasks in parallel, while\n`wllvm` does them sequentially.  This, together with the slowness of\nPython's `fork exec`-ing and its interpreted nature, accounts for the\nlarge efficiency gap between the two toolsets.\n\nBoth inject the path of the bitcode version of the `.o` file into a\ndedicated segment of the `.o` file itself. This segment is the same\nacross toolsets, so extracting the bitcode can be done by the\nappropriate tool in either toolset. On `*nix` both toolsets use\n`objcopy` to add the segment, while on OS X they use `ld`.\n\nWhen the object files are linked into the resulting library or\nexecutable, the bitcode path segments are appended, so the resulting\nbinary contains the paths of all the bitcode files that constitute the\nbinary.  To extract the sections the `gllvm` toolset uses the golang\npackages `\"debug/elf\"` and `\"debug/macho\"`, while the `wllvm` toolset\nuses `objdump` on `*nix`, and `otool` on OS X.\n\nBoth tools then use `llvm-link` or `llvm-ar` to combine the bitcode\nfiles into the desired form.\n\n## Customization under the hood.\n\nYou can specify the exact version of `objcopy` and `ld` that `gllvm` uses\nto manipulate the artifacts by setting the `GLLVM_OBJCOPY` and `GLLVM_LD`\nenvironment variables. For more details of what's under the `gllvm` hood, try\n```\ngsanity-check -e\n```\n\n## Customizing the BitCode Generation (e.g. 
LTO)\n\nIn some situations it is desirable to pass certain flags to `clang` in the step that\nproduces the bitcode. This can be done by setting the\n`LLVM_BITCODE_GENERATION_FLAGS` environment variable to the desired\nflags, for example `\"-flto -fwhole-program-vtables\"`.\n\nIn other situations it is desirable to pass certain flags to `llvm-link` in the step\nthat merges multiple individual bitcode files together (i.e., within `get-bc`).\nThis can be done by setting the `LLVM_LINK_FLAGS` environment variable to\nthe desired flags, for example `\"-internalize -only-needed\"`.\n\n## Beware of link time optimization.\n\nIf the package you are building happens to take advantage of recent `clang` developments \nsuch as *link time optimization* (indicated by the presence of the compiler flag `-flto`), then\nyour build is unlikely to produce anything that `get-bc` will work on. This is to be\nexpected. When working under these flags, the compiler actually produces object files that are bitcode.\nYour only recourse here is to try to save these object files and retrieve them yourself.\nThis can be done by setting the `LTO_LINKING_FLAGS` environment variable to something like\n`\"-g -Wl,-plugin-opt=save-temps\"`, which will be appended to the flags at link time.\nThis will at least preserve the bitcode files, even if `get-bc` will not be able to retrieve them for you.\n\n## Cross-compilation notes\n\nWhen cross-compiling a project (i.e. 
you pass the `--target=` or `-target` flag to the compiler), \nyou'll need to set the `GLLVM_OBJCOPY` variable to either \n* `llvm-objcopy` to use LLVM's objcopy, which naturally supports all targets that clang does.\n* `YOUR-TARGET-TRIPLE-objcopy` to use GNU's objcopy, since `objcopy` only supports the native architecture.\n\nExample:\n```sh\n# test program\necho 'int main() { return 0; }' \u003e a.c \nclang --target=aarch64-linux-gnu a.c # works\ngclang --target=aarch64-linux-gnu a.c # breaks\nGLLVM_OBJCOPY=llvm-objcopy gclang --target=aarch64-linux-gnu a.c # works\nGLLVM_OBJCOPY=aarch64-linux-gnu-objcopy gclang --target=aarch64-linux-gnu a.c # works if you have GNU's arm64 toolchain\n```\n\n## Developer tools\n\nDebugging usually boils down to looking in the logs, maybe adding a print statement or two.\nThere is an additional executable, not mentioned above, called `gparse` that gets installed \nalong with `gclang`, `gclang++`, `gflang`, `get-bc` and `gsanity-check`. `gparse` takes the command line\narguments to the compiler, and outputs how it parsed them. This can sometimes be helpful.\n\n## License\n\n`gllvm` is released under a BSD license. See the file `LICENSE` for [details.](LICENSE)\n\n---\n\nThis material is based upon work supported by the National Science\nFoundation under Grant\n[ACI-1440800](http://www.nsf.gov/awardsearch/showAward?AWD_ID=1440800). Any\nopinions, findings, and conclusions or recommendations expressed in\nthis material are those of the author(s) and do not necessarily\nreflect the views of the National Science Foundation.\n","funding_links":[],"categories":["Go"],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FSRI-CSL%2Fgllvm","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FSRI-CSL%2Fgllvm","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FSRI-CSL%2Fgllvm/lists"}