{"id":18422454,"url":"https://github.com/sri-csl/llvm2smt","last_synced_at":"2025-04-07T14:32:53.875Z","repository":{"id":32502036,"uuid":"36082790","full_name":"SRI-CSL/llvm2smt","owner":"SRI-CSL","description":"Experimental translation of llvm to smt.","archived":false,"fork":false,"pushed_at":"2020-04-08T17:02:14.000Z","size":17684,"stargazers_count":56,"open_issues_count":2,"forks_count":14,"subscribers_count":19,"default_branch":"master","last_synced_at":"2025-03-22T19:45:58.514Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"LLVM","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/SRI-CSL.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2015-05-22T16:28:54.000Z","updated_at":"2024-11-08T18:39:58.000Z","dependencies_parsed_at":"2022-09-05T09:00:34.380Z","dependency_job_id":null,"html_url":"https://github.com/SRI-CSL/llvm2smt","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SRI-CSL%2Fllvm2smt","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SRI-CSL%2Fllvm2smt/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SRI-CSL%2Fllvm2smt/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/SRI-CSL%2Fllvm2smt/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/SRI-CSL","download_url":"https://codeload.github.com/SRI-CSL/llvm2smt/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247670226,"owners_count":209765
31,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-06T04:30:11.249Z","updated_at":"2025-04-07T14:32:50.544Z","avatar_url":"https://github.com/SRI-CSL.png","language":"LLVM","funding_links":[],"categories":[],"sub_categories":[],"readme":"# llvm2smt\n\nExperimental translation of LLVM (3.5ish) IR to SMT-LIB.\n\n\nOverview\n=============\n\nThis tool, llvm2smt, parses a llvm bitcode file (in its human readable form) and\ntranslates it to a symbolic SMT-LIB representation.\n\nCurrently the resulting SMT-LIB file uses the theory of bitvectors and arrays (QF_ABV).\n\nThe goal is to support symbolic analyses, such as bounded model checking, using\nSMT solvers.\n\nThe tool is in its infancy and only translates the llvm IR as it appears. Any logical\nproperties one might want to verify must be added by hand.\n\n\nA Simple Example\n==============\n\nThe file `test/shufflevector.ll` includes two simple functions written\nby hand.  
The first function `@lhs` takes two integers, stores them\nin a two-element vector, shuffles the vector twice, then returns the first\nvector element.\n\n```llvm\n; Function Attrs: nounwind ssp uwtable\ndefine i32 @lhs(i32 %a, i32 %b) #0 {\n  %1 = insertelement \u003c2 x i32\u003e undef, i32 %a, i32 0\n  %2 = insertelement \u003c2 x i32\u003e %1, i32 %b, i32 1\n  %3 = shufflevector \u003c2 x i32\u003e %2, \u003c2 x i32\u003e undef, \u003c2 x i32\u003e \u003ci32 1, i32 0\u003e\n  %4 = shufflevector \u003c2 x i32\u003e %3, \u003c2 x i32\u003e undef, \u003c2 x i32\u003e \u003ci32 1, i32 0\u003e\n  %5 = extractelement \u003c2 x i32\u003e %4, i32 0\n  ret i32 %5\n}\n\n```\n\nThe second function `@rhs` does the same thing without the shuffles.\n\n```llvm\n; Function Attrs: nounwind ssp uwtable\ndefine i32 @rhs(i32 %a, i32 %b) #0 {\n  %1 = insertelement \u003c2 x i32\u003e undef, i32 %a, i32 0\n  %2 = insertelement \u003c2 x i32\u003e %1, i32 %b, i32 1\n  %3 = extractelement \u003c2 x i32\u003e %2, i32 0\n  ret i32 %3\n}\n\n```\n\nWe can show that these functions are equivalent by first translating the LLVM IR\nto SMT-LIB via:\n\n```shell\n\u003e llvm2smt shufflevector.ll \u003e shufflevector.smt\n```\nFunction `@rhs` is translated to the following SMT-LIB statements. In SMT-LIB the character `@` can be controversial,\nso we replace it with `_@`.\n\n```smt\n;; Function: |_@rhs|\n;; (i32 %a, i32 %b)\n(declare-fun memory2 () Mem)\n(define-fun rsp2 () (_ BitVec 64) (_ bv0 64))\n(declare-fun |%a_@rhs| () (_ BitVec 32))\n(declare-fun |%b_@rhs| () (_ BitVec 32))\n\n;; BLOCK %0 with index 0 and rank = 1\n;; Predecessors:\n;; |_@rhs_block_0_entry_condition| \n(define-fun |_@rhs_block_0_entry_condition| () Bool true)\n;; %1 = insertelement \u003c2 x i32\u003e undef, i32 %a, i32 0\n(define-fun |%1_@rhs| () (Array (_ BitVec 1) (_ BitVec 32)) (store vzero_1_32 ((_ extract 0 0) (_ bv0 32)) |%a_@rhs|))\n;; %2 = insertelement \u003c2 x i32\u003e %1, i32 %b, i32 1\n(define-fun |%2_@rhs| 
() (Array (_ BitVec 1) (_ BitVec 32)) (store |%1_@rhs| ((_ extract 0 0) (_ bv1 32)) |%b_@rhs|))\n;; %3 = extractelement \u003c2 x i32\u003e %2, i32 0\n(define-fun |%3_@rhs| () (_ BitVec 32) (select |%2_@rhs| ((_ extract 0 0) (_ bv0 32))))\n;; ret i32 %3\n;; No backward arrows\n\n\n(define-fun |_@rhs_result| () (_ BitVec 32) |%3_@rhs|)\n```\nThe key points are:\n\n1. The function takes two input arguments denoted by `|%a_@rhs|` and `|%b_@rhs|`. Both \nare bitvectors of length 32.\n\n2. The return value of the function is denoted by `|_@rhs_result|`.\n\nThe other function is encoded similarly.\n\nTo check whether these two functions are equivalent, we add the following two SMT-LIB commands\nat the end of the file:\n\n```smt\n(assert (and (= |%a_@lhs| |%a_@rhs|) (= |%b_@lhs| |%b_@rhs|) (not (= |_@lhs_result| |_@rhs_result|))))\n(check-sat)\n```\n\nThis tests whether the functions `@lhs` and `@rhs` can produce different results when run on the same input.\n\nWe can then give the entire file to an SMT solver, such as `yices-smt2`, to conclude:\n\n```shell\n\u003e yices-smt2 shufflevector.smt\nunsat\n```\nAs expected, the assertion is not satisfiable: if we give both functions the same input, they produce the same result.\n\n\n\nCompilation\n==============\n\n`llvm2smt` is written in OCaml. It is known to compile with OCaml 4.02.1,\nbut other versions may work too. Standard OCaml tools are required,\nincluding `ocamllex`, `ocamlyacc`, and `ocamldep`. \n\nInstalling OCaml is reasonably easy. Check the instructions at\nhttps://ocaml.org/docs/install.html.\n\nOnce you have OCaml, go to the `./src` directory, then type\n\n```shell\n\u003e make\n```\n\nThis will build two main executables:\n\n1. `parse` is based on Trevor Jim's [parser](https://github.com/tjim/smpcc/blob/master/compiler/)\n    for LLVM assembly language (`.ll` suffix).\n    It can be used to check that our tool properly parses LLVM.\n\n2. `llvm2smt` is the main tool. 
It produces an SMT-LIB specification \n    from a single `.ll` input.\n\n\n\nExamples and tests for both are included in the `./examples`,\n`./test`, and `./bitcode` directories. Check the Makefile for details.\n\nOn simple examples (i.e., one source file), you can generate bitcode using `clang -S -emit-llvm`. For\nmore complex builds, we typically use [wllvm](https://github.com/SRI-CSL/whole-program-llvm).\n\n\n\nWhat we do\n==============\n\n`llvm2smt` translates every basic block in the LLVM file into a\nsequence of SMT-LIB declarations and definitions. We use a global\narray to represent memory. For a 64-bit address space, this array has\ntype\n\n```smt\n  (Array (_ BitVec 64) (_ BitVec 8)).\n```\n\nRead operations are encoded using SMT-LIB `select` and write\noperations are encoded using `store`. Each write operation produces a\nnew memory state, denoted by a fresh SMT-LIB constant.\n\nWe also use a global variable to denote the stack pointer. It is used to\nencode the LLVM `alloca` operations (i.e., create local variables on the stack).\n\nWe use a bit-precise representation: `i1` variables are represented as\nBooleans; all other integer types are converted to bitvectors of the\nappropriate size. For example, `i32` variables are represented as\nbitvectors of length 32. We support all LLVM types except\nfloating-point numbers. For LLVM vector types, we use SMT-LIB\narrays. For example, a register of type `\u003c2 x i32\u003e` is represented as \nan array of two bitvectors of length 32. The array itself is of type\n\n```smt\n(Array (_ BitVec 1) (_ BitVec 32)).\n```\n\nThe SMT-LIB translation assumes that every basic block is executed at\nmost once. 
In most cases, this means that we must unroll loops before\nthe translation by using `opt` with the following command switches:\n```\n\u003e opt -loop-rotate -loop-unroll -unroll-count=3 ...\n```\n(Try `opt --help-list-hidden` to see all the good things `opt` can do for you.)\n\n\n\nWhat we don't do\n==============\n\nWe do not handle function calls. A workaround is to force the\ncompiler to inline the calls to all relevant functions. \nThis can be done by annotating function declarations as follows:\n```c\nstatic __attribute__ ((__always_inline__))  int my_function(int x) {\n  ...\n}\n```\n\nWe do not handle floating-point types in LLVM since the QF_ABV logic\nthat we use does not include floating-point operations.  Our\ncrude approach for now is to convert all floating-point constants to\nzero and all floating-point registers to uninterpreted constants in the\nSMT-LIB translation.\n\nIn addition to `call` mentioned above, we do not handle the following\nLLVM instructions: `invoke`, `landingpad`, `resume`, `va_arg`,\n`indirectbr`, `cmpxchg`, `atomicrmw`, `fence`, `addrspacecast`,\n`extractvalue`, and `insertvalue`. Some of these could be added but we\nhave not encountered them in our C-code examples.\n\n\n\n\n\n\n\nAcknowledgement:\n==============\n\nOur code builds upon an OCaml-based parser for LLVM written by\nTrevor Jim:\n\nhttps://github.com/tjim/smpcc/blob/master/compiler/\n\nWe diverged from this repository around February 2015.\n\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsri-csl%2Fllvm2smt","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsri-csl%2Fllvm2smt","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsri-csl%2Fllvm2smt/lists"}