{"id":19256493,"url":"https://github.com/copilot-language/copilot-verifier","last_synced_at":"2025-09-09T18:12:56.847Z","repository":{"id":189709261,"uuid":"404824675","full_name":"Copilot-Language/copilot-verifier","owner":"Copilot-Language","description":"System for verifying the correctness of generated Copilot programs","archived":false,"fork":false,"pushed_at":"2025-05-08T17:39:24.000Z","size":173,"stargazers_count":16,"open_issues_count":6,"forks_count":1,"subscribers_count":9,"default_branch":"master","last_synced_at":"2025-06-24T15:54:27.124Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Haskell","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Copilot-Language.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2021-09-09T18:11:42.000Z","updated_at":"2025-05-08T17:18:29.000Z","dependencies_parsed_at":null,"dependency_job_id":"b1f7dab3-260f-4fab-b65b-6ec4db08ac4b","html_url":"https://github.com/Copilot-Language/copilot-verifier","commit_stats":null,"previous_names":["galoisinc/copilot-verifier"],"tags_count":7,"template":false,"template_full_name":null,"purl":"pkg:github/Copilot-Language/copilot-verifier","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Copilot-Language%2Fcopilot-verifier","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Copilot-Language%2Fcopilot-verifier/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Copilot-Language%2Fcopilot-verifier/releases","manifests_url":"https://repos.ecosyste.ms/
api/v1/hosts/GitHub/repositories/Copilot-Language%2Fcopilot-verifier/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Copilot-Language","download_url":"https://codeload.github.com/Copilot-Language/copilot-verifier/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Copilot-Language%2Fcopilot-verifier/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":274340559,"owners_count":25267294,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-09-09T02:00:10.223Z","response_time":80,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-09T19:05:57.284Z","updated_at":"2025-09-09T18:12:56.820Z","avatar_url":"https://github.com/Copilot-Language.png","language":"Haskell","readme":"[![Build Status](https://github.com/Copilot-Language/copilot-verifier/workflows/copilot-verifier/badge.svg)](https://github.com/Copilot-Language/copilot-verifier/actions?query=workflow%3Acopilot-verifier)\n[![Version on Hackage](https://img.shields.io/hackage/v/copilot-verifier.svg)](https://hackage.haskell.org/package/copilot-verifier)\n\n# Copilot Verifier\n\nCopilot Verifier is an add-on to the [Copilot Stream\nDSL](https://copilot-language.github.io) for verifying the correctness of C\ncode generated by the `copilot-c99` package.\n\nThe main idea of the verifier is to use the\n[Crucible 
symbolic simulator](https://github.com/galoisinc/crucible)\nto interpret the semantics of the generated C program\nand to produce verification conditions sufficient to guarantee\nthat the meaning of the generated program corresponds in a precise\nway to the meaning of the original stream specification. The generated\nverification conditions are then dispatched to SMT solvers to\nbe automatically solved.  We will have more to say about exactly\nwhat is meant by this correspondence below.\n\nCopilot Verifier is described in the ICFP 2023 paper [_Trustworthy Runtime\nVerification via Bisimulation (Experience\nReport)_](https://dl.acm.org/doi/abs/10.1145/3607841).\n\n## Building\n\nTo build the verifier from source, first make sure you have met the following\nprerequisites:\n\n* Ensure that you have the `cabal` and `ghc` executables in your `PATH`. If you\n  don't already have them, we recommend using `ghcup` to install them:\n  https://www.haskell.org/ghcup/. We recommend `Cabal` 3.10 or newer, and one of\n  GHC 9.4, 9.6, or 9.8.\n\n* Ensure that you have the `clang` and `llvm-link` utilities from LLVM in your\n  `PATH`. Currently, LLVM versions up to 16 are supported. LLVM binaries are\n  available at https://github.com/llvm/llvm-project/releases.\n\n* Ensure that you have the `z3` SMT solver in your `PATH`. `z3` binaries are\n  available at https://github.com/Z3Prover/z3/releases.\n\n  Alternatively, the verifier can be configured to use one of the following\n  SMT solvers instead:\n\n  * `cvc4`: https://cvc4.github.io/downloads.html\n  * `cvc5`: https://github.com/cvc5/cvc5/releases/\n  * `yices`: https://yices.csl.sri.com\n\nThen, clone the repo and run:\n\n```\n$ git submodule update --init\n$ cabal test copilot-verifier\n```\n\nThis will fetch the repository's submodules, build the verifier, and run the associated test\nsuite. 
If you have performed all of the steps above correctly, the test suite\nshould pass.\n\nWe also provide a [Dockerfile](Dockerfile) which automates the process of\ninstalling Copilot and the verifier. The Dockerfile can be built and run using\nthe following commands:\n\n```\n$ docker build -t \u003ctag\u003e .\n$ docker run -it \u003ctag\u003e\n```\n\nWhere `\u003ctag\u003e` is a unique name for the Docker image.\n\n## Using the Copilot Verifier\n\nThe main interface to the verifier is the `Copilot.Verifier.verify`\nfunction, which takes some options to drive the code generation\nprocess and a Copilot `Spec` object. It will invoke the Copilot\ncode generator to obtain C sources, compile the C sources into\nLLVM bitcode using the `clang` compiler front-end, then\nparse and interpret the bitcode using `crucible`.  After generating\nverification conditions, it will dispatch them to an SMT solver,\nand the result of those queries will be presented to the user.\n\nThere are a number of examples (based on similar examples\nfrom the main Copilot repository) demonstrating how to\nincorporate the verifier into a Copilot program.\nSee the `copilot-verifier/examples` subdirectory of this repository.\n\n## Details of the Verification\n\n### Synopsis of Copilot semantics\n\nThe Copilot DSL represents a certain nicely-behaved\nclass of discrete-time stream programs. Each `Stream`\nin Copilot denotes an infinite stream of values; one may\njust as well think that `Stream a` represents a pure mathematical\nfunction `ℕ → a` from natural numbers to values of type `a`.\nSee the\n[Copilot manual](https://ntrs.nasa.gov/api/citations/20200003164/downloads/20200003164.pdf)\nfor more details of the Copilot language itself and its semantics.\n\nOne of the central design considerations for Copilot is that it should\nbe possible to implement stream programs using a fixed, finite amount\nof storage.  
As a result, Copilot will only accept stream programs\nthat access a bounded amount of their input streams (including any\nrecursive stream references). This allows an\nimplementation strategy where the generated C code can use fixed-size\nring buffers to store a limited number of previous stream values.\n\nThe execution model for the generated programs is that the program\nstarts in a state corresponding to stream values at time `t = 0`;\n\"external\" stream input values are placed in distinguished global\nvariables by the calling environment, which then executes a `step()`\nfunction to move to the next time step.  The `step()` function captures\nthe values of these external inputs and does whatever computation is\nnecessary to update its internal state from time `t=n` to time\n`t=n+1`.  In addition, it will call any defined \"trigger\" functions\nif the stream state at that time step satisfies the defined guard condition.\nIn short, the generated C program steps one moment at a time through\nthe stream program, consuming a sequence of input values provided by\na cooperating environment and calling handler functions whenever\nstates of interest occur.\n\n### The Desired Correspondence\n\nWhat does it mean, then, for a generated C program in this style\nto correctly implement a given stream program? The intuition\nis basically that after `n` calls to the `step` function,\nthe state of the ring-buffers of the C program should correctly\ncompute the value of the corresponding stream expression\nevaluated at index `n`, assuming the C program has been fed\ninputs corresponding to the first `n` values of the external stream\ninputs.  
Moreover, the trigger functions should be called from\nthe `step` function exactly at the time values when the stream expressions\nevaluate to true.\n\nThe notion of correspondence for the values flowing in streams is\nrelatively straightforward: these values consist of fixed-width\nmachine integers, floating-point values, structs and fixed-length\narrays. For each, the appropriate notion of equality is fairly clear.\n\nBoth the original `Stream` program and the generated C program\ncan be viewed straightforwardly as a transition system, and under\nthis view, the correspondence we want to establish is a bisimulation\nbetween the states of the high-level stream program and the low-level\nC program. The proof method for bisimulation requires us to provide\na \"correspondence\" relation between the program states, and then prove\nthree things about this relation:\n\n1. that the initial states of the programs are in the relation;\n2. if we assume two arbitrary program states begin in the relation\nand each takes a single transition (consuming corresponding inputs),\nthe resulting states are back in the relation;\n3. that any observable\nactions taken by one program are exactly mirrored by the other.\n\nOn the high-level side of the bisimulation, the program\nstate is essentially just the value of the current time step `n`,\nwhereas on the C side it consists of the regions of global memory that\ncontain the ring-buffers and their current indices.  The transition\nrelation for the high-level program just increments the time value by\none, and the transition for the C program is defined by the action\nof the generated `step()` function.\n\nSuppose `s` is one of the stream definitions in the original Copilot\nprogram which is required to retain `k` previous values;\nlet `buf` be the global variable name of the ring-buffer in the C\nprogram, and `idx` be the global variable name maintaining the\ncurrent index into the ring buffer.  
Then the correspondence\nrelation is basically that `0 \u003c= idx \u003c k` and\n`s[n+i] = buf[(idx+i) mod k]` as `i` ranges from `0 .. k-1`.\nBy abuse of notation, here we mean that `s[j]` is\nthe value of the stream expression `s` evaluated at index `j`,\nwhereas `buf[j]` means the value obtained by reading the `j`th value\nof the buffer `buf` from memory.  The overall correspondence relation\nis a conjunction of statements like this, one for each stream\nexpression that is realized via a buffer.\n\n### Implementing the Bisimulation proof steps\n\nProving the kind of program correspondence property we desire is a largely\nmechanical affair. As the code under consideration is automatically\ngenerated, it has a very regular structure and is specifically\nintended to implement the semantics we wish to verify it against.  As\nsuch, we expect these proofs to be highly automatable.\n\nThe proof principle of bisimulation itself is not amenable\nto reduction to SMT, as it falls outside the first-order theories\nthose solvers understand. Likewise, the semantics of Copilot\nand C might possibly be reducible directly to SMT, but it would be\nimpractical to do so. However, we can reduce the individual\nproof obligations mentioned above into a series of lower-level\nlogical statements that are amenable to SMT proof by\ndefining the logical semantics of stream programs, and using\nsymbolic simulation techniques to interpret the semantics of the\nC program.  Performing this reduction is the key contribution\nof `copilot-verifier`.\n\n#### Initial state correspondence\n\nThe first proof obligation we must discharge is to show that\nthe initial states of the two programs correspond. For each\nstream `s` there is a corresponding `k`, which is the length of\nthe ring-buffer implementing it.  
We must simply verify that\nthe C program initializes its buffer with the first `k` values\nof the stream `s`, and that the `idx` value starts at 0.\nBecause of the restrictions Copilot places on programs, these\nfirst `k` values must be specified concretely and will not be\nable to depend on any external inputs.  As such, this step\nis quite easily performed, as it requires only direct evaluation\nof concrete inputs.\n\n#### Transition correspondence\n\nThe bulk of the proof effort consists of demonstrating that\nthe bisimulation relation is preserved by transitions.\nIn order to do this step, we must begin with totally symbolic\ninitial states: we know nothing except that we are at some\narbitrary time value `n`, and that the C program buffers\ncorrespond to their stream values as required by the relation.\nThus, we create fresh symbolic variables for the streams\nfrom `n` to `n + k-1`, and for the values of all the involved\nexternal streams at time `n`. Then, we run forward the Copilot\nprogram by evaluating the stream recurrence expression to\ncompute the value of each stream at time `n+k`.\n\nNext we set up an initial state of the C program by choosing,\nfor each ring buffer, an arbitrary value for its current index\nwithin its allowed range, and then writing the variables\ncorresponding to each stream value into the buffers at\ntheir appropriate offsets. The symbolic simulator is then\ninvoked to compute the state update effects of the `step()`\nfunction. Afterward, we read the poststate values from the\nring-buffers and verify that they correspond to the stream\nvalues from `n+1` up to `n+k`.\n\nAs part of symbolic simulation, Crucible may also generate\nside-conditions that relate to memory safety of the program, or to\nerror conditions that must be avoided. 
All of the bisimulation\nequivalence conditions and the safety side-conditions will be\nsubmitted to an SMT solver.\n\n#### Observable effects\n\nFor our purposes, the only observable effects of a Copilot program\nrelate to any \"trigger\" functions defined in the spec.  Our task is to\nshow that the generated C code calls the external trigger functions if\nand only if the corresponding guard condition is true, and that the\narguments passed to those functions are as expected.\nThis proof obligation is proved in the same phase along with\nthe transition relation proof above because the `step()` function\nis responsible for both invoking triggers and for performing state\nupdates.\n\nThe technique we use to perform this aspect of the proof is to\ninstall \"override\" functions to the external symbols corresponding\nto the C trigger functions before we begin symbolic simulation.\nIn a real system, the external environment would be responsible\nfor implementing these functions and taking whatever appropriate\naction is necessary when the triggers fire. However, we are verifying\nthe generated code in isolation from its environment, so we have no\nimplementation in hand. Instead, the override will\nessentially implement a stub function that simply captures its\narguments and the path condition under which it was called.\nAfter simulation completes, the captured arguments and path condition\nare required to be equivalent to the corresponding trigger guard\nand arguments from the Copilot spec.  These conditions are\ndischarged to the SMT solver at the same time as the transition\nrelation obligations.\n\nBecause of the way we model the trigger functions, we make a number of\nimplicit assumptions about how the actual implementations of those\nfunctions must behave. The most important of those assumptions is that\nthe trigger functions must not modify any memory under the control of\nthe Copilot program, including its ring buffers and stack.  
We also\nassume that the trigger functions are well defined, i.e. they are\nmemory safe and do not invoke any undefined behavior.\n\n#### Partial operations\n\nA generated C program may make use of partial operations. These range from\ndivision, which can fail if the second argument is zero, to signed integer\narithmetic, which can overflow and result in undefined behavior. The verifier\nhas two modes for dealing with partial operations:\n\n1. Any invocation of a partial operation on undefined inputs in the generated\n   C program will result in an error, provided that the user did not add an\n   assertion that assumes this behavior will not occur.\n\n2. If the generated C program invokes a partial operation on undefined inputs,\n   the verifier will check if this coincides with a corresponding invocation\n   of a partial operation in the Copilot spec. If so, the verification will\n   succeed. In other words, the verifier will check that the spec and the\n   C program are \"crash-equivalent\".\n\nThe verifier uses mode (1) by default, but mode (2) can be enabled by using\n`Copilot.Verifier.verifyWithOptions sideCondVerifierOptions`. In this mode,\nthe verifier will analyze any invocation of an operation which could be partial\nand generate a side condition that this operation will only be invoked on\nwell defined inputs. During the transition step of the bisimulation proof,\nthe verifier will add these side conditions as assumptions. Therefore, if\nsimulating the C program generates any side conditions due to invoking partial\noperations, these side conditions from the C program should be dischargeable\nusing the corresponding side conditions from the Copilot spec.\n\nMode (2) has the caveat that `clang` may compile C code to LLVM bitcode in\nwhich a partial function is no longer applied to undefined inputs. For\ninstance, `clang` will sometimes promote 16-bit integer values to 32 bits\nbefore performing arithmetic on them. 
This can turn an operation that would\nresult in signed 16-bit integer overflow into a 32-bit integer operation\nthat does _not_ overflow.\n\n### Caveats About the Verifier\n\nWe rely on the `clang` compiler front-end to consume C source files\nand produce LLVM intermediate language, which then becomes the input\nto the later verification steps. To the extent that the input program\nis not fully-portable C, `clang` may make implementation-specific\ndecisions about how to compile the program that might be made\ndifferently by a different compiler (e.g. `gcc`). We expect\nthis aspect to be mitigated by the fact that the C code is\nautomatically generated in a rather simple subset of the C language\nand is designed to be as simple as possible.\nAny code-generation bugs in `clang` itself may affect the soundness\nof our verifier. Again, however, Copilot generates a well-understood\nsubset of the language, and we expect `clang` to be well-tested on\nthe code patterns produced.\n\nThe semantics of LLVM bitcode, as encoded in the `crucible-llvm`\npackage, may have errors that affect soundness. We mitigate this risk\nby testing our semantics against a corpus of verification problems\nproduced for the SV-COMP verification competition, paying special\nattention to any soundness issues that arise. `Crux`, a standalone\nverification system based on `crucible-llvm`, was a participant in the\n2022 edition of SV-COMP.\n\nThe semantics of Copilot programs, as encoded in the\n`Copilot.Theorem.What4` module, may have errors that affect soundness.\nFor the moment we do not have an effective mitigation strategy for\nthis risk other than manual examination and comparison against the\nintended semantics of Copilot, as encoded in the interpreter.\n\nThere is limited SMT solver support for floating-point values,\nespecially for transcendental functions like the trig primitives.  
As a\nresult, we reason about floating-point expressions via uninterpreted\nfunctions. In other words, we leave the semantics of the\nfloating-point operations totally abstract, and simply verify that the\nCopilot program and the corresponding C program apply the same\noperations in the same order. This is sound, but leaves open the possibility\nthat the compiler will apply some correct transformation to\nfloating-point expressions that we are nonetheless unable to verify.\nHowever, at low optimization levels and without the `-ffast-math` flag,\ncompilers generally (and `clang` in particular) are very reluctant to\nrearrange floating-point code, and the verifications generally succeed.\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcopilot-language%2Fcopilot-verifier","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcopilot-language%2Fcopilot-verifier","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcopilot-language%2Fcopilot-verifier/lists"}