{"id":13830869,"url":"https://github.com/lean-dojo/LeanCopilot","last_synced_at":"2025-07-09T12:34:20.467Z","repository":{"id":193618416,"uuid":"689172454","full_name":"lean-dojo/LeanCopilot","owner":"lean-dojo","description":"LLMs as Copilots for Theorem Proving in Lean","archived":false,"fork":false,"pushed_at":"2024-11-04T01:35:17.000Z","size":1203,"stargazers_count":988,"open_issues_count":8,"forks_count":91,"subscribers_count":14,"default_branch":"main","last_synced_at":"2024-11-04T02:24:10.046Z","etag":null,"topics":["formal-mathematics","lean","lean4","llm-inference","machine-learning","theorem-proving"],"latest_commit_sha":null,"homepage":"https://leandojo.org","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/lean-dojo.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-09-09T01:52:22.000Z","updated_at":"2024-11-04T01:35:21.000Z","dependencies_parsed_at":"2024-02-08T17:27:37.906Z","dependency_job_id":"22578904-3f43-4c7d-88c6-eb86e7d88596","html_url":"https://github.com/lean-dojo/LeanCopilot","commit_stats":null,"previous_names":["lean-dojo/leaninfer","lean-dojo/leancopilot"],"tags_count":28,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lean-dojo%2FLeanCopilot","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lean-dojo%2FLeanCopilot/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lean-dojo%2FLeanCopilot/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/reposi
tories/lean-dojo%2FLeanCopilot/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/lean-dojo","download_url":"https://codeload.github.com/lean-dojo/LeanCopilot/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":225553258,"owners_count":17487293,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["formal-mathematics","lean","lean4","llm-inference","machine-learning","theorem-proving"],"created_at":"2024-08-04T10:01:10.948Z","updated_at":"2025-07-09T12:34:20.457Z","avatar_url":"https://github.com/lean-dojo.png","language":"C++","funding_links":[],"categories":["C++","Tools","A01_文本生成_文本对话","Models","Research","🤖 Research Agents \u0026 Autonomous Workflows"],"sub_categories":["大语言对话模型及数据","Interactive theorem proving","Domain-Specific Research Agents"],"readme":"Lean Copilot: LLMs as Copilots for Theorem Proving in Lean\n==========================================================\n\nLean Copilot allows large language models (LLMs) to be used natively in Lean for proof automation, e.g., suggesting tactics/premises and searching for proofs. You can use our built-in models from [LeanDojo](https://leandojo.org/) or bring your own models that run either locally (w/ or w/o GPUs) or on the cloud.\n\n\u003chttps://github.com/lean-dojo/LeanCopilot/assets/114432581/ee0f56f8-849e-4099-9284-d8092cbd22a3\u003e\n\n## Table of Contents\n\n1. [Requirements](#requirements)  \n1. [Using Lean Copilot in Your Project](#using-lean-copilot-in-your-project)\n   1. 
[Adding Lean Copilot as a Dependency](#adding-lean-copilot-as-a-dependency)\n   1. [Getting Started with Lean Copilot](#getting-started-with-lean-copilot)\n      1. [Tactic Suggestion](#tactic-suggestion)\n      1. [Proof Search](#proof-search)\n      1. [Premise Selection](#premise-selection)\n1. [Advanced Usage](#advanced-usage)\n   1. [Tactic APIs](#tactic-apis)\n   1. [Model APIs](#model-apis)\n   1. [Bring Your Own Model](#bring-your-own-model)\n1. [Caveats](#caveats)\n1. [Getting in Touch](#getting-in-touch)\n1. [Acknowledgements](#acknowledgements)\n1. [Citation](#citation)\n\n## Requirements\n\n* Supported platforms: Linux, macOS, Windows and [Windows WSL](https://learn.microsoft.com/en-us/windows/wsl/install).\n* [Git LFS](https://git-lfs.com/).\n* Optional (recommended if you have a [CUDA-enabled GPU](https://developer.nvidia.com/cuda-gpus)): CUDA and [cuDNN](https://developer.nvidia.com/cudnn).\n* Required for building Lean Copilot itself (rather than a downstream package): CMake \u003e= 3.7 and a C++17 compatible compiler.\n\n## Using Lean Copilot in Your Project\n\n:warning: Your project must use a Lean version of at least `lean4:v4.3.0-rc2`.\n\n### Adding Lean Copilot as a Dependency\n\n1. Add the package configuration option `moreLinkArgs := #[\"-L./.lake/packages/LeanCopilot/.lake/build/lib\", \"-lctranslate2\"]` to lakefile.lean. For example,\n\n```lean\npackage «my-package» {\n  moreLinkArgs := #[\n    \"-L./.lake/packages/LeanCopilot/.lake/build/lib\",\n    \"-lctranslate2\"\n  ]\n}\n```\n\nAlternatively, if your project uses lakefile.toml, it should include:\n\n```toml\nmoreLinkArgs = [\"-L./.lake/packages/LeanCopilot/.lake/build/lib\", \"-lctranslate2\"]\n```\n\n2. 
Add the following line to lakefile.lean, including the quotation marks:\n\n```lean\nrequire LeanCopilot from git \"https://github.com/lean-dojo/LeanCopilot.git\" @ \"LEAN_COPILOT_VERSION\"\n```\n\nFor stable Lean versions (e.g., `v4.21.0`), set `LEAN_COPILOT_VERSION` to be that version. For the latest unstable Lean versions (e.g., `v4.22.0-rc2`), set `LEAN_COPILOT_VERSION` to `main`. In either case, make sure the version is compatible with other dependencies such as mathlib. If your project uses lakefile.toml instead of lakefile.lean, it should include:\n\n```toml\n[[require]]\nname = \"LeanCopilot\"\ngit = \"https://github.com/lean-dojo/LeanCopilot.git\"\nrev = \"LEAN_COPILOT_VERSION\"\n```\n\n3. If you are using native Windows, add `\u003cpath_to_your_project\u003e/.lake/packages/LeanCopilot/.lake/build/lib` to your `Path` variable in Advanced System Settings \u003e Environment Variables... \u003e System variables. \n\n4. Run `lake update LeanCopilot`.\n\n5. Run `lake exe LeanCopilot/download` to download the built-in models from Hugging Face to `~/.cache/lean_copilot/`. *Alternatively*, you can download the models from Hugging Face manually from\n\n* [ct2-leandojo-lean4-tacgen-byt5-small](https://huggingface.co/kaiyuy/ct2-leandojo-lean4-tacgen-byt5-small)\n* [ct2-leandojo-lean4-retriever-byt5-small](https://huggingface.co/kaiyuy/ct2-leandojo-lean4-retriever-byt5-small)\n* [premise-embeddings-leandojo-lean4-retriever-byt5-small](https://huggingface.co/kaiyuy/premise-embeddings-leandojo-lean4-retriever-byt5-small)\n* [ct2-byt5-small](https://huggingface.co/kaiyuy/ct2-byt5-small)\n\n6. Run `lake build`.\n\n[Here](https://github.com/yangky11/lean4-example/blob/LeanCopilot-demo) is an example of a Lean package depending on Lean Copilot. 
If you have problems building the project, our [Dockerfile](./Dockerfile), [build.sh](scripts/build.sh) or [build_example.sh](scripts/build_example.sh) may be helpful.\n\n### Getting Started with Lean Copilot\n\n#### Tactic Suggestion\n\nAfter `import LeanCopilot`, you can use the tactic `suggest_tactics` to generate tactic suggestions. You can click on any of the suggested tactics to use it in the proof.\n\n\u003cimg width=\"977\" alt=\"suggest_tactics\" src=\"https://github.com/lean-dojo/LeanCopilot/assets/114432581/f06865b6-58be-4938-a75c-2a23484384b4\"\u003e\n\nYou can provide a prefix (e.g., `simp`) to constrain the generated tactics:\n\n\u003cimg width=\"915\" alt=\"suggest_tactics_simp\" src=\"https://github.com/lean-dojo/LeanCopilot/assets/114432581/95dcae31-41cb-451c-9fdf-d73522addb6e\"\u003e\n\n#### Proof Search\n\nThe tactic `search_proof` combines LLM-generated tactics with [aesop](https://github.com/leanprover-community/aesop) to search for multi-tactic proofs. When a proof is found, you can click on it to insert it into the editor.\n\n\u003cimg width=\"824\" alt=\"search_proof\" src=\"https://github.com/lean-dojo/LeanCopilot/assets/114432581/26381fca-da4e-43d9-84b5-7e27b0612626\"\u003e\n\n#### Premise Selection\n\nThe `select_premises` tactic retrieves a list of potentially useful premises. Currently, it uses the retriever in [LeanDojo](https://leandojo.org/) to select premises from a fixed snapshot of Lean and [mathlib4](https://github.com/leanprover-community/mathlib4/tree/3ce43c18f614b76e161f911b75a3e1ef641620ff).\n\n![select_premises](https://github.com/lean-dojo/LeanCopilot/assets/114432581/2817663c-ba98-4a47-9ae9-5b8680b6265a)\n\n#### Running LLMs\n\nYou can also run inference with any LLM directly in Lean, which can be used to build customized proof automation or other LLM-based applications (not limited to theorem proving). 
It's possible to run arbitrary models either locally or remotely (see [Bring Your Own Model](#bring-your-own-model)).\n\n\u003cimg width=\"1123\" alt=\"run_llms\" src=\"https://github.com/lean-dojo/LeanCopilot/assets/5431913/a4e5b84b-a797-4216-a416-2958448aeb07\"\u003e\n\n## Advanced Usage\n\n**This section is only for advanced users who would like to change the default behavior of `suggest_tactics`, `search_proof`, or `select_premises`, e.g., to use different models or hyperparameters.**\n\n### Tactic APIs\n\n* Examples in [TacticSuggestion.lean](LeanCopilotTests/TacticSuggestion.lean) showcase how to configure `suggest_tactics`, e.g., to use different models or generate different numbers of tactics.\n* Examples in [ProofSearch.lean](LeanCopilotTests/ProofSearch.lean) showcase how to configure `search_proof` using options provided by [aesop](https://github.com/leanprover-community/aesop).\n* Examples in [PremiseSelection.lean](LeanCopilotTests/PremiseSelection.lean) showcase how to set the number of retrieved premises for `select_premises`.\n\n### Model APIs\n\n**Examples in [ModelAPIs.lean](LeanCopilotTests/ModelAPIs.lean) showcase how to run the inference of different models and configure their parameters (temperature, beam size, etc.).**\n\nLean Copilot supports two kinds of models: generators and encoders. Generators must implement the `TextToText` interface:\n\n```lean\nclass TextToText (τ : Type) where\n  generate (model : τ) (input : String) (targetPrefix : String) : IO $ Array (String × Float)\n```\n\n* `input` is the input string\n* `targetPrefix` is used to constrain the generator's output. `\"\"` means no constraint.\n* `generate` should return an array of `String × Float`. 
Each `String` is an output from the model, and `Float` is the corresponding score.\n\nWe provide three types of Generators:\n\n* [`NativeGenerator`](LeanCopilot/Models/Native.lean) runs locally powered by [CTranslate2](https://github.com/OpenNMT/CTranslate2) and is linked to Lean using Foreign Function Interface (FFI).\n* [`ExternalGenerator`](LeanCopilot/Models/External.lean) is hosted either locally or remotely. See [Bring Your Own Model](#bring-your-own-model) for details.\n* [`GenericGenerator`](LeanCopilot/Models/Generic.lean) can be anything that implements the `generate` function in the `TextToText` typeclass.\n\nEncoders must implement `TextToVec`:\n\n```lean\nclass TextToVec (τ : Type) where\n  encode : τ → String → IO FloatArray\n```\n\n* `input` is the input string\n* `encode` should return a vector embedding produced by the model.\n\nSimilar to generators, we have `NativeEncoder`, `ExternalEncoder`, and `GenericEncoder`.\n\n### Bring Your Own Model\n\nIn principle, it is possible to run any model using Lean Copilot through `ExternalGenerator` or `ExternalEncoder` (examples in [ModelAPIs.lean](LeanCopilotTests/ModelAPIs.lean)). To use a model, you need to wrap it properly to expose the APIs in [external_model_api.yaml](./external_model_api.yaml). As an example, we provide a [Python API server](./python) and use it to run a few models.\n\n## Caveats\n\n* Lean may occasionally crash when restarting or editing a file. Restarting the file again should fix the problem.\n* `select_premises` always retrieves the original form of a premise. For example, `Nat.add_left_comm` is a result of the theorem below. In this case, `select_premises` retrieves `Nat.mul_left_comm` instead of `Nat.add_left_comm`.\n\n```lean\n@[to_additive]\ntheorem mul_left_comm : ∀ a b c : G, a * (b * c) = b * (a * c)\n```\n\n* In some cases, `search_proof` produces an erroneous proof with error messages like `fail to show termination for ...`. 
A temporary workaround is changing the theorem's name before applying `search_proof`. You can change it back after `search_proof` completes.\n\n## Getting in Touch\n\n* For general questions and discussions, please use [GitHub Discussions](https://github.com/lean-dojo/LeanCopilot/discussions).  \n* To report a potential bug, please open an issue. In the issue, please include your OS information, the exact steps to reproduce the error on **the latest stable version of Lean Copilot**, and complete logs, preferably in debug mode. **Important: If your issue cannot be reproduced easily, it is unlikely to receive help.**\n* Feature requests and contributions are extremely welcome. Please feel free to start a [discussion](https://github.com/lean-dojo/LeanCopilot/discussions) or open a [pull request](https://github.com/lean-dojo/LeanCopilot/pulls).\n\n## Acknowledgements\n\n* We thank Scott Morrison for suggestions on simplifying Lean Copilot's installation and Mac Malone for helping implement it. 
Both Scott and Mac work for the [Lean FRO](https://lean-fro.org/).\n* We thank Jannis Limperg for supporting our LLM-generated tactics in Aesop (\u003chttps://github.com/leanprover-community/aesop/pull/70\u003e).\n\n## Citation\n\nIf you find our work useful, please consider citing [our paper](https://arxiv.org/abs/2404.12534):\n\n```BibTeX\n@misc{song2025leancopilotlargelanguage,\n      title={Lean Copilot: Large Language Models as Copilots for Theorem Proving in Lean}, \n      author={Peiyang Song and Kaiyu Yang and Anima Anandkumar},\n      year={2025},\n      eprint={2404.12534},\n      archivePrefix={arXiv},\n      primaryClass={cs.AI},\n      url={https://arxiv.org/abs/2404.12534}, \n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flean-dojo%2FLeanCopilot","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flean-dojo%2FLeanCopilot","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flean-dojo%2FLeanCopilot/lists"}