{"id":13993936,"url":"https://github.com/wellecks/llmstep","last_synced_at":"2025-03-25T23:31:47.095Z","repository":{"id":180042792,"uuid":"664107166","full_name":"wellecks/llmstep","owner":"wellecks","description":"llmstep: [L]LM proofstep suggestions in Lean 4.","archived":false,"fork":false,"pushed_at":"2023-11-11T01:30:11.000Z","size":1634,"stargazers_count":127,"open_issues_count":4,"forks_count":15,"subscribers_count":5,"default_branch":"master","last_synced_at":"2025-03-20T22:35:50.779Z","etag":null,"topics":["lean","lean4","llm","theorem-proving"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/wellecks.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2023-07-08T23:50:23.000Z","updated_at":"2025-03-19T13:23:05.000Z","dependencies_parsed_at":"2023-10-28T20:23:17.983Z","dependency_job_id":"e771098e-18f3-4fe9-99bd-52994f00cf58","html_url":"https://github.com/wellecks/llmstep","commit_stats":null,"previous_names":["wellecks/llmstep"],"tags_count":2,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wellecks%2Fllmstep","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wellecks%2Fllmstep/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wellecks%2Fllmstep/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/wellecks%2Fllmstep/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/wellecks","download_url":"https://codeload.github.com/wellecks/llmstep/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":245561798,"owners_count":20635820,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["lean","lean4","llm","theorem-proving"],"created_at":"2024-08-09T14:02:37.988Z","updated_at":"2025-03-25T23:31:42.071Z","avatar_url":"https://github.com/wellecks.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"# `llmstep`: [L]LM proofstep suggestions in Lean\n*News*\n- [11.2023] Experimental [*Llemma*](https://arxiv.org/abs/2310.10631) suggestions that leverage file context\n- [10.2023] New paper describing version 1.0.0 of `llmstep`: [[paper](https://arxiv.org/abs/2310.18457)]\n- [10.2023] Support for [Reprover](#reprover)\n- [9.2023] Support for free GPU servers via [Google Colab](#google-colab)\n\n\n---\n\n`llmstep` is a Lean 4 tactic for suggesting proof steps using a language model:\n\n\u003cimg src=\"./llmstep.gif\" width=\"350\"/\u003e\n\nCalling `llmstep \"prefix\"` gives suggestions that start with `prefix`:\n```lean\nexample (f : ℕ → ℕ) : Monotone f → ∀ n, f n ≤ f (n + 1) := by\n  intro h n\n  llmstep \"exact\"\n\n==\u003e Lean Infoview\n  Try This:\n    * exact h (Nat.le_succ _)\n    * exact h (Nat.le_succ n)\n    * exact h (Nat.le_add_right _ _)\n```\n\nClicking a suggestion places it in the proof:\n```lean\nexample (f : ℕ → ℕ) : Monotone f → ∀ n, f n ≤ f (n + 1) := by\n  intro h n\n  exact h (Nat.le_succ _)\n```\n\n`llmstep` checks the language model suggestions in Lean, and highlights those that close the proof.\n\n## Quick start\n\nFirst, [install Lean 4 in VS Code](https://leanprover.github.io/lean4/doc/quickstart.html) and the python requirements (`pip install -r requirements.txt`).\n\nThen [start a server](#servers):\n```bash\npython python/server.py\n```\n\nOpen `LLMstep/Examples.lean` in VS Code and try out `llmstep`.\n\n## Use `llmstep` in a project\n1. Add `llmstep` in `lakefile.lean`:\n```lean\nrequire llmstep from git\n  \"https://github.com/wellecks/llmstep\"\n```\nThen run `lake update`.\n\n2. Import `llmstep` in a Lean file:\n```lean\nimport LLMstep\n```\n\n3. Start a server based on your runtime environment. For instance:\n```bash\npython python/server.py\n```\nPlease see the [recommended servers below](#servers).\n\n## Servers\nThe `llmstep` tactic communicates with a server that you can run in your own environment (e.g., CPU, GPU, Google Colab).\n\nThe table below shows the recommended language model and server scripts.\nTo start a server, use `python {script}`, e.g. `python python/server_vllm.py`:\n\n| Environment  | Script | Default Model | Context |Speed | miniF2F-test |\n| -------- | ------- | ------- |-------|------- |------- |\n| CPU  | `python/server_encdec.py` | [LeanDojo ByT5 300m](https://huggingface.co/kaiyuy/leandojo-lean4-tacgen-byt5-small) | State | 3.16s | 22.1\\%|\n| Colab GPU  | See [Colab setup](#google-colab)  | [llmstep Pythia 2.8b](https://huggingface.co/wellecks/llmstep-mathlib4-pythia2.8b) |State |1.68s | 27.9\\%|\n| CUDA GPU | `python/server_vllm.py` | [llmstep Pythia 2.8b](https://huggingface.co/wellecks/llmstep-mathlib4-pythia2.8b) |State|**0.25s** | **27.9\\%**|\n| CUDA GPU* | `python/server_llemma.py` | [Llemma 7b](https://huggingface.co/EleutherAI/llemma_7b) |State, **current file**  🔥  | N/A | N/A|\n\n\nPlease refer to [our paper](https://arxiv.org/abs/2310.18457) for further information on the benchmarks.\n\n`llmstep` aims to be a model-agnostic tool. We welcome contributions of new models.\n\n\n\\* File context support (e.g. with [Llemma](https://arxiv.org/abs/2310.10631)) is currently experimental.\n\n\n## Implementation\n\u003cimg src=\"./docs/llmstep.png\" width=\"700\"/\u003e\n\n\n`llmstep` has three parts:\n1. a [Lean tactic](./LLMstep/LLMstep.lean)\n2. a [language model](https://huggingface.co/wellecks/llmstep-mathlib4-pythia2.8b)\n3. a [Python server](./python/server.py)\n\nThe Lean tactic sends a request to the server. \\\nThe server calls the language model and returns the generated suggestions. \\\nThe suggestions are displayed by the tactic in VS Code.\n\n\n\n## Google Colab\n\nTo use Google Colab's free GPU to run a server, follow these instructions:\n\n1. Open and run this notebook to start a server: [![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/wellecks/llmstep/blob/master/python/colab/llmstep_colab_server.ipynb)\n\n2. In your local environment, set the environment variable `LLMSTEP_HOST` equal to the url printed out in this notebook (for example, `https://04fa-34-125-110-83.ngrok.io/`).\n\n3. In your local environment, set the environment variable `LLMSTEP_SERVER=COLAB`.\n\n4. Use `llmstep`.\n\n#### VS Code steps (2) and (3)\n\nTo set environment variables in VS Code, go to:\n\n- Settings (`Command` + `,` on Mac)\n- Extensions -\u003e Lean 4\n- Add the environment variables to `Server Env`. For example:\n\u003cimg src=\"./docs/vscode_env1.png\" width=\"400\"/\u003e\n\n- Then restart the Lean Server (`Command` + `t`, then type `\u003e Lean 4: Restart Server`):\n\u003cimg src=\"./docs/vscode_env2.png\" width=\"400\"/\u003e\n\n\n\n## Language model\nBy default, `llmstep` uses a Pythia 2.8b language model fine-tuned on [LeanDojo Benchmark 4](https://zenodo.org/record/8040110):\n- [`llmstep` model on Huggingface](https://huggingface.co/wellecks/llmstep-mathlib4-pythia2.8b)\n\n\nThe [python/train](python/train) directory shows how the model was fine-tuned.\n\n#### Reprover\nYou can use the non-retrieval version of [Reprover](https://github.com/lean-dojo/ReProver), which we refer to as [LeanDojo ByT5 300m](https://huggingface.co/kaiyuy/leandojo-lean4-tacgen-byt5-small):\n\n```\npython python/server_encdec.py\n```\nBy default, this runs the `leandojo-lean4-tacgen-byt5-small` model.\\\nThis model is particularly useful on CPU due to its small parameter count.\n\n#### Using a different model\n\nSwap in other decoder-only language models with the `--hf-model` argument:\n```bash\npython server.py --hf-model some/other-model-7B\n```\nUse `--hf-model` with `python/server_encdec.py` for encoder-decoder models.\n\nUse `--hf-model` with `python/server_llemma.py` for prompted base models (e.g. CodeLlama).\n\n\n#### Fine-tuning a model\nThe scripts in [python/train](python/train) show how to finetune a model.\n\n## Additional Notes\n\n#### Acknowledgements\n* The `llmstep` tactic is inspired by [`gpt-f`](https://github.com/jesse-michael-han/lean-gptf).\n* Fine-tuning data for the Pythia-2.8b model is from  [LeanDojo](https://leandojo.org/).\n* The fine-tuning code is based on the script from [Stanford Alpaca](https://github.com/tatsu-lab/stanford_alpaca).\n* The tactic implementation adopts ideas and code from Mathlib4's `Polyrith` and `Std.Tactic.TryThis`.\n* Thank you to Mario Carneiro and Scott Morrison for reviewing the tactic implementation.\n\n#### History\n`llmstep` was initially created for an IJCAI-2023 tutorial on neural theorem proving.\\\nIt aims to be a model-agnostic platform for integrating language models and Lean.\n\n#### Citation\n\nPlease cite:\n```\n@article{welleck2023llmstep,\n    title={LLMSTEP: LLM proofstep suggestions in Lean},\n    author={Sean Welleck and Rahul Saha},\n    journal={arXiv preprint arXiv:2310.18457},\n    year={2023}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwellecks%2Fllmstep","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fwellecks%2Fllmstep","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwellecks%2Fllmstep/lists"}