{"id":23222649,"url":"https://github.com/huggingface/search-and-learn","last_synced_at":"2025-10-14T15:29:05.870Z","repository":{"id":268417535,"uuid":"900664199","full_name":"huggingface/search-and-learn","owner":"huggingface","description":"Recipes to scale inference-time compute of open models","archived":false,"fork":false,"pushed_at":"2025-05-22T07:11:41.000Z","size":905,"stargazers_count":1109,"open_issues_count":16,"forks_count":123,"subscribers_count":9,"default_branch":"main","last_synced_at":"2025-10-12T15:05:46.742Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/huggingface.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-12-09T08:41:37.000Z","updated_at":"2025-10-11T04:13:46.000Z","dependencies_parsed_at":"2024-12-16T16:47:11.200Z","dependency_job_id":"b5cf0634-7c68-4557-a424-c467f8e8b86f","html_url":"https://github.com/huggingface/search-and-learn","commit_stats":null,"previous_names":["huggingface/search-and-learn"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/huggingface/search-and-learn","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/huggingface%2Fsearch-and-learn","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/huggingface%2Fsearch-and-learn/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/huggingface%2Fsearch-and-learn/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/h
uggingface%2Fsearch-and-learn/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/huggingface","download_url":"https://codeload.github.com/huggingface/search-and-learn/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/huggingface%2Fsearch-and-learn/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279019314,"owners_count":26086711,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-14T02:00:06.444Z","response_time":60,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-12-18T23:01:22.682Z","updated_at":"2025-10-14T15:29:05.863Z","avatar_url":"https://github.com/huggingface.png","language":"Python","funding_links":[],"categories":["A01_文本生成_文本对话","Python","LLMs"],"sub_categories":["大语言对话模型及数据"],"readme":"\u003cp align=\"center\"\u003e\n  \u003cimg style=\"width:200px\" src=\"https://raw.githubusercontent.com/huggingface/search-and-learn/main/assets/logo.png\"\u003e\n\u003c/p\u003e\n\n\u003cp align=\"center\"\u003e\n      🤗 \u003ca href=\"https://huggingface.co/collections/HuggingFaceH4/scaling-test-time-compute-with-open-models-675c3b475a0d6eb4528fec23\" target=\"_blank\"\u003eModels \u0026 Datasets\u003c/a\u003e |\n      📃 \u003ca href=\"https://huggingface.co/spaces/HuggingFaceH4/blogpost-scaling-test-time-compute\" target=\"_blank\"\u003eBlog 
Post\u003c/a\u003e\n\u003c/p\u003e\n\n# Search and Learn\n\nRecipes to enhance LLM capabilities by scaling inference-time compute. Name inspired by Rich Sutton's [Bitter Lesson](https://www.cs.utexas.edu/~eunsol/courses/data/bitter_lesson.pdf):\n\n\u003e One thing that should be learned from the bitter lesson is the great power of general purpose methods, of methods that continue to scale with increased computation even as the available computation becomes very great. The two methods that seem to scale arbitrarily in this way are _**search**_ and _**learning**_.\n\n## What is this?\n\nOver the last few years, the scaling of _**train-time compute**_ has dominated the progress of LLMs. Although this paradigm has proven to be remarkably effective, the resources needed to pretrain ever larger models are becoming prohibitively expensive, with billion-dollar clusters already on the horizon. This trend has sparked significant interest in a complementary approach: _**test-time compute scaling.**_ Rather than relying on ever-larger pretraining budgets, test-time methods use dynamic inference strategies that allow models to “think longer” on harder problems. A prominent example is OpenAI’s o1 model, which shows consistent improvement on difficult math and coding problems as one increases the amount of test-time compute.\n\nAlthough we don't know how o1 was trained, Search and Learn aims to fill that gap by providing the community with a series of recipes that enable open models to solve complex problems if you give them enough “time to think”. \n\n## News 🗞️\n\n* **December 16, 2024**: Initial release with code to replicate the test-time compute scaling results of our [blog post](https://huggingface.co/spaces/HuggingFaceH4/blogpost-scaling-test-time-compute).\n\n## How to navigate this project 🧭\n\nThis project is simple by design and mostly consists of:\n\n* [`scripts`](./scripts/) to scale test-time compute for open models. 
\n* [`recipes`](./recipes/) to apply different search algorithms at test time. Three algorithms are currently supported: Best-of-N, beam search, and Diverse Verifier Tree Search (DVTS). Each recipe takes the form of a YAML file that contains all the parameters associated with a single inference run. \n\nTo get started, we recommend the following:\n\n1. Follow the [installation instructions](#installation-instructions) to set up your environment.\n2. Replicate our test-time compute results by following the [recipe instructions](./recipes/README.md).\n\n## Contents\n\nThe initial release of Search and Learn will focus on the following techniques:\n\n* **Search against verifiers:** guide LLMs to search for solutions to \"verifiable problems\" (math, code) by using a stepwise or process reward model to score each step. Includes techniques like Best-of-N sampling and tree search.\n* **Training process reward models:** train reward models to provide a sequence of scores, one for each step of the reasoning process. This ability to provide fine-grained feedback makes PRMs a natural fit for search methods with LLMs.\n\n\n# Installation instructions\n\nTo run the code in this project, first, create a Python virtual environment using e.g. 
Conda:\n\n```shell\nconda create -n sal python=3.11 \u0026\u0026 conda activate sal\n```\n\n```shell\npip install -e '.[dev]'\n```\n\nNext, log into your Hugging Face account as follows:\n\n```shell\nhuggingface-cli login\n```\n\nFinally, install Git LFS so that you can push models to the Hugging Face Hub:\n\n```shell\nsudo apt-get install git-lfs\n```\n\nYou can now check out the `scripts` and `recipes` directories for instructions on how to scale test-time compute for open models!\n\n## Project structure\n\n```\n├── LICENSE\n├── Makefile                    \u003c- Makefile with commands like `make style`\n├── README.md                   \u003c- The top-level README for developers using this project\n├── recipes                     \u003c- Recipe configs, accelerate configs, slurm scripts\n├── scripts                     \u003c- Scripts to scale test-time compute for models\n├── pyproject.toml              \u003c- Installation config (mostly used for configuring code quality \u0026 tests)\n├── setup.py                    \u003c- Makes project pip installable (pip install -e .) 
so `sal` can be imported\n├── src                         \u003c- Source code for use in this project\n└── tests                       \u003c- Unit tests\n```\n\n## Replicating our test-time compute results\n\nThe [`recipes` README](recipes/README.md) includes launch commands and config files in order to replicate our results.\n\n\n## Citation\n\nIf you find the content of this repo useful in your work, please cite it as follows via `\\usepackage{biblatex}`:\n\n```\n@misc{beeching2024scalingtesttimecompute,\n      title={Scaling test-time compute with open models},\n      author={Edward Beeching and Lewis Tunstall and Sasha Rush},\n      url={https://huggingface.co/spaces/HuggingFaceH4/blogpost-scaling-test-time-compute},\n}\n```\n\nPlease also cite the original work by DeepMind upon which this repo is based:\n\n```\n@misc{snell2024scalingllmtesttimecompute,\n      title={Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters}, \n      author={Charlie Snell and Jaehoon Lee and Kelvin Xu and Aviral Kumar},\n      year={2024},\n      eprint={2408.03314},\n      archivePrefix={arXiv},\n      primaryClass={cs.LG},\n      url={https://arxiv.org/abs/2408.03314}, \n}\n```\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhuggingface%2Fsearch-and-learn","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhuggingface%2Fsearch-and-learn","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhuggingface%2Fsearch-and-learn/lists"}