{"id":26987554,"url":"https://github.com/open-thoughts/open-thoughts","last_synced_at":"2025-04-03T20:08:19.929Z","repository":{"id":274604961,"uuid":"923252926","full_name":"open-thoughts/open-thoughts","owner":"open-thoughts","description":"Fully open data curation for reasoning models","archived":false,"fork":false,"pushed_at":"2025-04-03T06:54:47.000Z","size":2698,"stargazers_count":1604,"open_issues_count":4,"forks_count":140,"subscribers_count":24,"default_branch":"main","last_synced_at":"2025-04-03T07:35:47.713Z","etag":null,"topics":["open-data","reasoning"],"latest_commit_sha":null,"homepage":"https://open-thoughts.ai","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/open-thoughts.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-01-27T22:28:56.000Z","updated_at":"2025-04-03T06:54:50.000Z","dependencies_parsed_at":null,"dependency_job_id":"dac74c70-1db3-471e-ac7b-c027e41aaa88","html_url":"https://github.com/open-thoughts/open-thoughts","commit_stats":null,"previous_names":["open-thoughts/open-thoughts"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/open-thoughts%2Fopen-thoughts","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/open-thoughts%2Fopen-thoughts/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/open-thoughts%2Fopen-thoughts/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/open-thoughts%2Fopen-thoughts/manife
sts","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/open-thoughts","download_url":"https://codeload.github.com/open-thoughts/open-thoughts/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247070921,"owners_count":20878586,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["open-data","reasoning"],"created_at":"2025-04-03T20:07:29.313Z","updated_at":"2025-04-03T20:08:19.778Z","avatar_url":"https://github.com/open-thoughts.png","language":"Python","funding_links":[],"categories":["A01_文本生成_文本对话","Large Language Model Data","Python","9. 
Evaluation, Benchmarks \u0026 Datasets"],"sub_categories":["大语言对话模型及数据","Cognition Engineering \u0026 Test-Time Scaling"],"readme":"\n\u003c!-- markdownlint-disable first-line-h1 --\u003e\n\u003c!-- markdownlint-disable html --\u003e\n\u003c!-- markdownlint-disable no-duplicate-header --\u003e\n\n\u003cdiv align=\"center\"\u003e\n  \u003cimg src=\"images/open_thoughts.png\" width=\"60%\" alt=\"Open Thoughts GitHub Repository\" /\u003e\n\u003c/div\u003e\n\u003cp align=\"center\"\u003e\n  \u003ca href=\"https://open-thoughts.ai\"\u003e\n    \u003cimg alt=\"Static Badge\" src=\"https://img.shields.io/badge/Home-open--thoughts.ai-blue?style=flat\u0026link=https%3A%2F%2Fopen-thoughts.ai\"\u003e\n  \u003c/a\u003e\n  \u003ca href=\"https://huggingface.co/open-thoughts\"\u003e\n    \u003cimg alt=\"Hugging Face\" src=\"https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Open%20Thoughts-blue?color=ffc107\u0026logoColor=white\u0026style=flat\u0026link=https%3A%2F%2Fhuggingface.co/open-thoughts\"\u003e\n  \u003c/a\u003e\n  \u003cbr\u003e\n  \u003ci\u003eCurating the best open reasoning datasets\u003c/i\u003e\u003cbr\u003e \n  A collaboration led by \u003ca href=\"https://bespokelabs.ai/\"\u003eBespoke Labs\u003c/a\u003e and the \u003ca href=\"https://www.datacomp.ai/\"\u003eDataComp\u003c/a\u003e community\n\n\u003c/p\u003e\n\u003chr\u003e\n\nOur first goal is to curate a reasoning dataset to train state-of-the-art small reasoning models that surpass [DeepSeek-R1-Distill-Qwen-32B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B) and [DeepSeek-R1-Distill-Qwen-7B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B) on math and code reasoning benchmarks.\n\n\n# News\n- **[2025/04/03]** 🎉 We release [OpenThoughts2-1M](https://huggingface.co/datasets/open-thoughts/OpenThoughts2-1M), [OpenThinker2-7B](https://huggingface.co/open-thoughts/OpenThinker2-7B), and [OpenThinker2-32B](https://huggingface.co/open-thoughts/OpenThinker2-32B).\n- 
**[2025/03/13]** 🎉 [OpenThoughts Alice in Wonderland Blogpost](https://www.open-thoughts.ai/blog/aiw) is out.\n- **[2025/02/16]** 🎉 [OpenThinker on Ollama](https://ollama.com/library/openthinker) reaches 400k downloads.\n- **[2025/02/14]** 🎉 Chat with OpenThinker in the [online playground](https://playground.bespokelabs.ai/).\n- **[2025/02/13]** 🎉 OpenThinker is now [available on Ollama](https://ollama.com/library/openthinker) for easy local inference.\n- **[2025/02/12]** 🎉 We release [OpenThinker-32B](https://huggingface.co/open-thoughts/OpenThinker-32B), the [best open-data reasoning model](https://www.open-thoughts.ai/blog/scale).\n- **[2025/02/02]** 🎉 [OpenThoughts-114k dataset](https://huggingface.co/datasets/open-thoughts/OpenThoughts-114k) is the #1 trending dataset on Hugging Face.\n- **[2025/01/30]** 🎉 Reasoning benchmarks are added to [Evalchemy](https://github.com/mlfoundations/Evalchemy) and [compared](https://www.open-thoughts.ai/blog/measure) to publicly reported scores.\n- **[2025/01/28]** 🎉 [Open Thoughts](https://www.open-thoughts.ai/) launches with [OpenThoughts-114k dataset](https://huggingface.co/datasets/open-thoughts/OpenThoughts-114k) and [OpenThinker-7B model](https://huggingface.co/open-thoughts/OpenThinker-7B).\n- **[2025/01/27]** 🎉 [Bespoke-Stratos-17k dataset](https://huggingface.co/datasets/bespokelabs/Bespoke-Stratos-17k) is the #2 trending dataset on Hugging Face.\n- **[2025/01/22]** 🎉 [Bespoke-Stratos-17k dataset](https://huggingface.co/datasets/bespokelabs/Bespoke-Stratos-17k) and [Bespoke-Stratos-32B model](https://huggingface.co/bespokelabs/Bespoke-Stratos-32B) are [announced](https://www.bespokelabs.ai/blog/bespoke-stratos-the-unreasonable-effectiveness-of-reasoning-distillation).\n\n# Results\nOur [OpenThinker2-32B](https://huggingface.co/open-thoughts/OpenThinker2-32B) model trained on [OpenThoughts2-1M](https://huggingface.co/datasets/open-thoughts/OpenThoughts2-1M) is the state of the art open-data reasoning model.\n\nThe 
numbers reported in the table below are evaluated with our open-source tool [Evalchemy](https://github.com/mlfoundations/Evalchemy).\n\n[OpenThinker2-32B](https://huggingface.co/open-thoughts/OpenThinker2-32B) vs other 32B models\n| Model                                                                                           | Data | AIME24 | AIME25 | AMC23 | MATH500 | GPQA-D | LCBv2 |\n| ----------------------------------------------------------------------------------------------- | ---- | ------ | ------ | ----- | ------- | ------ | ----- |\n| [OpenThinker2-32B](https://huggingface.co/open-thoughts/OpenThinker2-32B)                       | ✅    | 76.7   | 58.7   | 94.0  | 90.8    | 64.1   | 72.5  |\n| [OpenThinker-32B](https://huggingface.co/open-thoughts/OpenThinker-32B)                         | ✅    | 68.0   | 49.3   | 95.5  | 90.6    | 63.5   | 68.6  |\n| [DeepSeek-R1-Distill-Qwen-32B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-32B) | ❌    | 74.7   | 50.0   | 96.5  | 90.0    | 65.8   | 72.3  |\n| [Light-R1-32B](https://huggingface.co/qihoo360/Light-R1-32B)                                    | ✅    | 74.7   | 58.0   | 96.0  | 90.4    | 62.0   | 56.0  |\n| [S1.1-32B](https://huggingface.co/simplescaling/s1.1-32B)                                       | ✅    | 59.3   | 42.7   | 91.5  | 87.4    | 62.0   | 58.7  |\n\n[OpenThinker2-7B](https://huggingface.co/open-thoughts/OpenThinker2-7B) vs other 7B models\n| Model                                                                                         | Data | AIME24 | AIME25 | AMC23 | MATH500 | GPQA-D | LCBv2       |\n| --------------------------------------------------------------------------------------------- | ---- | ------ | ------ | ----- | ------- | ------ | ----------- |\n| [OpenThinker2-7B](https://huggingface.co/open-thoughts/OpenThinker2-7B)                       | ✅    | 50.0   | 33.3   | 89.5  | 88.4    | 49.3   | 55.6        |\n| 
[OpenThinker-7B](https://huggingface.co/open-thoughts/OpenThinker-7B)                         | ✅    | 31.3   | 23.3   | 74.5  | 83.2    | 42.9   | 38.0        |\n| [DeepSeek-R1-Distill-Qwen-7B](https://huggingface.co/deepseek-ai/DeepSeek-R1-Distill-Qwen-7B) | ❌    | 57.3   | 33.3   | 92.0  | 89.6    | 47.3   | 48.4        |\n| [OlympicCoder-7B](https://huggingface.co/open-r1/OlympicCoder-7B)                             | ✅    | 20.7   | 15.3   | 63.0  | 74.8    | 25.3   | 55.4        |\n| [OpenR1-Qwen-7B](https://huggingface.co/open-r1/OpenR1-Qwen-7B)                               | ✅    | 48.7   | 34.7   | 88.5  | 87.8    | 21.2   | 9.5         |\n\nTo mitigate variance in evaluation accuracy, we compute average scores over multiple evaluation runs with different seeds. We average over 5 runs for AIME and AMC, and 3 runs for the other tasks. No system prompt is used, the maximum token length is set to 32,768, and temperature is 0.7.\n\nWe are fully open-source. Our [model weights](https://huggingface.co/open-thoughts), [datasets](https://huggingface.co/open-thoughts), [data generation code](https://github.com/open-thoughts/open-thoughts), [evaluation code](https://github.com/mlfoundations/Evalchemy), and [training code](https://github.com/hiyouga/LLaMA-Factory) are all publicly available. \n\n# Installation\n```\nmake install\npoetry shell\n```\nSet the DeepSeek API key:\n```\nexport DEEPSEEK_API_KEY=your_api_key\n```\n\nSet HF_ORG to your organization id. Set HF_PRIVATE=true if you want to push to a private repo.\n```\nexport HF_ORG=your_org_id\nexport HF_PRIVATE=false\n```\n\n# OpenThoughts2-1M Data Generation\nThe [OpenThoughts2-1M](https://huggingface.co/datasets/open-thoughts/OpenThoughts2-1M) dataset is a combination of [OpenThoughts-114k](https://huggingface.co/datasets/open-thoughts/OpenThoughts-114k), [OpenR1](https://huggingface.co/open-r1), and our newly generated math and code reasoning data. 
We generate the additional math and code data by ablating on 26 different question generation methodologies and sampling from the highest performing ones.\n\nThe recipe is outlined below:\n\u003cpicture\u003e\n    \u003csource media=\"(prefers-color-scheme: light)\" width=\"100%\" srcset=\"images/openthoughts2-diagram.png\"\u003e\n    \u003cimg alt=\"Data Curation Recipe\" width=\"100%\" src=\"images/openthoughts2-diagram_dark.png\"\u003e\n\u003c/picture\u003e\n\nMore details can be found in our [blog post](https://www.open-thoughts.ai/blog/thinkagain). \n\n\n# OpenThoughts-114k Data Generation\n\nFor OpenThoughts-114k, we generate data for the following domains:\n1. Code\n2. Math\n3. Science\n4. Puzzle\n\nThe recipe is outlined below:\n\u003cpicture\u003e\n    \u003csource media=\"(prefers-color-scheme: light)\" width=\"100%\" srcset=\"images/diagram.png\"\u003e\n    \u003cimg alt=\"Data Curation Recipe\" width=\"100%\" src=\"images/diagram_dark.png\"\u003e\n\u003c/picture\u003e\n\nMore instructions are in [open_thoughts/README.md](open_thoughts/README.md).\n\n\n# Training and Evaluation\nTraining and evaluation code coming soon.\n\n# Links\n- 📊 [OpenThoughts2 and OpenThinker2 Blog Post](https://www.open-thoughts.ai/blog/thinkagain)\n- 💻 [Open Thoughts GitHub Repository](https://github.com/open-thoughts/open-thoughts)\n- 🧠 [OpenThoughts2-1M dataset](https://huggingface.co/datasets/open-thoughts/OpenThoughts2-1M)\n- 🤖 [OpenThinker2-7B model](https://huggingface.co/open-thoughts/OpenThinker2-7B)\n- 🤖 [OpenThinker2-32B model](https://huggingface.co/open-thoughts/OpenThinker2-32B)\n\n# Citation\n```\n@misc{Open Thoughts,\n  author = {Open Thoughts Team},\n  month = jan,\n  title = {{Open Thoughts}},\n  year = {2025}\n}\n```\n\n# About Us\n\nWe are a team of researchers and engineers from [Bespoke Labs](https://www.bespokelabs.ai/), Stanford, University of California Berkeley, University of Washington, UT Austin, Juelich Supercomputing Center (JSC), LAION, UCLA, UNC 
Chapel Hill, and Toyota Research Institute united around building the best datasets (and thus the best models). See our previous works at [datacomp.ai](https://www.datacomp.ai/) and [mlfoundations](https://github.com/mlfoundations).\n\n# Sponsors\nOpen Thoughts is supported by \n- [Bespoke Labs](https://www.bespokelabs.ai/)\n- [Lambda Labs](https://lambdalabs.com/)\n- [NSF IFML](https://www.ifml.institute/)\n- [UT Austin Machine Learning Lab](https://ml.utexas.edu/)\n- [Juelich Supercomputing Center](https://www.fz-juelich.de/en/ias/jsc)\n- [Toyota Research Institute](https://www.tri.global)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fopen-thoughts%2Fopen-thoughts","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fopen-thoughts%2Fopen-thoughts","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fopen-thoughts%2Fopen-thoughts/lists"}