https://github.com/ZJU-REAL/Self-Braking-Tuning
[NeurIPS 2025] Code for Let LLMs Break Free from Overthinking via Self-Braking Tuning. https://arxiv.org/abs/2505.14604
- Host: GitHub
- URL: https://github.com/ZJU-REAL/Self-Braking-Tuning
- Owner: ZJU-REAL
- License: apache-2.0
- Created: 2025-05-17T11:19:49.000Z (12 months ago)
- Default Branch: main
- Last Pushed: 2025-09-30T08:34:02.000Z (7 months ago)
- Last Synced: 2025-10-29T04:25:21.411Z (6 months ago)
- Language: Python
- Homepage: https://zju-real.github.io/SBT/
- Size: 4.1 MB
- Stars: 49
- Watchers: 0
- Forks: 0
- Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- StarryDivineSky - ZJU-REAL/Self-Braking-Tuning - ZJU-REAL/Self-Braking-Tuning is an open-source project based on the paper "Let LLMs Break Free from Overthinking via Self-Braking Tuning", aimed at solving the "overthinking" problem that large language models (LLMs) develop during training through a "self-braking tuning" technique. The core innovation is a mechanism that dynamically adjusts the model's training process: by introducing a self-braking strategy, it effectively prevents the performance degradation caused by excessive model complexity. Concretely, it dynamically monitors the model's prediction confidence during training, and when signs of "over-reasoning" are detected at a given step, it automatically reduces the magnitude of that step's gradient update, keeping the model from falling into local optima or overfitting. Experiments show that the method performs well across multiple benchmarks, improving reasoning efficiency while also strengthening generalization to unseen data. The project code implements the core algorithm of this tuning strategy, supports mainstream large-language-model architectures, and provides detailed training configurations and experimental comparisons. Compared with traditional tuning methods, Self-Braking Tuning requires no additional parameter tuning and delivers interpretable performance gains, making it especially suitable for applications that must balance reasoning speed and accuracy. The project is published on arXiv (2505.14604) with a complete code implementation and experimental data, making it easy for researchers to reproduce and build on. (A01_Text Generation_Text Dialogue / Large-language dialogue models and data)
README
# Let LRMs Break Free from Overthinking via Self-Braking Tuning
Overview of Self-Braking Tuning: Through a specialized data construction method and training strategy, our self-braking model is able to spontaneously halt overthinking.
## News 🔥🔥
- **2025.09.18:** Our paper has been accepted by **NeurIPS 2025**.
- **2025.05.20:** We release our paper.
## 📝 About
Self-Braking Tuning is a novel framework that unlocks the potential of large reasoning models to autonomously identify and terminate redundant reasoning, enabling the models to regulate their own reasoning processes without relying on external control mechanisms.
During fine-tuning, we use the Megatron-LM framework, with related parameters specified in [`configs/train.yaml`](configs/train.yaml); for evaluation, we employ the vLLM framework as the inference engine, with corresponding parameters located in [`configs/evaluation.yaml`](configs/evaluation.yaml).
Here, we provide a complete data construction framework that can be applied to nearly any long-chain tuning dataset, generating corresponding self-braking data accordingly.
## 🛠️ Preparation Steps Before Starting
In *Let LLMs Break Free from Overthinking via Self-Braking Tuning*, we performed self-braking tuning on the OpenR1-Math dataset. In fact, the approach applies to any long-chain reasoning dataset, as long as the reasoning segments are wrapped in `<think>` and `</think>` tags. Note that, prior to training, we recommend keeping the model's `max_position_embeddings` at 32,768. In addition, to extend the context length from 4k to 32k, we raise the RoPE base frequency to 300k.
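For concreteness, the context extension above amounts to two fields in a Hugging Face-style `config.json`. The snippet below is a minimal sketch rather than part of this repo: the field names `max_position_embeddings` and `rope_theta` are standard for Llama/Qwen-style models, and the path is a placeholder.

```python
# Minimal sketch (not repo code): apply the 32k-context settings described above
# to a Hugging Face-style model config. Path and field names are assumptions.
import json

CONFIG = "models/your-model/config.json"  # hypothetical path

with open(CONFIG) as f:
    cfg = json.load(f)

cfg["max_position_embeddings"] = 32768  # keep the 32k context window
cfg["rope_theta"] = 300000.0            # raise RoPE base frequency to 300k for the 4k -> 32k extension

with open(CONFIG, "w") as f:
    json.dump(cfg, f, indent=2)
```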
Our method requires access to an LLM, and the recommended way to provide this is by setting:
```bash
export APIKEY=
```
**Tip**: As a convenient default, we use an OpenAI API key. For large-scale datasets, however, we recommend deploying open-source models locally with vLLM or another framework and leveraging techniques such as batch processing for better scalability and cost efficiency.
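For orientation, here is a minimal sketch of how a script might consume this key through the OpenAI Python SDK. The model name and prompt are placeholders, not the repo's actual calls; the commented `base_url` line shows how the same client can point at a locally deployed vLLM OpenAI-compatible server instead.

```python
# Minimal sketch (not the repo's code): call an OpenAI-compatible endpoint
# using the APIKEY environment variable set above.
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["APIKEY"])
# For a local vLLM deployment, point at its OpenAI-compatible server instead:
# client = OpenAI(api_key="EMPTY", base_url="http://localhost:8000/v1")

resp = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[{"role": "user", "content": "Is this reasoning step redundant? Answer yes or no."}],
)
print(resp.choices[0].message.content)
```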
## 🚀 Quick Start
### 1. Install Dependencies
```bash
pip install -r requirements.txt
```
### 2. Download
```bash
python models/model_download.py
python data/datasets/download_benchmarks.py
```
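If you prefer to fetch artifacts manually, these scripts wrap the usual Hugging Face Hub pattern. The sketch below is illustrative only; the `repo_id` is a placeholder, not necessarily the model used in the paper.

```python
# Illustrative only: download a base model and cache it locally.
# The repo's own download scripts are the source of truth.
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="meta-llama/Llama-3.1-8B-Instruct",  # placeholder model id
    local_dir="models/base",
)
```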
### 3. Get the Baseline Data
```bash
python data/datasets/download_OpenR1-Math.py
```
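The dataset lives on the Hugging Face Hub, so a quick sanity check after downloading might look like the following. The `open-r1/OpenR1-Math-220k` id and the `problem` field reflect the public release and are assumptions about what the script fetches.

```python
# Quick sanity check (illustrative): load OpenR1-Math from the Hub.
# Dataset id and field name are assumptions based on the public release.
from datasets import load_dataset

ds = load_dataset("open-r1/OpenR1-Math-220k", split="train")
print(len(ds), "examples")
print(ds[0]["problem"][:200])
```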
### 4. Preprocess Data
```bash
python data/preprocessing/build_sbt-e.py
python data/preprocessing/build_sbt-d.py
```
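For intuition, the sketch below shows the general shape of self-braking data construction: keep the reasoning up to the point where the final answer first appears, then append a short natural-language "braking" statement so the model learns to halt instead of re-verifying. This is an illustration of the idea only, not the logic of `build_sbt-e.py` or `build_sbt-d.py`; the braking sentence and the tag handling are assumptions.

```python
# Illustrative sketch of self-braking data construction (not the repo's logic):
# truncate the <think> segment after the final answer first appears, then append
# a short stop statement so the tuned model learns to brake on its own.
import re

BRAKE = "\n\nWait, I've already reached the answer, so further checking is unnecessary."

def build_self_braking_sample(text: str, answer: str) -> str:
    m = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if not m:
        return text
    reasoning = m.group(1)
    idx = reasoning.find(answer)  # first point where the final answer shows up
    if idx == -1:
        return text
    truncated = reasoning[: idx + len(answer)] + BRAKE
    return text[: m.start(1)] + truncated + text[m.end(1):]
```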
### 5. Configure and Run Training / Evaluation
Refer to the configuration settings in the following files:
* `train.yaml`: Training settings
* `evaluation.yaml`: Evaluation settings
## 📖 Citation
If you find our work helpful, feel free to cite us.
```bibtex
@misc{zhao2025letllmsbreakfree,
      title={Let LLMs Break Free from Overthinking via Self-Braking Tuning},
      author={Haoran Zhao and Yuchen Yan and Yongliang Shen and Haolei Xu and Wenqi Zhang and Kaitao Song and Jian Shao and Weiming Lu and Jun Xiao and Yueting Zhuang},
      year={2025},
      eprint={2505.14604},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2505.14604},
}
```
## 📬 Contact Us
If you have any questions, please contact us by email:
ran159753@tju.edu.cn