Awesome-Efficient-Reasoning-Models
[Arxiv 2025] Efficient Reasoning Models: A Survey
https://github.com/fscdc/Awesome-Efficient-Reasoning-Models
Make Long CoT Short
- Time's Up! An Empirical Study of LLM Reasoning Ability Under Output Length Constraint
- CoT-RAG: Integrating Chain of Thought and Retrieval-Augmented Generation to Enhance Reasoning in Large Language Models
- Thought Manipulation: External Thought Can Be Efficient for Large Reasoning Models
- [The Benefits of a Concise Chain of Thought on Problem-Solving in Large Language Models](https://arxiv.org/abs/2401.05618). Matthew Renze, Erhan Guven. [Github](https://github.com/matthewrenze/jhu-concise-cot)
- Break the Chain: Large Language Models Can be Shortcut Reasoners
- [C3oT: Generating Shorter Chain-of-Thought without Compromising Effectiveness](https://arxiv.org/abs/2412.11664). Yu Kang, Xianghui Sun, Liangyu Chen, Wei Zou
- [Can Language Models Learn to Skip Steps?](https://arxiv.org/abs/2411.01855) (NeurIPS 2024). Tengxiao Liu, Qipeng Guo, Xiangkun Hu, Cheng Jiayang, Yue Zhang, Xipeng Qiu, Zheng Zhang. [Github](https://github.com/tengxiaoliu/LM_skip)
- [Token-Budget-Aware LLM Reasoning](https://arxiv.org/abs/2412.18547). Tingxu Han, Zhenting Wang, Chunrong Fang, Shiyu Zhao, Shiqing Ma, Zhenyu Chen. [Github](https://github.com/GeniusHTX/TALE)
- [O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning](https://arxiv.org/abs/2501.12570). Haotian Luo, Li Shen, Haiying He, Yibo Wang, Shiwei Liu, Wei Li, Naiqiang Tan, Xiaochun Cao, Dacheng Tao. [Github](https://github.com/StarDewXXX/O1-Pruner)
- Kimi k1.5: Scaling Reinforcement Learning with LLMs
- [Demystifying Long Chain-of-Thought Reasoning in LLMs](https://arxiv.org/abs/2502.03373). Edward Yeo, Yuxuan Tong, Morry Niu, Graham Neubig, Xiang Yue. [Github](https://github.com/eddycmu/demystify-long-cot)
- [Training Language Models to Reason Efficiently](https://arxiv.org/abs/2502.04463). Daman Arora, Andrea Zanette. [Github](https://github.com/Zanette-Labs/efficient-reasoning)
- [L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning](https://arxiv.org/abs/2503.04697). Pranjal Aggarwal, Sean Welleck. [Github](https://github.com/cmu-l3/l1)
- Distilling System 2 into System 1
- [TokenSkip: Controllable Chain-of-Thought Compression in Large Language Models](https://arxiv.org/abs/2502.12067). Heming Xia, Yongqi Li, Chak Tou Leong, Wenjie Wang, Wenjie Li. [Github](https://github.com/hemingkx/TokenSkip)
- Stepwise Perplexity-Guided Refinement for Efficient Chain-of-Thought Reasoning in Large Language Models
- Towards Thinking-Optimal Scaling of Test-Time Compute for LLM Reasoning
- [Self-Training Elicits Concise Reasoning in Large Language Models](https://arxiv.org/abs/2502.20122). Tergel Munkhbat, Namgyu Ho, Seo Hyun Kim, Yongjin Yang, Yujin Kim, Se-Young Yun. [Github](https://github.com/TergelMunkhbat/concise-reasoning)
- [Chain of Draft: Thinking Faster by Writing Less](https://arxiv.org/abs/2502.18600). Silei Xu, Wenhao Xie, Lingxiao Zhao, Pengcheng He. [Github](https://github.com/sileix/chain-of-draft)
- [Unlocking the Capabilities of Thought: A Reasoning Boundary Framework to Quantify and Optimize Chain-of-Thought](https://arxiv.org/abs/2410.05695). Qiguang Chen, Libo Qin, Jiaqi Wang, Jinxuan Zhou, Wanxiang Che. [Github](https://github.com/LightChen233/reasoning-boundary)
- [How Well do LLMs Compress Their Own Chain-of-Thought? A Token Complexity Approach](https://arxiv.org/abs/2503.01141)
- [Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching](https://arxiv.org/abs/2503.05179). Simon A. Aytes, Jinheon Baek, Sung Ju Hwang. [Github](https://github.com/SimonAytes/SoT)
- [Learning to Route LLMs with Confidence Tokens](https://arxiv.org/abs/2410.13284). Yu-Neng Chuang, Helen Zhou, Prathusha Kameswara Sarma, Parikshit Gopalan, John Boccio, Sara Bolouki, Xia Hu
- [Confident or Seek Stronger: Exploring Uncertainty-Based On-device LLM Routing From Benchmarking to Generalization](https://arxiv.org/abs/2502.04428). Yu-Neng Chuang, Leisheng Yu, Guanchu Wang, Lizhe Zhang, Zirui Liu, Xuanting Cai, Yang Sui, Vladimir Braverman, Xia Hu
- [Claude 3.7 Sonnet](https://www.anthropic.com/news/claude-3-7-sonnet)
- Beyond Chains of Thought: Benchmarking Latent-Space Reasoning Abilities in Large Language Models
- [RELAY](https://arxiv.org/abs/2502.08482). Qifan Yu, Zhenyu He, Sijie Li, Xun Zhou, Jun Zhang, Jingjing Xu, Di He. [Github](https://github.com/qifanyu/RELAY)
- [CoT-Valve: Length-Compressible Chain-of-Thought Tuning](https://arxiv.org/abs/2502.09601). Xinyin Ma, Guangnian Wan, Runpeng Yu, Gongfan Fang, Xinchao Wang. [Github](https://github.com/horseee/CoT-Valve)
- DAST: Difficulty-Adaptive Slow-Thinking for Large Reasoning Models
- Adaptive Group Policy Optimization: Towards Stable Training and Token-Efficient Reasoning
- [ThinkPrune: Pruning Long Chain-of-Thought of LLMs via Reinforcement Learning](https://arxiv.org/abs/2504.01296). Bairu Hou, Yang Zhang, Jiabao Ji, Yujian Liu, Kaizhi Qian, Jacob Andreas, Shiyu Chang. [Github](https://github.com/UCSB-NLP-Chang/ThinkPrune)
- Think When You Need: Self-Adaptive Chain-of-Thought Learning
- [Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach](https://arxiv.org/abs/2502.05171). Jonas Geiping, Sean McLeish, Neel Jain, John Kirchenbauer, Siddharth Singh, Brian R. Bartoldson, Bhavya Kailkhura, Abhinav Bhatele, Tom Goldstein. [Github](https://github.com/seal-rg/recurrent-pretraining)
- Weight-of-Thought Reasoning: Exploring Neural Network Weights for Enhanced LLM Reasoning
- [Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models](https://arxiv.org/abs/2402.07754). Jiacheng Ye, Shansan Gong, Liheng Chen, Lin Zheng, Jiahui Gao, Han Shi, Chuan Wu, Xin Jiang, Zhenguo Li, Wei Bi, Lingpeng Kong. [Github](https://github.com/HKUNLP/diffusion-of-thoughts)
- CODI: Compressing Chain-of-Thought into Continuous Space via Self-Distillation
- [LightThinker: Thinking Step-by-Step Compression](https://arxiv.org/abs/2502.15589). Jintian Zhang, Yuqi Zhu, Mengshu Sun, Yujie Luo, Shuofei Qiao, Lun Du, Da Zheng, Huajun Chen, Ningyu Zhang. [Github](https://github.com/zjunlp/LightThinker)
- [Guiding Language Model Reasoning with Planning Tokens](https://arxiv.org/abs/2310.05707) (COLM 2024). Xinyi Wang, Lucas Caccia, Oleksiy Ostapenko, Xingdi Yuan, William Yang Wang, Alessandro Sordoni. [Github](https://github.com/WANGXinyiLinda/planning_tokens)
- [Let's Think Dot by Dot: Hidden Computation in Transformer Language Models](https://arxiv.org/abs/2404.15758) (COLM 2024). Jacob Pfau, William Merrill, Samuel R. Bowman. [Github](https://github.com/JacobPfau/fillerTokens)
- [Disentangling Memory and Reasoning Ability in Large Language Models](https://arxiv.org/abs/2411.13504). Mingyu Jin, Weidi Luo, Sitao Cheng, Xinyi Wang, Wenyue Hua, Ruixiang Tang, William Yang Wang, Yongfeng Zhang. [Github](https://github.com/MingyuJ666/Disentangling-Memory-and-Reasoning)
- Token Assorted: Mixing Latent and Text Tokens for Improved Language Model Reasoning
- Training Large Language Models to Reason in a Continuous Latent Space
Build SLM with Strong Reasoning Ability
- [Small Models Struggle to Learn from Strong Reasoners](https://arxiv.org/abs/2502.12143). Yuetai Li, Xiang Yue, Zhangchen Xu, Fengqing Jiang, Luyao Niu, Bill Yuchen Lin, Bhaskar Ramasubramanian, Radha Poovendran. [Github](https://github.com/Small-Model-Gap/Small-Model-Learnability-Gap)
- Deconstructing Long Chain-of-Thought: A Structured Reasoning Optimization Framework for Long CoT Distillation
- [Small Language Models Need Strong Verifiers to Self-Correct Reasoning](https://arxiv.org/abs/2404.17140). Yunxiang Zhang, Muhammad Khalifa, Lajanugen Logeswaran, Jaekyeom Kim, Moontae Lee, Honglak Lee, Lu Wang. [Github](https://github.com/yunx-z/SCORE)
- Improving Mathematical Reasoning Capabilities of Small Language Models via Feedback-Driven Distillation
- Thinking Slow, Fast: Scaling Inference Compute with Distilled Reasoners
- Distilling Reasoning Ability from Large Language Models with Adaptive Thinking
- [Unveiling the Key Factors for Distilling Chain-of-Thought Reasoning](https://arxiv.org/abs/2502.18001). Xinghao Chen, Zhijing Sun, Wenjin Guo, Miaoran Zhang, Yanjun Chen, Yirong Sun, Hui Su, Yijie Pan, Dietrich Klakow, Wenjie Li, Xiaoyu Shen. [Github](https://github.com/EIT-NLP/Distilling-CoT-Reasoning)
- Towards Reasoning Ability of Small Language Models
- [Quantization Hurts Reasoning? An Empirical Study on Quantized Reasoning Models](https://arxiv.org/abs/2504.04823). Ruikang Liu, Yuxuan Sun, Manyi Zhang, Haoli Bai, Xianzhi Yu, Tiezheng Yu, Chun Yuan, Lu Hou. [Github](https://github.com/ruikangliu/Quantized-Reasoning-Models)
- When Reasoning Meets Compression: Benchmarking Compressed Large Reasoning Models on Complex Reasoning Tasks
- [Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't](https://arxiv.org/abs/2503.16219). Quy-Anh Dang, Chris Ngo. [Github](https://github.com/knoveleng/open-rs)
- [SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild](https://arxiv.org/abs/2503.18892). Weihao Zeng, Yuzhen Huang, Qian Liu, Wei Liu, Keqing He, Zejun Ma, Junxian He. [Github](https://github.com/hkust-nlp/simpleRL-reason)
- [DeepScaleR](https://agentica-project.com/)
Make Decoding More Efficient
- [xVerify: Efficient Answer Verifier for Reasoning Model Evaluations](https://arxiv.org/abs/2504.10481). Ding Chen, Qingchen Yu, Pengyuan Wang, Wentao Zhang, Bo Tang, Feiyu Xiong, Xinchi Li, Minchuan Yang, Zhiyu Li. [Github](https://github.com/IAAR-Shanghai/xVerify)
- [Let's Sample Step by Step: Adaptive-Consistency for Efficient Reasoning and Coding with LLMs](https://arxiv.org/abs/2305.11860). Pranjal Aggarwal, Aman Madaan, Yiming Yang, Mausam. [Github](https://github.com/Pranjal2041/AdaptiveConsistency)
- [Escape Sky-high Cost: Early-stopping Self-Consistency for Multi-step Reasoning](https://arxiv.org/abs/2401.10480) (ICLR 2024). Yiwei Li, Peiwen Yuan, Shaoxiong Feng, Boyuan Pan, Xinglin Wang, Bin Sun, Heda Wang, Kan Li. [Github](https://github.com/Yiwei98/ESC)
- [Think Deep, Think Fast: Investigating Efficiency of Verifier-free Inference-time-scaling Methods](https://arxiv.org/abs/2504.14047)
- [Reward-Guided Speculative Decoding for Efficient LLM Reasoning](https://arxiv.org/abs/2501.19324). Baohao Liao, Yuhui Xu, Hanze Dong, Junnan Li, Christof Monz, Silvio Savarese, Doyen Sahoo, Caiming Xiong. [Github](https://github.com/BaohaoLiao/RSD)
- Meta-Reasoner: Dynamic Guidance for Optimized Inference-time Reasoning in Large Language Models
- [Atom of Thoughts for Markov LLM Test-Time Scaling](https://arxiv.org/abs/2502.12018). Fengwei Teng, Zhaoyang Yu, Quan Shi, Jiayi Zhang, Chenglin Wu, Yuyu Luo. [Github](https://github.com/qixucen/atom)
- [Make Every Penny Count: Difficulty-Adaptive Self-Consistency for Cost-Efficient Reasoning](https://arxiv.org/abs/2408.13457) (NAACL Findings 2025). Xinglin Wang, Shaoxiong Feng, Yiwei Li, Peiwen Yuan, Yueqi Zhang, Chuyi Tan, Boyuan Pan, Yao Hu, Kan Li. [Github](https://github.com/WangXinglin/DSC)
- Path-Consistency: Prefix Enhancement for Efficient Inference in LLM
- [Bridging Internal Probability and Self-Consistency for Effective and Efficient LLM Reasoning](https://arxiv.org/abs/2502.00511)
- Confidence Improves Self-Consistency in LLMs
- [Efficient Test-Time Scaling via Self-Calibration](https://arxiv.org/abs/2503.00031). Chengsong Huang, Langlin Huang, Jixuan Leng, Jiacheng Liu, Jiaxin Huang. [Github](https://github.com/Chengsong-Huang/Self-Calibration)
- [Fast Best-of-N Decoding via Speculative Rejection](https://arxiv.org/abs/2410.20290). Hanshi Sun, Momin Haider, Ruiqi Zhang, Huitao Yang, Jiahao Qiu, Ming Yin, Mengdi Wang, Peter Bartlett, Andrea Zanette. [Github](https://github.com/Zanette-Labs/SpeculativeRejection)
- Sampling-Efficient Test-Time Scaling: Self-Estimating the Best-of-N Sampling in Early Decoding
- FastMCTS: A Simple Sampling Strategy for Data Synthesis
- [Non-myopic Generation of Language Models for Reasoning and Planning](https://arxiv.org/abs/2410.17195). Chang Ma, Haiteng Zhao, Junlei Zhang, Junxian He, Lingpeng Kong. [Github](https://github.com/chang-github-00/LLM-Predictive-Decoding)
- [Self-Taught Lookahead](https://arxiv.org/abs/2503.02878). Ethan Mendes, Alan Ritter. [Github](https://github.com/ethanm88/self-taught-lookahead)
- [ϕ-Decoding: Adaptive Foresight Sampling for Balanced Inference-Time Exploration and Exploitation](https://arxiv.org/abs/2503.13288). Fangzhi Xu, Hang Yan, Chang Ma, Haiteng Zhao, Jun Liu, Qika Lin, Zhiyong Wu. [Github](https://github.com/xufangzhi/phi-Decoding)
- Dynamic Parallel Tree Search for Efficient LLM Reasoning
- [Learning Adaptive Parallel Reasoning with Language Models](https://arxiv.org/abs/2504.15466). Jiayi Pan, Xiuyu Li, Long Lian, Charlie Snell, Yifei Zhou, Adam Yala, Trevor Darrell, Kurt Keutzer, Alane Suhr. [Github](https://github.com/Parallel-Reasoning/APR)
- [Skeleton-of-Thought: Prompting LLMs for Efficient Parallel Generation](https://arxiv.org/abs/2307.15337). Xuefei Ning, Zinan Lin, Zixuan Zhou, Zifu Wang, Huazhong Yang, Yu Wang. [Github](https://github.com/imagination-research/sot)
- Adaptive Skeleton Graph Decoding
- THOUGHTTERMINATOR: Benchmarking, Calibrating, and Mitigating Overthinking in Reasoning Models
- [Reward-Guided Speculative Decoding for Efficient LLM Reasoning](https://arxiv.org/abs/2501.19324) <br> Baohao Liao, Yuhui Xu, Hanze Dong, Junnan Li, Christof Monz, Silvio Savarese, Doyen Sahoo, Caiming Xiong |<img width="1002" alt="image" src="figures/rsd.png"> |[Github](https://github.com/BaohaoLiao/RSD) <br> [Paper](https://arxiv.org/abs/2501.19324)| [//]: #04/08
- Meta-Reasoner: Dynamic Guidance for Optimized Inference-time Reasoning in Large Language Models
- [Atom of Thoughts for Markov LLM Test-Time Scaling](https://arxiv.org/abs/2502.12018) <br> Fengwei Teng, Zhaoyang Yu, Quan Shi, Jiayi Zhang, Chenglin Wu, Yuyu Luo |<img width="1002" alt="image" src="figures/aot.png"> |[Github](https://github.com/qixucen/atom) <br> [Paper](https://arxiv.org/abs/2502.12018)| [//]: #04/08
- DISC: Dynamic Decomposition Improves LLM Inference Scaling
- From Chaos to Order: The Atomic Reasoner Framework for Fine-grained Reasoning in Large Language Models
- [Can Atomic Step Decomposition Enhance the Self-structured Reasoning of Multimodal Large Models?](https://arxiv.org/abs/2503.06252) <br> Kun Xiang, Zhili Liu, Zihao Jiang, Yunshuang Nie, Kaixin Cai, Yiyang Yin, Runhui Huang, Haoxiang Fan, Hanhui Li, Weiran Huang, Yihan Zeng, Yu-Jie Yuan, Jianhua Han, Lanqing Hong, Hang Xu, Xiaodan Liang |<img width="1002" alt="image" src="figures/atom.png"> |[Github](https://github.com/Quinn777/AtomThink) <br> [Paper](https://arxiv.org/abs/2503.06252)| [//]: #04/08
- [Inference Scaling Laws: An Empirical Analysis of Compute-Optimal Inference for Problem-Solving with Language Models](https://arxiv.org/abs/2408.00724) <br> Yangzhen Wu, Zhiqing Sun, Shanda Li, Sean Welleck, Yiming Yang |<img width="1002" alt="image" src="figures/scaling_law.png"> |[Github](https://github.com/thu-wyz/inference_scaling) <br> [Paper](https://arxiv.org/abs/2408.00724)| [//]: #04/08
- [Optimizing Test-Time Compute via Meta Reinforcement Fine-Tuning](https://arxiv.org/abs/2503.07572) <br> Yuxiao Qu, Matthew Y. R. Yang, Amrith Setlur, Lewis Tunstall, Edward Emanuel Beeching, Ruslan Salakhutdinov, Aviral Kumar |<img width="1002" alt="image" src="figures/mrt.png"> |[Github](https://github.com/CMU-AIRe/MRT) <br> [Paper](https://arxiv.org/abs/2503.07572)| [//]: #04/08
- [SpecReason: Fast and Accurate Inference-Time Compute via Speculative Reasoning](https://arxiv.org/abs/2504.07891) <br> Rui Pan, Yinwei Dai, Zhihao Zhang, Gabriele Oliaro, Zhihao Jia, Ravi Netravali |<img width="1002" alt="image" src="figures/specreason.png"> |[Github](https://github.com/ruipeterpan/specreason) <br> [Paper](https://arxiv.org/abs/2504.07891)| [//]: #04/14
- [Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters](https://arxiv.org/abs/2408.03314) <br> Charlie Snell, Jaehoon Lee, Kelvin Xu, Aviral Kumar |<img width="1002" alt="image" src="figures/tts_effective.png"> |[Paper](https://arxiv.org/abs/2408.03314)| [//]: #04/08
-
Evaluation and Benchmarks
- THOUGHTTERMINATOR: Benchmarking, Calibrating, and Mitigating Overthinking in Reasoning Models
- [LLM Stability: A detailed analysis with some surprises](https://arxiv.org/abs/2408.04667) <br> Berk Atil, Sarp Aykent, Alexa Chittams, Lisheng Fu, Rebecca J. Passonneau, Evan Radcliffe, Guru Rajan Rajagopal, Adam Sloan, Tomasz Tudrej, Ferhan Ture, Zhe Wu, Lixinyu Xu, Breck Baldwin |<img width="1002" alt="image" src="https://arxiv.org/html/2408.04667v5/extracted/6331111/max_min_diff.png"> |[Github](https://github.com/breckbaldwin/llm-stability) <br> [Paper](https://arxiv.org/abs/2408.04667)| [//]: #04/08
- The Danger of Overthinking: Examining the Reasoning-Action Dilemma in Agentic Tasks
- Evaluating Large Language Models Trained on Code
- τ-bench: A Benchmark for Tool-Agent-User Interaction in Real-World Domains
- [Are Your LLMs Capable of Stable Reasoning?](https://arxiv.org/abs/2412.13147) <br> Junnan Liu, Hongwei Liu, Linchen Xiao, Ziyi Wang, Kuikun Liu, Songyang Gao, Wenwei Zhang, Songyang Zhang, Kai Chen |<img width="1002" alt="image" src="https://arxiv.org/html/2412.13147v3/x1.png"> |[Github](https://github.com/open-compass/GPassK) <br> [Paper](https://arxiv.org/abs/2412.13147)| [//]: #04/08
- [LongPerceptualThoughts: Distilling System-2 Reasoning for System-1 Perception](https://arxiv.org/abs/2504.15362) <br> Yuan-Hong Liao, Sven Elflein, Liu He, Laura Leal-Taixé, Yejin Choi, Sanja Fidler, David Acuna |<img width="1002" alt="image" src="https://arxiv.org/html/2504.15362v1/x1.png"> |[Paper](https://arxiv.org/abs/2504.15362)| [//]: #04/23
- [Inference-Time Computations for LLM Reasoning and Planning: A Benchmark and Insights](https://arxiv.org/abs/2502.12521) <br> Shubham Parashar, Blake Olson, Sambhav Khurana, Eric Li, Hongyi Ling, James Caverlee, Shuiwang Ji |<img width="1002" alt="image" src="https://arxiv.org/html/2502.12521v1/x1.png"> |[Github](https://github.com/divelab/sys2bench) <br> [Paper](https://arxiv.org/abs/2502.12521)| [//]: #04/08
- Do NOT Think That Much for 2+3=? On the Overthinking of o1-Like LLMs
- [CoT-Valve: Length-Compressible Chain-of-Thought Tuning](https://arxiv.org/abs/2502.09601) <br> Xinyin Ma, Guangnian Wan, Runpeng Yu, Gongfan Fang, Xinchao Wang |<img width="1002" alt="image" src="figures/cot_valve.png"> |[Github](https://github.com/horseee/CoT-Valve) <br> [Paper](https://arxiv.org/abs/2502.09601)|[//]: #03/16
- [Bag of Tricks for Inference-time Computation of LLM Reasoning](https://arxiv.org/abs/2502.07191) <br> Fan Liu, Wenshuo Chao, Naiqiang Tan, Hao Liu |<img width="1002" alt="image" src="https://arxiv.org/html/2502.07191v4/x1.png"> |[Github](https://github.com/usail-hkust/benchmark_inference_time_computation_LLM) <br> [Paper](https://arxiv.org/abs/2502.07191)| [//]: #04/08
- [Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling](https://arxiv.org/abs/2502.06703) <br> Runze Liu, Junqi Gao, Jian Zhao, Kaiyan Zhang, Xiu Li, Biqing Qi, Wanli Ouyang, Bowen Zhou |<img width="1002" alt="image" src="https://arxiv.org/html/2502.06703v1/x2.png"> |[Github](https://github.com/RyanLiu112/compute-optimal-tts) <br> [Paper](https://arxiv.org/abs/2502.06703)| [//]: #04/08
- DNA Bench: When Silence is Smarter -- Benchmarking Over-Reasoning in Reasoning LLMs
- S1-Bench: A Simple Benchmark for Evaluating System 1 Thinking Capability of Large Reasoning Models
- [VCR-Bench: A Comprehensive Evaluation Framework for Video Chain-of-Thought Reasoning](https://arxiv.org/abs/2504.07956) <br> Yukun Qi, Yiming Zhao, Yu Zeng, Xikun Bao, Wenxuan Huang, Lin Chen, Zehui Chen, Jie Zhao, Zhongang Qi, Feng Zhao |<img width="1002" alt="image" src="figures/video.png"> |[Github](https://github.com/zhishuifeiqian/VCR-Bench) <br> [Paper](https://arxiv.org/abs/2504.07956)| [//]: #04/16
-
Background Papers
- [Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?](https://arxiv.org/abs/2504.13837) <br> Yang Yue, Zhiqi Chen, Rui Lu, Andrew Zhao, Zhaokai Wang, Yang Yue, Shiji Song, Gao Huang |<img width="1002" alt="image" src="https://arxiv.org/html/2504.13837v1/x1.png"> |[Github](https://github.com/LeapLabTHU/limit-of-RLVR) <br> [Paper](https://arxiv.org/abs/2504.13837)| [//]: #04/22
- [Chain-of-Thought Prompting Elicits Reasoning in Large Language Models](https://arxiv.org/abs/2201.11903) <br> Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Brian Ichter, Fei Xia, Ed Chi, Quoc Le, Denny Zhou |<img width="1002" alt="image" src="figures/cot_prompting.png"> |[Paper](https://arxiv.org/abs/2201.11903)| [//]: #04/08
- [Tree of Thoughts: Deliberate Problem Solving with Large Language Models](https://arxiv.org/abs/2305.10601) <br> Shunyu Yao, Dian Yu, Jeffrey Zhao, Izhak Shafran, Thomas L. Griffiths, Yuan Cao, Karthik Narasimhan |<img width="1002" alt="image" src="https://arxiv.org/html/2305.10601v2/x1.png"> |[Github](https://github.com/princeton-nlp/tree-of-thought-llm) <br> [Paper](https://arxiv.org/abs/2305.10601)| [//]: #04/08
- [Graph of Thoughts: Solving Elaborate Problems with Large Language Models](https://arxiv.org/abs/2308.09687) <br> Maciej Besta, Nils Blach, Ales Kubicek, Robert Gerstenberger, Michal Podstawski, Lukas Gianinazzi, Joanna Gajda, Tomasz Lehmann, Hubert Niewiadomski, Piotr Nyczyk, Torsten Hoefler |<img width="1002" alt="image" src="figures/got.png"> |[Github](https://github.com/spcl/graph-of-thoughts) <br> [Paper](https://arxiv.org/abs/2308.09687)| [//]: #04/08
- [Self-Consistency Improves Chain of Thought Reasoning in Language Models](https://arxiv.org/abs/2203.11171) <br> Xuezhi Wang, Jason Wei, Dale Schuurmans, Quoc Le, Ed Chi, Sharan Narang, Aakanksha Chowdhery, Denny Zhou |<img width="1002" alt="image" src="figures/sc.png"> |[Paper](https://arxiv.org/abs/2203.11171)| [//]: #04/08
- [Chain-of-Symbol Prompting Elicits Planning in Large Language Models](https://arxiv.org/abs/2305.10276) <br> Hanxu Hu, Hongyuan Lu, Huajian Zhang, Yun-Ze Song, Wai Lam, Yue Zhang |<img width="1002" alt="image" src="https://arxiv.org/html/2305.10276v7/x1.png"> |[Github](https://github.com/hanxuhu/chain-of-symbol-planning) <br> [Paper](https://arxiv.org/abs/2305.10276)| [//]: #04/08
- Thinking Machines: A Survey of LLM based Reasoning Strategies
- [From System 1 to System 2: A Survey of Reasoning Large Language Models](https://arxiv.org/abs/2502.17419) <br> Zhong-Zhi Li, Duzhen Zhang, Ming-Liang Zhang, Jiaxin Zhang, Zengyan Liu, Yuxuan Yao, Haotian Xu, Junhao Zheng, Pei-Jie Wang, Xiuyi Chen, Yingying Zhang, Fei Yin, Jiahua Dong, Zhijiang Guo, Le Song, Cheng-Lin Liu |<img width="1002" alt="image" src="https://arxiv.org/html/2502.17419v2/extracted/6232702/images/timeline.png"> |[Github](https://github.com/zzli2022/Awesome-System2-Reasoning-LLM) <br> [Paper](https://arxiv.org/abs/2502.17419)| [//]: #04/08
- [Program of Thoughts Prompting: Disentangling Computation from Reasoning for Numerical Reasoning Tasks](https://arxiv.org/abs/2211.12588) <br> Wenhu Chen, Xueguang Ma, Xinyi Wang, William W. Cohen |<img width="1002" alt="image" src="figures/pot.png"> |[Github](https://github.com/TIGER-AI-Lab/Program-of-Thoughts) <br> [Paper](https://arxiv.org/abs/2211.12588)| [//]: #04/08