
Awesome-Efficient-Reasoning-Models

[arXiv 2025] Efficient Reasoning Models: A Survey
https://github.com/fscdc/Awesome-Efficient-Reasoning-Models


  • Full list

    • Make Long CoT Short

      • Time's Up! An Empirical Study of LLM Reasoning Ability Under Output Length Constraint
      • CoT-RAG: Integrating Chain of Thought and Retrieval-Augmented Generation to Enhance Reasoning in Large Language Models
      • Thought Manipulation: External Thought Can Be Efficient for Large Reasoning Models
      • [The Benefits of a Concise Chain of Thought on Problem-Solving in Large Language Models](https://arxiv.org/abs/2401.05618) (FLLM 2024) | Matthew Renze, Erhan Guven | [Github](https://github.com/matthewrenze/jhu-concise-cot)
      • Break the Chain: Large Language Models Can be Shortcut Reasoners
      • [C3oT: Generating Shorter Chain-of-Thought without Compromising Effectiveness](https://arxiv.org/abs/2412.11664) | Yu Kang, Xianghui Sun, Liangyu Chen, Wei Zou
      • [Can Language Models Learn to Skip Steps?](https://arxiv.org/abs/2411.01855) (NeurIPS 2024) | Tengxiao Liu, Qipeng Guo, Xiangkun Hu, Cheng Jiayang, Yue Zhang, Xipeng Qiu, Zheng Zhang | [Github](https://github.com/tengxiaoliu/LM_skip)
      • [Token-Budget-Aware LLM Reasoning](https://arxiv.org/abs/2412.18547) | Tingxu Han, Zhenting Wang, Chunrong Fang, Shiyu Zhao, Shiqing Ma, Zhenyu Chen | [Github](https://github.com/GeniusHTX/TALE)
      • [O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning](https://arxiv.org/abs/2501.12570) | Haotian Luo, Li Shen, Haiying He, Yibo Wang, Shiwei Liu, Wei Li, Naiqiang Tan, Xiaochun Cao, Dacheng Tao | [Github](https://github.com/StarDewXXX/O1-Pruner)
      • Kimi k1.5: Scaling Reinforcement Learning with LLMs
      • [Demystifying Long Chain-of-Thought Reasoning in LLMs](https://arxiv.org/abs/2502.03373) | Edward Yeo, Yuxuan Tong, Morry Niu, Graham Neubig, Xiang Yue | [Github](https://github.com/eddycmu/demystify-long-cot)
      • [Training Language Models to Reason Efficiently](https://arxiv.org/abs/2502.04463) | Daman Arora, Andrea Zanette | [Github](https://github.com/Zanette-Labs/efficient-reasoning)
      • [L1: Controlling How Long A Reasoning Model Thinks With Reinforcement Learning](https://arxiv.org/abs/2503.04697) | Pranjal Aggarwal, Sean Welleck | [Github](https://github.com/cmu-l3/l1)
      • Distilling System 2 into System 1
      • [TokenSkip: Controllable Chain-of-Thought Compression in LLMs](https://arxiv.org/abs/2502.12067) | Heming Xia, Yongqi Li, Chak Tou Leong, Wenjie Wang, Wenjie Li | [Github](https://github.com/hemingkx/TokenSkip)
      • Stepwise Perplexity-Guided Refinement for Efficient Chain-of-Thought Reasoning in Large Language Models
      • Towards Thinking-Optimal Scaling of Test-Time Compute for LLM Reasoning
      • [Self-Training Elicits Concise Reasoning in Large Language Models](https://arxiv.org/abs/2502.20122) | Tergel Munkhbat, Namgyu Ho, Seo Hyun Kim, Yongjin Yang, Yujin Kim, Se-Young Yun | [Github](https://github.com/TergelMunkhbat/concise-reasoning)
      • [Chain of Draft: Thinking Faster by Writing Less](https://arxiv.org/abs/2502.18600) | Silei Xu, Wenhao Xie, Lingxiao Zhao, Pengcheng He | [Github](https://github.com/sileix/chain-of-draft) (a budget-prompt sketch in this spirit closes this subsection)
      • [Unlocking the Capabilities of Thought: A Reasoning Boundary Framework to Quantify and Optimize Chain-of-Thought](https://arxiv.org/abs/2410.05695) (NeurIPS 2024) | Qiguang Chen, Libo Qin, Jiaqi Wang, Jinxuan Zhou, Wanxiang Che | [Github](https://github.com/LightChen233/reasoning-boundary)
      • [How Well do LLMs Compress Their Own Chain-of-Thought? A Token Complexity Approach](https://arxiv.org/abs/2503.01141)
      • [Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching](https://arxiv.org/abs/2503.05179) | Simon A. Aytes, Jinheon Baek, Sung Ju Hwang | [Github](https://github.com/SimonAytes/SoT)
      • [Learning to Route LLMs with Confidence Tokens](https://arxiv.org/abs/2410.13284) | Yu-Neng Chuang, Helen Zhou, Prathusha Kameswara Sarma, Parikshit Gopalan, John Boccio, Sara Bolouki, Xia Hu
      • [Confident or Seek Stronger: Exploring Uncertainty-Based On-device LLM Routing From Benchmarking to Generalization](https://arxiv.org/abs/2502.04428) | Yu-Neng Chuang, Leisheng Yu, Guanchu Wang, Lizhe Zhang, Zirui Liu, Xuanting Cai, Yang Sui, Vladimir Braverman, Xia Hu
      • [Claude 3.7 Sonnet](https://www.anthropic.com/news/claude-3-7-sonnet)
      • Beyond Chains of Thought: Benchmarking Latent-Space Reasoning Abilities in Large Language Models
      • Compressed Chain of Thought: Efficient Reasoning Through Dense Representations
      • SoftCoT: Soft Chain-of-Thought for Efficient Reasoning with LLMs
      • [Enhancing Auto-regressive Chain-of-Thought through Loop-Aligned Reasoning](https://arxiv.org/abs/2502.08482) | Qifan Yu, Zhenyu He, Sijie Li, Xun Zhou, Jun Zhang, Jingjing Xu, Di He | [Github](https://github.com/qifanyu/RELAY)
      • [CoT-Valve: Length-Compressible Chain-of-Thought Tuning](https://arxiv.org/abs/2502.09601) | Xinyin Ma, Guangnian Wan, Runpeng Yu, Gongfan Fang, Xinchao Wang | [Github](https://github.com/horseee/CoT-Valve)
      • DAST: Difficulty-Adaptive Slow-Thinking for Large Reasoning Models
      • Adaptive Group Policy Optimization: Towards Stable Training and Token-Efficient Reasoning
      • [ThinkPrune: Pruning Long Chain-of-Thought of LLMs via Reinforcement Learning](https://arxiv.org/abs/2504.01296) | Bairu Hou, Yang Zhang, Jiabao Ji, Yujian Liu, Kaizhi Qian, Jacob Andreas, Shiyu Chang | [Github](https://github.com/UCSB-NLP-Chang/ThinkPrune)
      • Think When You Need: Self-Adaptive Chain-of-Thought Learning
      • [Scaling up Test-Time Compute with Latent Reasoning: A Recurrent Depth Approach](https://arxiv.org/abs/2502.05171) | Jonas Geiping, Sean McLeish, Neel Jain, John Kirchenbauer, Siddharth Singh, Brian R. Bartoldson, Bhavya Kailkhura, Abhinav Bhatele, Tom Goldstein | [Github](https://github.com/seal-rg/recurrent-pretraining)
      • Weight-of-Thought Reasoning: Exploring Neural Network Weights for Enhanced LLM Reasoning
      • [Diffusion of Thoughts: Chain-of-Thought Reasoning in Diffusion Language Models](https://arxiv.org/abs/2402.07754) (NeurIPS 2024) | Jiacheng Ye, Shansan Gong, Liheng Chen, Lin Zheng, Jiahui Gao, Han Shi, Chuan Wu, Xin Jiang, Zhenguo Li, Wei Bi, Lingpeng Kong | [Github](https://github.com/HKUNLP/diffusion-of-thoughts)
      • CODI: Compressing Chain-of-Thought into Continuous Space via Self-Distillation
      • [LightThinker: Thinking Step-by-Step Compression](https://arxiv.org/abs/2502.15589) | Jintian Zhang, Yuqi Zhu, Mengshu Sun, Yujie Luo, Shuofei Qiao, Lun Du, Da Zheng, Huajun Chen, Ningyu Zhang | [Github](https://github.com/zjunlp/LightThinker)
      • [Guiding Language Model Reasoning with Planning Tokens](https://arxiv.org/abs/2310.05707) (COLM 2024) | Xinyi Wang, Lucas Caccia, Oleksiy Ostapenko, Xingdi Yuan, William Yang Wang, Alessandro Sordoni | [Github](https://github.com/WANGXinyiLinda/planning_tokens)
      • [Let's Think Dot by Dot: Hidden Computation in Transformer Language Models](https://arxiv.org/abs/2404.15758) (COLM 2024) | Jacob Pfau, William Merrill, Samuel R. Bowman | [Github](https://github.com/JacobPfau/fillerTokens)
      • [Disentangling Memory and Reasoning Ability in Large Language Models](https://arxiv.org/abs/2411.13504) | Mingyu Jin, Weidi Luo, Sitao Cheng, Xinyi Wang, Wenyue Hua, Ruixiang Tang, William Yang Wang, Yongfeng Zhang | [Github](https://github.com/MingyuJ666/Disentangling-Memory-and-Reasoning)
      • Token Assorted: Mixing Latent and Text Tokens for Improved Language Model Reasoning
      • Training Large Language Models to Reason in a Continuous Latent Space
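
A minimal prompt sketch for the budget-style entries above (Chain of Draft, Token-Budget-Aware LLM Reasoning), which cut chain length purely through instructions. Everything here is illustrative: `budgeted_prompt`, its wording, and the 50-token default are assumptions, not the papers' exact prompts.

```python
def budgeted_prompt(question: str, budget_tokens: int = 50) -> str:
    """Wrap a question with a draft-style length constraint: terse steps
    (in the spirit of Chain of Draft) plus an explicit token budget
    (in the spirit of Token-Budget-Aware LLM Reasoning)."""
    return (
        f"{question}\n"
        "Think step by step, but keep each step to a short draft of at most "
        f"five words, and use no more than {budget_tokens} tokens in total. "
        "Give the final answer after '####'."
    )

# Example: the string below would be sent to any chat-completion model.
print(budgeted_prompt("A train travels 120 km in 2 hours. What is its speed?"))
```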
    • Build SLM with Strong Reasoning Ability

      • [Teaching Small Language Models Reasoning through Counterfactual Distillation](https://aclanthology.org/2024.emnlp-main.333/) (EMNLP 2024) | Tao Feng, Yicheng Li, Li Chenglin, Hao Chen, Fei Yu, Yin Zhang
      • [Small Models Struggle to Learn from Strong Reasoners](https://arxiv.org/abs/2502.12143) | Yuetai Li, Xiang Yue, Zhangchen Xu, Fengqing Jiang, Luyao Niu, Bill Yuchen Lin, Bhaskar Ramasubramanian, Radha Poovendran | [Github](https://github.com/Small-Model-Gap/Small-Model-Learnability-Gap)
      • Deconstructing Long Chain-of-Thought: A Structured Reasoning Optimization Framework for Long CoT Distillation
      • [Small Language Models Need Strong Verifiers to Self-Correct Reasoning](https://arxiv.org/abs/2404.17140) (ACL Findings 2024) | Yunxiang Zhang, Muhammad Khalifa, Lajanugen Logeswaran, Jaekyeom Kim, Moontae Lee, Honglak Lee, Lu Wang | [Github](https://github.com/yunx-z/SCORE)
      • Improving Mathematical Reasoning Capabilities of Small Language Models via Feedback-Driven Distillation
      • [Probe then Retrieve and Reason: Distilling Probing and Reasoning Capabilities into Smaller Language Models](https://aclanthology.org/2024.lrec-main.1140.pdf) (LREC-COLING 2024) | Yichun Zhao, Shuheng Zhou, Huijia Zhu
      • Thinking Slow, Fast: Scaling Inference Compute with Distilled Reasoners
      • Distilling Reasoning Ability from Large Language Models with Adaptive Thinking
      • [Unveiling the Key Factors for Distilling Chain-of-Thought Reasoning](https://arxiv.org/abs/2502.18001) | Xinghao Chen, Zhijing Sun, Wenjin Guo, Miaoran Zhang, Yanjun Chen, Yirong Sun, Hui Su, Yijie Pan, Dietrich Klakow, Wenjie Li, Xiaoyu Shen | [Github](https://github.com/EIT-NLP/Distilling-CoT-Reasoning) (a distillation-data sketch closes this subsection)
      • Towards Reasoning Ability of Small Language Models
      • [Quantization Hurts Reasoning? An Empirical Study on Quantized Reasoning Models](https://arxiv.org/abs/2504.04823) | Ruikang Liu, Yuxuan Sun, Manyi Zhang, Haoli Bai, Xianzhi Yu, Tiezheng Yu, Chun Yuan, Lu Hou | [Github](https://github.com/ruikangliu/Quantized-Reasoning-Models)
      • When Reasoning Meets Compression: Benchmarking Compressed Large Reasoning Models on Complex Reasoning Tasks
      • [Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't](https://arxiv.org/abs/2503.16219) | Quy-Anh Dang, Chris Ngo | [Github](https://github.com/knoveleng/open-rs)
      • [SimpleRL-Zoo: Investigating and Taming Zero Reinforcement Learning for Open Base Models in the Wild](https://arxiv.org/abs/2503.18892) | Weihao Zeng, Yuzhen Huang, Qian Liu, Wei Liu, Keqing He, Zejun Ma, Junxian He | [Github](https://github.com/hkust-nlp/simpleRL-reason)
      • [DeepScaleR](https://agentica-project.com/)
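
Most distillation entries above share one data recipe: sample a chain of thought from a strong teacher, keep it only if the final answer matches a reference, and fine-tune the student on the surviving traces. A minimal sketch under stated assumptions: `teacher_cot` stands in for a large-model call and `gold` for reference answers; real pipelines add filtering, reformatting, and length control.

```python
def build_sft_pairs(questions, gold, teacher_cot):
    """Collect (prompt, completion) pairs for supervised fine-tuning of a
    small model, keeping only teacher traces that end in the right answer."""
    pairs = []
    for q in questions:
        reasoning, answer = teacher_cot(q)  # one teacher sample per question
        if answer == gold[q]:               # answer-match filter
            pairs.append({"prompt": q, "completion": f"{reasoning}\n#### {answer}"})
    return pairs

# Toy run with a hard-coded "teacher", for illustration only:
demo_teacher = lambda q: ("2 + 2 = 4", "4")
print(build_sft_pairs(["What is 2 + 2?"], {"What is 2 + 2?": "4"}, demo_teacher))
```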
    • Make Decoding More Efficient

      • [xVerify: Efficient Answer Verifier for Reasoning Model Evaluations](https://arxiv.org/abs/2504.10481) | Ding Chen, Qingchen Yu, Pengyuan Wang, Wentao Zhang, Bo Tang, Feiyu Xiong, Xinchi Li, Minchuan Yang, Zhiyu Li | [Github](https://github.com/IAAR-Shanghai/xVerify)
      • [Let's Sample Step by Step: Adaptive-Consistency for Efficient Reasoning and Coding with LLMs](https://arxiv.org/abs/2305.11860) | Pranjal Aggarwal, Aman Madaan, Yiming Yang, Mausam | [Github](https://github.com/Pranjal2041/AdaptiveConsistency)
      • [Escape Sky-high Cost: Early-stopping Self-Consistency for Multi-step Reasoning](https://arxiv.org/abs/2401.10480) (ICLR 2024) | Yiwei Li, Peiwen Yuan, Shaoxiong Feng, Boyuan Pan, Xinglin Wang, Bin Sun, Heda Wang, Kan Li | [Github](https://github.com/Yiwei98/ESC) (a sketch of the early-stopping idea closes this subsection)
      • [Think Deep, Think Fast: Investigating Efficiency of Verifier-free Inference-time-scaling Methods](https://arxiv.org/abs/2504.14047)
      • [Reward-Guided Speculative Decoding for Efficient LLM Reasoning](https://arxiv.org/abs/2501.19324) | Baohao Liao, Yuhui Xu, Hanze Dong, Junnan Li, Christof Monz, Silvio Savarese, Doyen Sahoo, Caiming Xiong | [Github](https://github.com/BaohaoLiao/RSD)
      • Meta-Reasoner: Dynamic Guidance for Optimized Inference-time Reasoning in Large Language Models
      • [Atom of Thoughts for Markov LLM Test-Time Scaling](https://arxiv.org/abs/2502.12018) | Fengwei Teng, Zhaoyang Yu, Quan Shi, Jiayi Zhang, Chenglin Wu, Yuyu Luo | [Github](https://github.com/qixucen/atom)
      • [Make Every Penny Count: Difficulty-Adaptive Self-Consistency for Cost-Efficient Reasoning](https://arxiv.org/abs/2408.13457) (NAACL Findings 2025) | Xinglin Wang, Shaoxiong Feng, Yiwei Li, Peiwen Yuan, Yueqi Zhang, Chuyi Tan, Boyuan Pan, Yao Hu, Kan Li | [Github](https://github.com/WangXinglin/DSC)
      • Path-Consistency: Prefix Enhancement for Efficient Inference in LLM
      • [Bridging Internal Probability and Self-Consistency for Effective and Efficient LLM Reasoning](https://arxiv.org/abs/2502.00511)
      • Confidence Improves Self-Consistency in LLMs
      • [Efficient Test-Time Scaling via Self-Calibration](https://arxiv.org/abs/2503.00031) | Chengsong Huang, Langlin Huang, Jixuan Leng, Jiacheng Liu, Jiaxin Huang | [Github](https://github.com/Chengsong-Huang/Self-Calibration)
      • [Fast Best-of-N Decoding via Speculative Rejection](https://arxiv.org/abs/2410.20290) (NeurIPS 2024) | Hanshi Sun, Momin Haider, Ruiqi Zhang, Huitao Yang, Jiahao Qiu, Ming Yin, Mengdi Wang, Peter Bartlett, Andrea Zanette | [Github](https://github.com/Zanette-Labs/SpeculativeRejection)
      • Sampling-Efficient Test-Time Scaling: Self-Estimating the Best-of-N Sampling in Early Decoding
      • FastMCTS: A Simple Sampling Strategy for Data Synthesis
      • [Non-myopic Generation of Language Models for Reasoning and Planning](https://arxiv.org/abs/2410.17195) (ICLR 2025) | Chang Ma, Haiteng Zhao, Junlei Zhang, Junxian He, Lingpeng Kong | [Github](https://github.com/chang-github-00/LLM-Predictive-Decoding)
      • [Language Models can Self-Improve at State-Value Estimation for Better Search](https://arxiv.org/abs/2503.02878) | Ethan Mendes, Alan Ritter | [Github](https://github.com/ethanm88/self-taught-lookahead)
      • [ϕ-Decoding: Adaptive Foresight Sampling for Balanced Inference-Time Exploration and Exploitation](https://arxiv.org/abs/2503.13288) | Fangzhi Xu, Hang Yan, Chang Ma, Haiteng Zhao, Jun Liu, Qika Lin, Zhiyong Wu | [Github](https://github.com/xufangzhi/phi-Decoding)
      • Dynamic Parallel Tree Search for Efficient LLM Reasoning
      • [Learning Adaptive Parallel Reasoning with Language Models](https://arxiv.org/abs/2504.15466) | Jiayi Pan, Xiuyu Li, Long Lian, Charlie Snell, Yifei Zhou, Adam Yala, Trevor Darrell, Kurt Keutzer, Alane Suhr | [Github](https://github.com/Parallel-Reasoning/APR)
      • [Skeleton-of-Thought: Prompting LLMs for Efficient Parallel Generation](https://arxiv.org/abs/2307.15337) (ICLR 2024) | Xuefei Ning, Zinan Lin, Zixuan Zhou, Zifu Wang, Huazhong Yang, Yu Wang | [Github](https://github.com/imagination-research/sot)
      • Adaptive Skeleton Graph Decoding
      • THOUGHTTERMINATOR: Benchmarking, Calibrating, and Mitigating Overthinking in Reasoning Models
      • DISC: Dynamic Decomposition Improves LLM Inference Scaling
      • From Chaos to Order: The Atomic Reasoner Framework for Fine-grained Reasoning in Large Language Models
      • [Can Atomic Step Decomposition Enhance the Self-structured Reasoning of Multimodal Large Models?](https://arxiv.org/abs/2503.06252) | Kun Xiang, Zhili Liu, Zihao Jiang, Yunshuang Nie, Kaixin Cai, Yiyang Yin, Runhui Huang, Haoxiang Fan, Hanhui Li, Weiran Huang, Yihan Zeng, Yu-Jie Yuan, Jianhua Han, Lanqing Hong, Hang Xu, Xiaodan Liang | [Github](https://github.com/Quinn777/AtomThink)
      • [Inference Scaling Laws: An Empirical Analysis of Compute-Optimal Inference for Problem-Solving with Language Models](https://arxiv.org/abs/2408.00724) (ICLR 2025) | Yangzhen Wu, Zhiqing Sun, Shanda Li, Sean Welleck, Yiming Yang | [Github](https://github.com/thu-wyz/inference_scaling)
      • [Optimizing Test-Time Compute via Meta Reinforcement Fine-Tuning](https://arxiv.org/abs/2503.07572) | Yuxiao Qu, Matthew Y. R. Yang, Amrith Setlur, Lewis Tunstall, Edward Emanuel Beeching, Ruslan Salakhutdinov, Aviral Kumar | [Github](https://github.com/CMU-AIRe/MRT)
      • [SpecReason: Fast and Accurate Inference-Time Compute via Speculative Reasoning](https://arxiv.org/abs/2504.07891) | Rui Pan, Yinwei Dai, Zhihao Zhang, Gabriele Oliaro, Zhihao Jia, Ravi Netravali | [Github](https://github.com/ruipeterpan/specreason)
      • [Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters](https://arxiv.org/abs/2408.03314) | Charlie Snell, Jaehoon Lee, Kelvin Xu, Aviral Kumar
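
Several sampling entries above adaptively trim the self-consistency budget. A minimal sketch of the early-stopping idea (in the spirit of Early-Stopping Self-Consistency: stop as soon as a small window of samples is unanimous); `sample_answer` stands in for one model call, and the window and budget defaults are illustrative.

```python
import random
from collections import Counter

def early_stopping_self_consistency(sample_answer, window=5, max_samples=40):
    """Sample answers in small windows; a unanimous window is taken as high
    confidence and sampling stops early, otherwise fall back to the full
    budget and return the majority vote."""
    answers = []
    while len(answers) < max_samples:
        batch = [sample_answer() for _ in range(window)]
        answers.extend(batch)
        if len(set(batch)) == 1:  # unanimous window: stop spending samples
            break
    return Counter(answers).most_common(1)[0][0], len(answers)

# Toy "model" that answers correctly 80% of the time:
toy = lambda: "42" if random.random() < 0.8 else "41"
print(early_stopping_self_consistency(toy))
```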
    • Evaluation and Benchmarks

      • THOUGHTTERMINATOR: Benchmarking, Calibrating, and Mitigating Overthinking in Reasoning Models
      • [Non-Determinism of "Deterministic" LLM Settings](https://arxiv.org/abs/2408.04667) | Berk Atil, Sarp Aykent, Alexa Chittams, Lisheng Fu, Rebecca J. Passonneau, Evan Radcliffe, Guru Rajan Rajagopal, Adam Sloan, Tomasz Tudrej, Ferhan Ture, Zhe Wu, Lixinyu Xu, Breck Baldwin | [Github](https://github.com/breckbaldwin/llm-stability)
      • The Danger of Overthinking: Examining the Reasoning-Action Dilemma in Agentic Tasks
      • Evaluating Large Language Models Trained on Code (source of the pass@k estimator sketched at the end of this subsection)
      • τ-bench: A Benchmark for Tool-Agent-User Interaction in Real-World Domains
      • [Are Your LLMs Capable of Stable Reasoning?](https://arxiv.org/abs/2412.13147) | Junnan Liu, Hongwei Liu, Linchen Xiao, Ziyi Wang, Kuikun Liu, Songyang Gao, Wenwei Zhang, Songyang Zhang, Kai Chen | [Github](https://github.com/open-compass/GPassK)
      • [LongPerceptualThoughts: Distilling System-2 Reasoning for System-1 Perception](https://arxiv.org/abs/2504.15362) | Yuan-Hong Liao, Sven Elflein, Liu He, Laura Leal-Taixé, Yejin Choi, Sanja Fidler, David Acuna
      • [Inference-Time Computations for LLM Reasoning and Planning: A Benchmark and Insights](https://arxiv.org/abs/2502.12521) | Shubham Parashar, Blake Olson, Sambhav Khurana, Eric Li, Hongyi Ling, James Caverlee, Shuiwang Ji | [Github](https://github.com/divelab/sys2bench)
      • Do NOT Think That Much for 2+3=? On the Overthinking of o1-Like LLMs
      • [Bag of Tricks for Inference-time Computation of LLM Reasoning](https://arxiv.org/abs/2502.07191) | Fan Liu, Wenshuo Chao, Naiqiang Tan, Hao Liu | [Github](https://github.com/usail-hkust/benchmark_inference_time_computation_LLM)
      • [Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling](https://arxiv.org/abs/2502.06703) | Runze Liu, Junqi Gao, Jian Zhao, Kaiyan Zhang, Xiu Li, Biqing Qi, Wanli Ouyang, Bowen Zhou | [Github](https://github.com/RyanLiu112/compute-optimal-tts)
      • DNA Bench: When Silence is Smarter -- Benchmarking Over-Reasoning in Reasoning LLMs
      • S1-Bench: A Simple Benchmark for Evaluating System 1 Thinking Capability of Large Reasoning Models
      • [VCR-Bench: A Comprehensive Evaluation Framework for Video Chain-of-Thought Reasoning](https://arxiv.org/abs/2504.07956) | Yukun Qi, Yiming Zhao, Yu Zeng, Xikun Bao, Wenxuan Huang, Lin Chen, Zehui Chen, Jie Zhao, Zhongang Qi, Feng Zhao | [Github](https://github.com/zhishuifeiqian/VCR-Bench)
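
The pass@k metric that many of these benchmarks report comes from "Evaluating Large Language Models Trained on Code", which gives an unbiased estimator: from n samples of which c pass, estimate the chance that at least one of k drawn samples passes. A direct implementation:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k: 1 - C(n - c, k) / C(n, k), the probability that a
    random size-k subset of the n samples contains at least one that passes."""
    if n - c < k:
        return 1.0  # too few failures: every size-k subset contains a pass
    return 1.0 - comb(n - c, k) / comb(n, k)

print(pass_at_k(n=200, c=30, k=10))  # ~0.81
```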
    • Background Papers

      • [Does Reinforcement Learning Really Incentivize Reasoning Capacity in LLMs Beyond the Base Model?](https://arxiv.org/abs/2504.13837) | Yang Yue, Zhiqi Chen, Rui Lu, Andrew Zhao, Zhaokai Wang, Yang Yue, Shiji Song, Gao Huang | [Github](https://github.com/LeapLabTHU/limit-of-RLVR)
      • [Chain-of-Thought Prompting Elicits Reasoning in Large Language Models](https://arxiv.org/abs/2201.11903) (NeurIPS 2022) | Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Brian Ichter, Fei Xia, Ed Chi, Quoc Le, Denny Zhou
      • [Tree of Thoughts: Deliberate Problem Solving with Large Language Models](https://arxiv.org/abs/2305.10601) (NeurIPS 2023) | Shunyu Yao, Dian Yu, Jeffrey Zhao, Izhak Shafran, Thomas L. Griffiths, Yuan Cao, Karthik Narasimhan | [Github](https://github.com/princeton-nlp/tree-of-thought-llm)
      • [Graph of Thoughts: Solving Elaborate Problems with Large Language Models](https://arxiv.org/abs/2308.09687) (AAAI 2024) | Maciej Besta, Nils Blach, Ales Kubicek, Robert Gerstenberger, Michal Podstawski, Lukas Gianinazzi, Joanna Gajda, Tomasz Lehmann, Hubert Niewiadomski, Piotr Nyczyk, Torsten Hoefler | [Github](https://github.com/spcl/graph-of-thoughts)
      • [Self-Consistency Improves Chain of Thought Reasoning in Language Models](https://arxiv.org/abs/2203.11171) (ICLR 2023) | Xuezhi Wang, Jason Wei, Dale Schuurmans, Quoc Le, Ed Chi, Sharan Narang, Aakanksha Chowdhery, Denny Zhou
      • [Chain-of-Symbol Prompting Elicits Planning in Large Language Models](https://arxiv.org/abs/2305.10276) (COLM 2024) | Hanxu Hu, Hongyuan Lu, Huajian Zhang, Yun-Ze Song, Wai Lam, Yue Zhang | [Github](https://github.com/hanxuhu/chain-of-symbol-planning)
      • Thinking Machines: A Survey of LLM based Reasoning Strategies
      • [From System 1 to System 2: A Survey of Reasoning Large Language Models](https://arxiv.org/abs/2502.17419) | Zhong-Zhi Li, Duzhen Zhang, Ming-Liang Zhang, Jiaxin Zhang, Zengyan Liu, Yuxuan Yao, Haotian Xu, Junhao Zheng, Pei-Jie Wang, Xiuyi Chen, Yingying Zhang, Fei Yin, Jiahua Dong, Zhijiang Guo, Le Song, Cheng-Lin Liu | [Github](https://github.com/zzli2022/Awesome-System2-Reasoning-LLM)
      • [Program of Thoughts Prompting: Disentangling Computation from Reasoning for Numerical Reasoning Tasks](https://arxiv.org/abs/2211.12588) (TMLR 2023) | Wenhu Chen, Xueguang Ma, Xinyi Wang, William W. Cohen | [Github](https://github.com/TIGER-AI-Lab/Program-of-Thoughts) (a toy execution sketch closes this section)
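
As a toy illustration of the Program of Thoughts entry above: the model writes a short program, and the answer comes from executing it rather than from free-text arithmetic. The "generated" snippet here is hand-written for illustration, and real systems sandbox the execution instead of calling exec directly.

```python
def run_program_of_thought(program: str):
    """Execute a model-written Python snippet and read off its `answer`
    variable, delegating computation to the interpreter."""
    scope = {}
    exec(program, scope)  # unsandboxed exec: illustration only
    return scope.get("answer")

# Stand-in for a model completion:
generated = (
    "principal = 10_000\n"
    "interest = principal * (1 + 0.05) ** 3 - principal\n"
    "answer = round(interest, 2)\n"
)
print(run_program_of_thought(generated))  # 1576.25
```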
  • Updates