Awesome-Efficient-Reasoning-Models
[TMLR 2025] Efficient Reasoning Models: A Survey
https://github.com/fscdc/Awesome-Efficient-Reasoning-Models
-
Background Papers
- Yang Yue, Zhiqi Chen, Rui Lu, Andrew Zhao, Zhaokai Wang, Yang Yue, Shiji Song, Gao Huang. [[Paper]](https://arxiv.org/abs/2504.13837)[[Github]](https://github.com/LeapLabTHU/limit-of-RLVR)
- Jason Wei, Xuezhi Wang, Dale Schuurmans, Maarten Bosma, Brian Ichter, Fei Xia, Ed Chi, Quoc Le, Denny Zhou. [[Paper]](https://arxiv.org/abs/2201.11903)
- [Tree of Thoughts: Deliberate Problem Solving with Large Language Models](https://arxiv.org/abs/2305.10601). Shunyu Yao, Dian Yu, Jeffrey Zhao, Izhak Shafran, Thomas L. Griffiths, Yuan Cao, Karthik Narasimhan. [[Paper]](https://arxiv.org/abs/2305.10601)[[Github]](https://github.com/princeton-nlp/tree-of-thought-llm)
- [Graph of Thoughts: Solving Elaborate Problems with Large Language Models](https://arxiv.org/abs/2308.09687). Maciej Besta, Nils Blach, Ales Kubicek, Robert Gerstenberger, Michal Podstawski, Lukas Gianinazzi, Joanna Gajda, Tomasz Lehmann, Hubert Niewiadomski, Piotr Nyczyk, Torsten Hoefler. [[Paper]](https://arxiv.org/abs/2308.09687)[[Github]](https://github.com/spcl/graph-of-thoughts)
- Xuezhi Wang, Jason Wei, Dale Schuurmans, Quoc Le, Ed Chi, Sharan Narang, Aakanksha Chowdhery, Denny Zhou. [[Paper]](https://arxiv.org/abs/2203.11171)
- [Program of Thoughts Prompting: Disentangling Computation from Reasoning for Numerical Reasoning Tasks](https://arxiv.org/abs/2211.12588). Wenhu Chen, Xueguang Ma, Xinyi Wang, William W. Cohen. [[Paper]](https://arxiv.org/abs/2211.12588)[[Github]](https://github.com/TIGER-AI-Lab/Program-of-Thoughts)
- [Chain-of-Symbol Prompting Elicits Planning in Large Langauge Models](https://arxiv.org/abs/2305.10276). Hanxu Hu, Hongyuan Lu, Huajian Zhang, Yun-Ze Song, Wai Lam, Yue Zhang. [[Paper]](https://arxiv.org/abs/2305.10276)[[Github]](https://github.com/hanxuhu/chain-of-symbol-planning)
- Thinking Machines: A Survey of LLM based Reasoning Strategies
- Zhong-Zhi Li, Duzhen Zhang, Ming-Liang Zhang, Jiaxin Zhang, Zengyan Liu, Yuxuan Yao, Haotian Xu, Junhao Zheng, Pei-Jie Wang, Xiuyi Chen, Yingying Zhang, Fei Yin, Jiahua Dong, Zhijiang Guo, Le Song, Cheng-Lin Liu. [[Paper]](https://arxiv.org/abs/2502.17419)[[Github]](https://github.com/zzli2022/Awesome-System2-Reasoning-LLM)
- Xiaoxi Li, Jiajie Jin, Guanting Dong, Hongjin Qian, Yutao Zhu, Yongkang Wu, Ji-Rong Wen, Zhicheng Dou. [[Paper]](https://arxiv.org/abs/2504.21776)[[Github]](https://github.com/RUC-NLPIR/WebThinker)
- Yiping Wang, Qing Yang, Zhiyuan Zeng, Liliang Ren, Lucas Liu, Baolin Peng, Hao Cheng, Xuehai He, Kuan Wang, Jianfeng Gao, Weizhu Chen, Shuohang Wang, Simon Shaolei Du, Yelong Shen. [[Paper]](https://arxiv.org/abs/2504.20571)[[Github]](https://github.com/ypwang61/One-Shot-RLVR)
- [Learning to Rank Chain-of-Thought: An Energy-Based Approach with Outcome Supervision](https://arxiv.org/abs/2505.14999). Fu Yang, Zongyu Lin, Xinfeng Li, Hao Xu, Kai-Wei Chang, Ying Nian Wu. [[Paper]](https://arxiv.org/abs/2505.14999)
- AceReason-Nemotron: Advancing Math and Code Reasoning through Reinforcement Learning
- Junxiao Yang, Jinzhe Tu, Haoran Liu, Xiaoce Wang, Chujie Zheng, Zhexin Zhang, Shiyao Cui, Caishun Chen, Tiantian He, Hongning Wang, Yew-Soon Ong, Minlie Huang. [[Paper]](https://arxiv.org/abs/2505.13529)[[Github]](https://github.com/thu-coai/BARREL)
- Kaiwen Zha, Zhengqi Gao, Maohao Shen, Zhang-Wei Hong, Duane S. Boning, Dina Katabi. [[Paper]](https://arxiv.org/abs/2505.15034)[[Github]](https://github.com/kaiwenzha/rl-tango)
- Warm Up Before You Train: Unlocking General Reasoning in Resource-Constrained Settings
- Reasoning Models Better Express Their Confidence
- Mind the Gap: Bridging Thought Leap for Improved Chain-of-Thought Tuning
- Jiaan Wang, Fandong Meng, Jie Zhou. [[Paper]](https://arxiv.org/abs/2505.12996)[[Github]](https://github.com/krystalan/DRT)
- Absolute Zero: Reinforced Self-play Reasoning with Zero Data
- Zhiyuan Hu, Yibo Wang, Hanze Dong, Yuhui Xu, Amrita Saha, Caiming Xiong, Bryan Hooi, Junnan Li. [[Paper]](https://arxiv.org/abs/2505.10554)[[Github]](https://github.com/zhiyuanhubj/Meta-Ability-Alignment)
- J1: Incentivizing Thinking in LLM-as-a-Judge via Reinforcement Learning
- INTELLECT-2: A Reasoning Model Trained Through Globally Decentralized Reinforcement Learning
- AM-Thinking-v1: Advancing the Frontier of Reasoning at 32B Scale
- Fangwei Zhu, Peiyi Wang, Zhifang Sui. [[Paper]](https://arxiv.org/abs/2505.04955)[[Github]](https://github.com/solitaryzero/CoTs_are_Variables)
- Xiaomi LLM-Core Team. [[Paper]](https://arxiv.org/abs/2505.07608)[[Github]](https://github.com/xiaomimimo/MiMo)
- Resa: Transparent Reasoning Models via SAEs
- MiniMax Team. [[Paper]](https://arxiv.org/abs/2506.13585)[[Github]](https://github.com/MiniMax-AI/MiniMax-M1)
-
Build SLM with Strong Reasoning Ability
- Yuetai Li, Xiang Yue, Zhangchen Xu, Fengqing Jiang, Luyao Niu, Bill Yuchen Lin, Bhaskar Ramasubramanian, Radha Poovendran. [[Paper]](https://arxiv.org/abs/2502.12143)[[Github]](https://github.com/Small-Model-Gap/Small-Model-Learnability-Gap)
- Deconstructing Long Chain-of-Thought: A Structured Reasoning Optimization Framework for Long CoT Distillation
- [Small Language Models Need Strong Verifiers to Self-Correct Reasoning](https://arxiv.org/abs/2404.17140). Yunxiang Zhang, Muhammad Khalifa, Lajanugen Logeswaran, Jaekyeom Kim, Moontae Lee, Honglak Lee, Lu Wang. [[Paper]](https://arxiv.org/abs/2404.17140)[[Github]](https://github.com/yunx-z/SCORE)
- Improving Mathematical Reasoning Capabilities of Small Language Models via Feedback-Driven Distillation
- Thinking Slow, Fast: Scaling Inference Compute with Distilled Reasoners
- Distilling Reasoning Ability from Large Language Models with Adaptive Thinking
- Xinghao Chen, Zhijing Sun, Wenjin Guo, Miaoran Zhang, Yanjun Chen, Yirong Sun, Hui Su, Yijie Pan, Dietrich Klakow, Wenjie Li, Xiaoyu Shen. [[Paper]](https://arxiv.org/abs/2502.18001)[[Github]](https://github.com/EIT-NLP/Distilling-CoT-Reasoning)
- Towards Reasoning Ability of Small Language Models
- Ruikang Liu, Yuxuan Sun, Manyi Zhang, Haoli Bai, Xianzhi Yu, Tiezheng Yu, Chun Yuan, Lu Hou. [[Paper]](https://arxiv.org/abs/2504.04823)[[Github]](https://github.com/ruikangliu/Quantized-Reasoning-Models)
- When Reasoning Meets Compression: Benchmarking Compressed Large Reasoning Models on Complex Reasoning Tasks
- Quy-Anh Dang, Chris Ngo. [[Paper]](https://arxiv.org/abs/2503.16219)[[Github]](https://github.com/knoveleng/open-rs)
- Weihao Zeng, Yuzhen Huang, Qian Liu, Wei Liu, Keqing He, Zejun Ma, Junxian He. [[Paper]](https://arxiv.org/abs/2503.18892)[[Github]](https://github.com/hkust-nlp/simpleRL-reason)
- Shangshang Wang, Julian Asilis, Ömer Faruk Akgül, Enes Burak Bilgin, Ollie Liu, Willie Neiswanger. [[Paper]](https://arxiv.org/abs/2504.15777)[[Github]](https://github.com/shangshang-wang/Tina)
- Llama-Nemotron: Efficient Reasoning Models
- [Phi-4-Mini-Reasoning: Exploring the Limits of Small Reasoning Language Models in Math](https://arxiv.org/abs/2504.21233). Chun Chen, Mei Gao, Young Jin Kim, Yunsheng Li, Liliang Ren, Yelong Shen, Shuohang Wang, Weijian Xu, Jianfeng Gao, Weizhu Chen. [[Paper]](https://arxiv.org/abs/2504.21233)
- Phi-4-reasoning Technical Report
- Harnessing Negative Signals: Reinforcement Distillation from Teacher Data for LLM Reasoning
- Skip-Thinking: Chunk-wise Chain-of-Thought Distillation Enable Smaller Language Models to Reason Better and Faster
- Replacing thinking with tool usage enables reasoning in small language models
- DeepScaleR - project.com/)
- [Turning Dust into Gold: Distilling Complex Reasoning Capabilities from LLMs by Leveraging Negative Data](https://arxiv.org/abs/2312.12832). Yiwei Li, Peiwen Yuan, Shaoxiong Feng, Boyuan Pan, Bin Sun, Xinglin Wang, Heda Wang, Kan Li. AAAI 2024. [[Paper]](https://arxiv.org/abs/2312.12832)[[Github]](https://github.com/Yiwei98/TDG)
- [SKIntern: Internalizing Symbolic Knowledge for Distilling Better CoT Capabilities into Small Language Models](https://arxiv.org/abs/2409.13183). Huanxuan Liao, Shizhu He, Yupu Hao, Xiang Li, Yuanzhe Zhang, Jun Zhao, Kang Liu. COLING 2025. [[Paper]](https://arxiv.org/abs/2409.13183)[[Github]](https://github.com/Xnhyacinth/SKIntern)
-
Competition
-  [AIMO-2 Winning Solution: Building State-of-the-Art Mathematical Reasoning Models with OpenMathReasoning dataset](https://arxiv.org/abs/2504.16891). Ivan Moshkov, Darragh Hanley, Ivan Sorokin, Shubham Toshniwal, Christof Henkel, Benedikt Schifferer, Wei Du, Igor Gitman. [[Paper]](https://arxiv.org/abs/2504.16891)[[Github]](https://github.com/NVIDIA/NeMo-Skills)
-
Efficient Agentic Reasoning
-
Efficient Multimodal Reasoning
- MilChat: Introducing Chain of Thought Reasoning and GRPO to a Multimodal Small Language Model for Remote Sensing
- Yibin Wang, Zhimin Li, Yuhang Zang, Chunyu Wang, Qinglin Lu, Cheng Jin, Jiaqi Wang. [[Paper]](https://arxiv.org/abs/2505.03318)[[Github]](https://github.com/CodeGoat24/UnifiedReward)
- Song Wang, Gongfan Fang, Lingdong Kong, Xiangtai Li, Jianyun Xu, Sheng Yang, Qiang Li, Jianke Zhu, Xinchao Wang. [[Paper]](https://arxiv.org/abs/2505.23727)[[Github]](https://github.com/songw-zju/PixelThink)
- Jiaqi Wang, Kevin Qinghong Lin, James Cheng, Mike Zheng Shou. [[Paper]](https://arxiv.org/abs/2505.16854)[[Github]](https://github.com/kokolerk/TON)
- Active-O3: Empowering Multimodal Large Language Models with Active Perception via GRPO
- One RL to See Them All: Visual Triple Unified Reinforcement Learning
- MINT-CoT: Enabling Interleaved Visual Tokens in Mathematical Chain-of-Thought Reasoning
- GThinker: Towards General Multimodal Reasoning via Cue-Guided Rethinking
- Fast or Slow? Integrating Fast Intuition and Deliberate Thinking for Enhancing Visual Question Answering
- Grounded Reinforcement Learning for Visual Reasoning
- [Infi-MMR: Curriculum-based Unlocking Multimodal Reasoning via Phased Reinforcement Learning in Multimodal Small Language Models](https://arxiv.org/abs/2505.23091). Chi Cheung, Shengyu Zhang, Fei Wu, Hongxia Yang. [[Paper]](https://arxiv.org/abs/2505.23091)
- Xu Chu, Xinrong Chen, Guanyu Wang, Zhijie Tan, Kui Huang, Wenyu Lv, Tong Mo, Weiping Li. [[Paper]](https://arxiv.org/abs/2505.23558)[[Github]](https://github.com/Liar406/Look_Again)
- [Argus: Vision-Centric Reasoning with Grounded Chain-of-Thought](https://arxiv.org/abs/2505.23766). An Huang, Guilin Liu, Shiwei Sheng, Shilong Liu, Liang-Yan Gui, Jan Kautz, Yu-Xiong Wang, Zhiding Yu. [[Paper]](https://arxiv.org/abs/2505.23766)
- Understand, Think, and Answer: Advancing Visual Reasoning with Large Multimodal Models
- [Point-RFT: Improving Multimodal Reasoning with Visually Grounded Reinforcement Finetuning](https://arxiv.org/abs/2505.19702). Ching Lin, Kevin Lin, Wangmeng Zuo, Lijuan Wang. [[Paper]](https://arxiv.org/abs/2505.19702)
- Ground-R1: Incentivizing Grounded Visual Reasoning via Reinforcement Learning
- SATORI-R1: Incentivizing Multimodal Reasoning with Spatial Grounding and Verifiable Rewards
- Don't Look Only Once: Towards Multimodal Interactive Reasoning with Selective Visual Revisitation
- Visual Abstract Thinking Empowers Multimodal Reasoning
- VTool-R1: VLMs Learn to Think with Images via Reinforcement Learning on Multimodal Tool Use
- DreamPRM: Domain-Reweighted Process Reward Model for Multimodal Reasoning
- FutureSightDrive: Thinking Visually with Spatio-Temporal CoT for Autonomous Driving
- Hongchen Wei, Zhenzhong Chen. [[Paper]](https://arxiv.org/abs/2505.16151)[[Github]](https://github.com/hcwei13/FRANK-ZERO-Inference)
- Huanjin Yao, Qixiang Yin, Jingyi Zhang, Min Yang, Yibo Wang, Wenhao Wu, Fei Su, Li Shen, Minghui Qiu, Dacheng Tao, Jiaxing Huang. [[Paper]](https://arxiv.org/abs/2505.16673)[[Github]](https://github.com/HJYao00/R1-ShareVL)
- Kaixuan Fan, Kaituo Feng, Haoming Lyu, Dongzhan Zhou, Xiangyu Yue. [[Paper]](https://arxiv.org/abs/2505.17018)[[Github]](https://github.com/kxfan2002/SophiaVL-R1)
- VLM-R3: Region Recognition, Reasoning, and Refinement for Enhanced Multimodal Chain-of-Thought
- Siqu Ou, Hongcheng Liu, Pingjie Wang, Yusheng Liao, Chuan Xuan, Yanfeng Wang, Yu Wang. [[Paper]](https://arxiv.org/abs/2505.16579)[[Github]](https://github.com/Cratileo/D2R)
- [GRIT: Teaching MLLMs to Think with Images](https://arxiv.org/abs/2505.15879). Chen Kuo, Yuting Zheng, Sravana Jyothi Narayanaraju, Xinze Guan, Xin Eric Wang. [[Paper]](https://arxiv.org/abs/2505.15879)
- UniVG-R1: Reasoning Guided Universal Visual Grounding with Reinforcement Learning
- Visual Thoughts: A Unified Perspective of Understanding Multimodal Chain-of-Thought
- Jiaer Xia, Yuhang Zang, Peng Gao, Yixuan Li, Kaiyang Zhou. [[Paper]](https://arxiv.org/abs/2505.14677)[[Github]](https://github.com/maifoundations/Visionary-R1)
- Yuqi Liu, Tianyuan Qu, Zhisheng Zhong, Bohao Peng, Shu Liu, Bei Yu, Jiaya Jia. [[Paper]](https://arxiv.org/abs/2505.12081)[[Github]](https://github.com/dvlab-research/VisionReasoner)
- Lingxiao Du, Fanqing Meng, Zongkai Liu, Zhixiang Zhou, Ping Luo, Qiaosheng Zhang, Wenqi Shao. [[Paper]](https://arxiv.org/abs/2505.13427)[[Github]](https://github.com/ModalMinds/MM-PRM)
- CoT-Vid: Dynamic Chain-of-Thought Routing with Self Verification for Training-Free Video Reasoning
- Visual Planning: Let's Think Only with Images
- Qianchu Liu, Sheng Zhang, Guanghui Qin, Timothy Ossowski, Yu Gu, Ying Jin, Sid Kiblawi, Sam Preston, Mu Wei, Paul Vozila, Tristan Naumann, Hoifung Poon. [[Paper]](https://arxiv.org/abs/2505.03981)[[Github]](https://github.com/microsoft/x-reasoner)
- Skywork-VL Reward: An Effective Reward Model for Multimodal Understanding and Reasoning
- Ke Wang, Junting Pan, Linda Wei, Aojun Zhou, Weikang Shi, Zimu Lu, Han Xiao, Yunqiao Yang, Houxing Ren, Mingjie Zhan, Hongsheng Li. [[Paper]](https://arxiv.org/abs/2505.10557)
- [Bring Reason to Vision: Understanding Perception and Reasoning through Model Merging](https://arxiv.org/abs/2505.05464). Shiqi Chen, Jinghan Zhang, Tongyao Zhu, Wei Liu, Siyang Gao, Miao Xiong, Manling Li, Junxian He. ICML 2025. [[Paper]](https://arxiv.org/abs/2505.05464)[[Github]](https://github.com/shiqichen17/VLM_Merging)
- Seed1.5-VL Technical Report
- Shenshen Li, Kaiyuan Deng, Lei Wang, Hao Yang, Chong Peng, Peng Yan, Fumin Shen, Heng Tao Shen, Xing Xu. [[Paper]](https://arxiv.org/abs/2506.04755)[[Github]](https://github.com/Leo-ssl/RAP)
- Junfei Wu, Jian Guan, Kaituo Feng, Qiang Liu, Shu Wu, Liang Wang, Wei Wu, Tieniu Tan. [[Paper]](https://arxiv.org/abs/2506.09965)[[Github]](https://github.com/AntResearchNLP/ViLaSR)
- Multi-Step Visual Reasoning with Visual Tokens Scaling and Verification
- VGR: Visual Grounded Reasoning
- Uni-cot: Towards Unified Chain-of-Thought Reasoning Across Text and Vision
- Yang Chen, Yufan Shen, Wenxuan Huang, Sheng Zhou, Qunshu Lin, Xinyu Cai, Zhi Yu, Jiajun Bu, Botian Shi, Yu Qiao. [[Paper]](https://arxiv.org/abs/2507.20766)[[Github]](https://github.com/L-O-I/RRVF)
- Qi Yang, Bolin Ni, Shiming Xiang, Han Hu, Houwen Peng, Jie Jiang. [[Paper]](https://arxiv.org/abs/2508.21113)[[Github]](https://github.com/yannqi/R-4B)
- Think Smart, Not Hard: Difficulty Adaptive Reasoning for Large Audio Language Models
- Sicheng Feng, Kaiwen Tuo, Song Wang, Lingdong Kong, Jianke Zhu, Huan Wang. [[Paper]](https://arxiv.org/abs/2510.02240)[[Github]](https://github.com/fscdc/RewardMap)
- Chris, Yichen Wei, Yi Peng, Xiaokun Wang, Weijie Qiu, Wei Shen, Tianyidan Xie, Jiangbo Pei, Jianhao Zhang, Yunzhuo Hao, Xuchen Song, Yang Liu, Yahui Zhou. [[Paper]](https://arxiv.org/abs/2504.16656)[[Github]](https://github.com/SkyworkAI/Skywork-R1V)
- Ling Yang, Ye Tian, Bowen Li, Xinchen Zhang, Ke Shen, Yunhai Tong, Mengdi Wang. [[Paper]](https://arxiv.org/abs/2505.15809)[[Github]](https://github.com/Gen-Verse/MMaDA)
-
Evaluation and Benchmarks
- Xinyin Ma, Guangnian Wan, Runpeng Yu, Gongfan Fang, Xinchao Wang. [[Paper]](https://arxiv.org/abs/2502.09601)[[Github]](https://github.com/horseee/CoT-Valve)
- THOUGHTTERMINATOR: Benchmarking, Calibrating, and Mitigating Overthinking in Reasoning Models
- Do NOT Think That Much for 2+3=? On the Overthinking of o1-Like LLMs
- Berk Atil, Sarp Aykent, Alexa Chittams, Lisheng Fu, Rebecca J. Passonneau, Evan Radcliffe, Guru Rajan Rajagopal, Adam Sloan, Tomasz Tudrej, Ferhan Ture, Zhe Wu, Lixinyu Xu, Breck Baldwin. [[Paper]](https://arxiv.org/abs/2408.04667)[[Github]](https://github.com/breckbaldwin/llm-stability)
- The Danger of Overthinking: Examining the Reasoning-Action Dilemma in Agentic Tasks
- Evaluating Large Language Models Trained on Code
- τ-bench: A Benchmark for Tool-Agent-User Interaction in Real-World Domains
- Junnan Liu, Hongwei Liu, Linchen Xiao, Ziyi Wang, Kuikun Liu, Songyang Gao, Wenwei Zhang, Songyang Zhang, Kai Chen. [[Paper]](https://arxiv.org/abs/2412.13147)[[Github]](https://github.com/open-compass/GPassK)
- [LongPerceptualThoughts: Distilling System-2 Reasoning for System-1 Perception](https://arxiv.org/abs/2504.15362). Hong Liao, Sven Elflein, Liu He, Laura Leal-Taixé, Yejin Choi, Sanja Fidler, David Acuna. [[Paper]](https://arxiv.org/abs/2504.15362)
- Shubham Parashar, Blake Olson, Sambhav Khurana, Eric Li, Hongyi Ling, James Caverlee, Shuiwang Ji. [[Paper]](https://arxiv.org/abs/2502.12521)[[Github]](https://github.com/divelab/sys2bench)
- Fan Liu, Wenshuo Chao, Naiqiang Tan, Hao Liu. [[Paper]](https://arxiv.org/abs/2502.07191)[[Github]](https://github.com/usail-hkust/benchmark_inference_time_computation_LLM)
- Runze Liu, Junqi Gao, Jian Zhao, Kaiyan Zhang, Xiu Li, Biqing Qi, Wanli Ouyang, Bowen Zhou. [[Paper]](https://arxiv.org/abs/2502.06703)[[Github]](https://github.com/RyanLiu112/compute-optimal-tts)
- DNA Bench: When Silence is Smarter -- Benchmarking Over-Reasoning in Reasoning LLMs
- S1-Bench: A Simple Benchmark for Evaluating System 1 Thinking Capability of Large Reasoning Models
- Yukun Qi, Yiming Zhao, Yu Zeng, Xikun Bao, Wenxuan Huang, Lin Chen, Zehui Chen, Jie Zhao, Zhongang Qi, Feng Zhao. [[Paper]](https://arxiv.org/abs/2504.07956)[[Github]](https://github.com/zhishuifeiqian/VCR-Bench)
- Zhikai Wang, Jiashuo Sun, Wenqi Zhang, Zhiqiang Hu, Xin Li, Fan Wang, Deli Zhao. [[Paper]](https://arxiv.org/abs/2504.18589)[[Github]](https://github.com/alibaba-damo-academy/VCBench)
- Yu Li, Qizhi Pei, Mengyuan Sun, Honglin Lin, Chenlin Ming, Xin Gao, Jiang Wu, Conghui He, Lijun Wu. [[Paper]](https://arxiv.org/abs/2504.19093)[[Github]](https://github.com/Goodman-liyu/CipherBank)
- Weiye Xu, Jiahao Wang, Weiyun Wang, Zhe Chen, Wengang Zhou, Aijun Yang, Lewei Lu, Houqiang Li, Xiaohua Wang, Xizhou Zhu, Wenhai Wang, Jifeng Dai, Jinguo Zhu. [[Paper]](https://arxiv.org/abs/2504.15279)[[Github]](https://github.com/VisuLogic-Benchmark/VisuLogic-Eval)
- Sicheng Feng, Song Wang, Shuyi Ouyang, Lingdong Kong, Zikai Song, Jianke Zhu, Huan Wang, Xinchao Wang. [[Paper]](https://arxiv.org/abs/2505.18675)[[Github]](https://github.com/fscdc/ReasonMap)
- ViC-Bench: Benchmarking Visual-Interleaved Chain-of-Thought Capability in MLLMs with Free-Style Intermediate State Representations
- Liyan Tang, Grace Kim, Xinyu Zhao, Thom Lake, Wenxuan Ding, Fangcong Yin, Prasann Singhal, Manya Wadhwa, Zeyu Leo Liu, Zayne Sprague, Ramya Namuduri, Bodun Hu, Juan Diego Rodriguez, Puyuan Peng, Greg Durrett. [[Paper]](https://arxiv.org/abs/2505.13444)[[Github]](https://github.com/Liyan06/ChartMuseum)
- [Reasoning with OmniThought: A Large CoT Dataset with Verbosity and Cognitive Difficulty Annotations](https://arxiv.org/abs/2505.10937). [[Paper]](https://arxiv.org/abs/2505.10937)
- StoryReasoning Dataset: Using Chain-of-Thought for Scene Understanding and Grounded Story Generation
-  <br> Charlie Snell, Jaehoon Lee, Kelvin Xu, Aviral Kumar |<img width="1002" alt="image" src="figures/tts_effective.png"> |[Paper](https://arxiv.org/abs/2408.03314)| [//]: #04/08
- Think Deep, Think Fast: Investigating Efficiency of Verifier-free Inference-time-scaling Methods - Falcon, Ben Athiwaratkun, Qingyang Wu, Jue Wang, Shuaiwen Leon Song, Ce Zhang, Bhuwan Dhingra, James Zou |<img width="1002" alt="image" src="https://arxiv.org/html/2504.14047v1/x1.png"> |[Paper](https://arxiv.org/abs/2504.14047)| [//]: #04/23
-  <br> Ding Chen, Qingchen Yu, Pengyuan Wang, Wentao Zhang, Bo Tang, Feiyu Xiong, Xinchi Li, Minchuan Yang, Zhiyu Li |<img width="1002" alt="image" src="https://arxiv.org/html/2504.10481v1/x1.png"> |[Github](https://github.com/IAAR-Shanghai/xVerify) <br> [Paper](https://arxiv.org/abs/2504.10481)| [//]: #04/17
-  <br> Pranjal Aggarwal, Aman Madaan, Yiming Yang, Mausam |<img width="1002" alt="image" src="figures/asc.png"> |[Github](https://github.com/Pranjal2041/AdaptiveConsistency) <br> [Paper](https://arxiv.org/abs/2305.11860)| [//]: #04/08
- [Escape Sky-high Cost: Early-stopping Self-Consistency for Multi-step Reasoning](https://arxiv.org/abs/2401.10480) (ICLR 2024) <br> Yiwei Li, Peiwen Yuan, Shaoxiong Feng, Boyuan Pan, Xinglin Wang, Bin Sun, Heda Wang, Kan Li |<img width="1002" alt="image" src="https://arxiv.org/html/2401.10480v1/x1.png"> |[Github](https://github.com/Yiwei98/ESC) <br> [Paper](https://arxiv.org/abs/2401.10480)| [//]: #04/08
- [Make Every Penny Count: Difficulty-Adaptive Self-Consistency for Cost-Efficient Reasoning](https://arxiv.org/abs/2408.13457) (NAACL Findings 2025) <br> Xinglin Wang, Shaoxiong Feng, Yiwei Li, Peiwen Yuan, Yueqi Zhang, Chuyi Tan, Boyuan Pan, Yao Hu, Kan Li |<img width="1002" alt="image" src="https://arxiv.org/html/2408.13457v3/x3.png"> |[Github](https://github.com/WangXinglin/DSC) <br> [Paper](https://arxiv.org/abs/2408.13457)| [//]: #04/08
- Path-Consistency: Prefix Enhancement for Efficient Inference in LLM
- Bridging Internal Probability and Self-Consistency for Effective and Efficient LLM Reasoning - Zhe Guo, Xiaoxing Ma, Yu-Feng Li |<img width="1002" alt="image" src="https://arxiv.org/html/2502.00511v2/x3.png"> |[Paper](https://arxiv.org/abs/2502.00511)| [//]: #04/08
- Confidence Improves Self-Consistency in LLMs
-  <br> Chengsong Huang, Langlin Huang, Jixuan Leng, Jiacheng Liu, Jiaxin Huang |<img width="1002" alt="image" src="https://arxiv.org/html/2503.00031v1/x2.png"> |[Github](https://github.com/Chengsong-Huang/Self-Calibration) <br> [Paper](https://arxiv.org/abs/2503.00031)| [//]: #04/08
- [Fast Best-of-N Decoding via Speculative Rejection](https://arxiv.org/abs/2410.20290) <br> Hanshi Sun, Momin Haider, Ruiqi Zhang, Huitao Yang, Jiahao Qiu, Ming Yin, Mengdi Wang, Peter Bartlett, Andrea Zanette |<img width="1002" alt="image" src="https://arxiv.org/html/2410.20290v2/x1.png"> |[Github](https://github.com/Zanette-Labs/SpeculativeRejection) <br> [Paper](https://arxiv.org/abs/2410.20290)| [//]: #04/08
- Sampling-Efficient Test-Time Scaling: Self-Estimating the Best-of-N Sampling in Early Decoding
- FastMCTS: A Simple Sampling Strategy for Data Synthesis
- [Non-myopic Generation of Language Models for Reasoning and Planning](https://arxiv.org/abs/2410.17195) <br> Chang Ma, Haiteng Zhao, Junlei Zhang, Junxian He, Lingpeng Kong |<img width="1002" alt="image" src="figures/predictive_decoding.png"> |[Github](https://github.com/chang-github-00/LLM-Predictive-Decoding) <br> [Paper](https://arxiv.org/abs/2410.17195)| [//]: #04/08
-  <br> Ethan Mendes, Alan Ritter |<img width="1002" alt="image" src="https://arxiv.org/html/2503.02878v1/x1.png"> |[Github](https://github.com/ethanm88/self-taught-lookahead) <br> [Paper](https://arxiv.org/abs/2503.02878)| [//]: #04/08
-  <br> Fangzhi Xu, Hang Yan, Chang Ma, Haiteng Zhao, Jun Liu, Qika Lin, Zhiyong Wu |<img width="1002" alt="image" src="https://arxiv.org/html/2503.13288v1/x2.png"> |[Github](https://github.com/xufangzhi/phi-Decoding) <br> [Paper](https://arxiv.org/abs/2503.13288)| [//]: #04/08
- Dynamic Parallel Tree Search for Efficient LLM Reasoning
-  <br> Jiayi Pan, Xiuyu Li, Long Lian, Charlie Snell, Yifei Zhou, Adam Yala, Trevor Darrell, Kurt Keutzer, Alane Suhr |<img width="1002" alt="image" src="https://arxiv.org/html/2504.15466v1/x2.png"> |[Github](https://github.com/Parallel-Reasoning/APR) <br> [Paper](https://arxiv.org/abs/2504.15466)| [//]: #04/23
- [Skeleton-of-Thought: Prompting LLMs for Efficient Parallel Generation](https://arxiv.org/abs/2307.15337) <br> Xuefei Ning, Zinan Lin, Zixuan Zhou, Zifu Wang, Huazhong Yang, Yu Wang |<img width="1002" alt="image" src="figures/skeleton_ot.png"> |[Github](https://github.com/imagination-research/sot) <br> [Paper](https://arxiv.org/abs/2307.15337)| [//]: #04/08
- Adaptive Skeleton Graph Decoding
-  <br> Baohao Liao, Yuhui Xu, Hanze Dong, Junnan Li, Christof Monz, Silvio Savarese, Doyen Sahoo, Caiming Xiong |<img width="1002" alt="image" src="figures/rsd.png"> |[Github](https://github.com/BaohaoLiao/RSD) <br> [Paper](https://arxiv.org/abs/2501.19324)| [//]: #04/08
- Meta-Reasoner: Dynamic Guidance for Optimized Inference-time Reasoning in Large Language Models
-  <br> Fengwei Teng, Zhaoyang Yu, Quan Shi, Jiayi Zhang, Chenglin Wu, Yuyu Luo |<img width="1002" alt="image" src="figures/aot.png"> |[Github](https://github.com/qixucen/atom) <br> [Paper](https://arxiv.org/abs/2502.12018)| [//]: #04/08
- DISC: Dynamic Decomposition Improves LLM Inference Scaling
- From Chaos to Order: The Atomic Reasoner Framework for Fine-grained Reasoning in Large Language Models
-  <br> Kun Xiang, Zhili Liu, Zihao Jiang, Yunshuang Nie, Kaixin Cai, Yiyang Yin, Runhui Huang, Haoxiang Fan, Hanhui Li, Weiran Huang, Yihan Zeng, Yu-Jie Yuan, Jianhua Han, Lanqing Hong, Hang Xu, Xiaodan Liang |<img width="1002" alt="image" src="figures/atom.png"> |[Github](https://github.com/Quinn777/AtomThink) <br> [Paper](https://arxiv.org/abs/2503.06252)| [//]: #04/08
- [Inference Scaling Laws: An Empirical Analysis of Compute-Optimal Inference for Problem-Solving with Language Models](https://arxiv.org/abs/2408.00724) <br> Yangzhen Wu, Zhiqing Sun, Shanda Li, Sean Welleck, Yiming Yang |<img width="1002" alt="image" src="figures/scaling_law.png"> |[Github](https://github.com/thu-wyz/inference_scaling) <br> [Paper](https://arxiv.org/abs/2408.00724)| [//]: #04/08
-  <br> Yuxiao Qu, Matthew Y. R. Yang, Amrith Setlur, Lewis Tunstall, Edward Emanuel Beeching, Ruslan Salakhutdinov, Aviral Kumar |<img width="1002" alt="image" src="figures/mrt.png"> |[Github](https://github.com/CMU-AIRe/MRT) <br> [Paper](https://arxiv.org/abs/2503.07572)| [//]: #04/08
-  <br> Rui Pan, Yinwei Dai, Zhihao Zhang, Gabriele Oliaro, Zhihao Jia, Ravi Netravali |<img width="1002" alt="image" src="figures/specreason.png"> |[Github](https://github.com/ruipeterpan/specreason) <br> [Paper](https://arxiv.org/abs/2504.07891)| [//]: #04/14
- Trace-of-Thought: Enhanced Arithmetic Problem Solving via Reasoning Distillation From Large to Small Language Models
-  <br> Jikai Wang, Juntao Li, Lijun Wu, Min Zhang |<img width="1002" alt="image" src="https://arxiv.org/html/2504.19095v1/extracted/6392438/images/scot.png"> |[Github](https://github.com/Jikai0Wang/Speculative_CoT) <br> [Paper](https://arxiv.org/abs/2504.19095)| [//]: #04/29
- Dynamic Early Exit in Reasoning Models
- Reward Reasoning Model
- Control-R: Towards controllable test-time scaling
- Plan and Budget: Effective and Efficient Test-Time Scaling on Large Language Model Reasoning
- First Finish Search: Efficient Test-Time Scaling in Large Language Models
- LIMOPro: Reasoning Refinement for Efficient and Effective Test-time Scaling