{"id":27123698,"url":"https://github.com/XiaoYee/Awesome_Efficient_LRM_Reasoning","last_synced_at":"2025-04-07T13:01:51.678Z","repository":{"id":284505808,"uuid":"954640261","full_name":"XiaoYee/Awesome_Efficient_LRM_Reasoning","owner":"XiaoYee","description":"😎 A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, and Beyond (From Shanghai AI Lab)","archived":false,"fork":false,"pushed_at":"2025-04-06T06:22:06.000Z","size":251,"stargazers_count":135,"open_issues_count":1,"forks_count":2,"subscribers_count":4,"default_branch":"main","last_synced_at":"2025-04-06T06:27:41.218Z","etag":null,"topics":["budget-aware","chain-of-thought","cot","efficient","efficient-reasoning","long-cot","lrm","o1","o3","r1","reasoning","slow-fast"],"latest_commit_sha":null,"homepage":"https://arxiv.org/abs/2503.21614","language":null,"has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/XiaoYee.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-03-25T11:49:36.000Z","updated_at":"2025-04-06T06:23:41.000Z","dependencies_parsed_at":null,"dependency_job_id":"6f24a377-5668-4cc1-963d-904d8df3b956","html_url":"https://github.com/XiaoYee/Awesome_Efficient_LRM_Reasoning","commit_stats":null,"previous_names":["xiaoyee/awesome_efficient_lrm_reasoning"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/XiaoYee%2FAwesome_Efficient_LRM_Reasoning","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/XiaoYee%2FAwesome_Efficient_LRM_Reasoning/
tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/XiaoYee%2FAwesome_Efficient_LRM_Reasoning/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/XiaoYee%2FAwesome_Efficient_LRM_Reasoning/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/XiaoYee","download_url":"https://codeload.github.com/XiaoYee/Awesome_Efficient_LRM_Reasoning/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247657273,"owners_count":20974344,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["budget-aware","chain-of-thought","cot","efficient","efficient-reasoning","long-cot","lrm","o1","o3","r1","reasoning","slow-fast"],"created_at":"2025-04-07T13:01:46.199Z","updated_at":"2025-04-07T13:01:51.649Z","avatar_url":"https://github.com/XiaoYee.png","language":null,"funding_links":[],"categories":["Related Awesome Lists","A01_文本生成_文本对话","Topics","Related Survey","Other Lists","Resources"],"sub_categories":["大语言对话模型及数据","LLM Reasoning","Efficient Reasoning","TeX Lists","Applications"],"readme":"# A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, and Beyond\n[![Paper](https://img.shields.io/badge/paper-A42C25?style=for-the-badge\u0026logo=arxiv\u0026logoColor=white)](https://arxiv.org/pdf/2503.21614)  
[![Github](https://img.shields.io/badge/Github-000000?style=for-the-badge\u0026logo=github\u0026logoColor=000\u0026logoColor=white)](https://github.com/XiaoYee/Awesome_Efficient_LRM_Reasoning)\n[![Twitter](https://img.shields.io/badge/Twitter-%23000000.svg?style=for-the-badge\u0026logo=twitter\u0026logoColor=white)](https://x.com/suzhaochen0110/status/1905461785693749709?s=46)\n\n[![Awesome](https://awesome.re/badge.svg)](https://github.com/XiaoYee/Awesome_Efficient_LRM_Reasoning) \n[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](https://opensource.org/licenses/MIT)\n![](https://img.shields.io/github/last-commit/XiaoYee/Awesome_Efficient_LRM_Reasoning?color=green) \n\n---\n\n## 🔔 News\n- [2025-04] We added more hybrid models (e.g., Mamba-Transformer) to Efficient Reasoning during Pre-training. These models are more efficient at inference. \n- [2025-04] We added a new \"Model Merge\" category to Efficient Reasoning during Inference. It looks like a promising direction. \n- [2025-03] We released our survey \"[A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, and Beyond](https://arxiv.org/abs/2503.21614)\". 
This is the **first survey** for efficient reasoning of **Large Reasoning Models**, covering both language and multimodality.\n- [2025-03] We created this repository to maintain a paper list on Awesome-Efficient-LRM-Reasoning.\n\n---\n\n![Author](figs/author.png)\n\n![Taxonomy](figs/figure2.png)\n\n\u003e If you find our survey useful for your research, please consider citing:\n\n```\n@article{qu2025survey,\n  title={A Survey of Efficient Reasoning for Large Reasoning Models: Language, Multimodality, and Beyond},\n  author={Qu, Xiaoye and Li, Yafu and Su, Zhaochen and Sun, Weigao and Yan, Jianhao and Liu, Dongrui and Cui, Ganqu and Liu, Daizong and Liang, Shuxian and He, Junxian and others},\n  journal={arXiv preprint arXiv:2503.21614},\n  year={2025}\n}\n\n```\n\n\n---\n\n![Category](figs/category.png)\n\n\n## 🔥 Table of Contents\n\n- [Awesome-Efficient-LRM-Reasoning](#awesome-efficient-lrm-reasoning)\n  - [👀 Introduction](#-introduction)\n  - [💭 Efficient Reasoning during Inference](#-efficient-reasoning-during-inference)\n    - [Length Budgeting](#length-budgeting)\n    - [System Switch](#system-switch)\n    - [Model Switch](#model-switch)\n    - [Model Merge](#model-merge)\n    - [Parallel Search](#parallel-search)\n  - [💫 Efficient Reasoning with SFT](#-efficient-reasoning-with-sft)\n    - [Reasoning Chain Compression](#reasoning-chain-compression)\n    - [Latent-Space SFT](#latent-space-sft)\n  - [🧩 Efficient Reasoning with Reinforcement Learning](#-efficient-reasoning-with-reinforcement-learning)\n    - [Efficient Reinforcement Learning with Length Reward](#efficient-reinforcement-learning-with-length-reward)\n    - [Efficient Reinforcement Learning without Length Reward](#efficient-reinforcement-learning-without-length-reward)\n  - [💬 Efficient Reasoning during Pre-training](#-efficient-reasoning-during-pre-training)\n    - [Pretraining with Latent Space](#pretraining-with-latent-space)\n    - [Subquadratic Attention](#subquadratic-attention)\n    - 
[Linearization](#linearization)\n    - [Efficient Reasoning with Subquadratic Attention](#efficient-reasoning-with-subquadratic-attention)\n  - [🔖 Future Directions](#-future-directions)\n    - [Efficient Multimodal Reasoning and Video Reasoning](#efficient-multimodal-reasoning-and-video-reasoning)\n    - [Efficient Test-time Scaling and Infinity Thinking](#efficient-test-time-scaling-and-infinity-thinking)\n    - [Efficient and Trustworthy Reasoning](#efficient-and-trustworthy-reasoning)\n    - [Building Efficient Reasoning Applications](#building-efficient-reasoning-applications)\n    - [Evaluation and Benchmark](#evaluation-and-benchmark)\n---\n\n## 📜 Content\n\n\n### 👀 Introduction\n\nIn the age of LRMs, we propose that \"**Efficiency is the essence of intelligence.**\"\nJust as a wise human knows when to stop thinking and start deciding, a wise model should know when to halt unnecessary deliberation. \nAn intelligent model should manage its token economy: allocating tokens purposefully, skipping redundancy, and optimizing the path to a solution. 
Rather than naively traversing every possible reasoning path, it should emulate a master strategist, balancing cost and performance with elegant precision.\n\nTo summarize, this survey makes the following key contributions to the literature:\n- Instead of offering a general overview of LRMs, we focus on the emerging and critical topic of **efficient reasoning** in LRMs, providing an in-depth and targeted analysis.\n- We identify and characterize common patterns of reasoning inefficiency, and outline the current challenges that are unique to improving reasoning efficiency in large models.\n- We provide a comprehensive review of recent advancements aimed at enhancing reasoning efficiency, structured across the end-to-end LRM development pipeline, from pretraining and supervised fine-tuning to reinforcement learning and inference.\n\n\n---\n\n## 🚀 Papers\n\n\n### 💭 Efficient Reasoning during Inference\n\n#### Length Budgeting\n\n- [How Well do LLMs Compress Their Own Chain-of-Thought? A Token Complexity Approach](https://arxiv.org/abs/2503.01141) ![](https://img.shields.io/badge/abs-2025.03-red)\n- [Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching](https://arxiv.org/abs/2503.05179) ![](https://img.shields.io/badge/abs-2025.03-red)\n- [Chain of Draft: Thinking Faster by Writing Less](https://arxiv.org/abs/2502.18600) ![](https://img.shields.io/badge/abs-2025.02-red)\n- [SafeChain: Safety of Language Models with Long Chain-of-Thought Reasoning Capabilities](https://arxiv.org/abs/2502.12025) ![](https://img.shields.io/badge/abs-2025.02-red)\n- [s1: Simple test-time scaling](https://arxiv.org/abs/2501.19393) ![](https://img.shields.io/badge/abs-2025.01-red)\n- [Token-budget-aware llm reasoning](https://arxiv.org/abs/2412.18547) ![](https://img.shields.io/badge/abs-2024.12-red)\n- [Efficiently Serving LLM Reasoning Programs with Certaindex](https://arxiv.org/abs/2412.20993) ![](https://img.shields.io/badge/abs-2024.12-red)\n- [Make 
every penny count: Difficulty-adaptive self-consistency for cost-efficient reasoning](https://arxiv.org/abs/2408.13457) ![](https://img.shields.io/badge/abs-2024.08-red)\n- [Scaling llm test-time compute optimally can be more effective than scaling model parameters](https://arxiv.org/abs/2408.03314) ![](https://img.shields.io/badge/abs-2024.08-red)\n- [Concise thoughts: Impact of output length on llm reasoning and cost](https://arxiv.org/abs/2407.19825) ![](https://img.shields.io/badge/abs-2024.07-red)\n- [The impact of reasoning step length on large language models](https://arxiv.org/abs/2401.04925v3) ![](https://img.shields.io/badge/abs-2024.01-red)\n- [The benefits of a concise chain of thought on problem-solving in large language models](https://arxiv.org/abs/2401.05618) ![](https://img.shields.io/badge/abs-2024.01-red)\n- [Guiding language model reasoning with planning tokens](https://arxiv.org/abs/2310.05707) ![](https://img.shields.io/badge/abs-2023.10-red)\n\n#### System Switch\n\n- [Think More, Hallucinate Less: Mitigating Hallucinations via Dual Process of Fast and Slow Thinking](https://arxiv.org/abs/2501.01306) ![](https://img.shields.io/badge/abs-2025.01-red)\n- [Dualformer: Controllable Fast and Slow Thinking by Learning with Randomized Reasoning Traces](https://arxiv.org/abs/2410.09918) ![](https://img.shields.io/badge/abs-2024.10-red)\n- [Visual Agents as Fast and Slow Thinkers](https://arxiv.org/abs/2408.08862) ![](https://img.shields.io/badge/abs-2024.08-red)\n- [System-1.x: Learning to Balance Fast and Slow Planning with Language Models](https://arxiv.org/abs/2407.14414) ![](https://img.shields.io/badge/abs-2024.07-red)\n- [DynaThink: Fast or slow? 
A dynamic decision-making framework for large language models](https://arxiv.org/abs/2407.01009) ![](https://img.shields.io/badge/abs-2024.07-red)\n\n#### Model Switch\n\n- [MixLLM: Dynamic Routing in Mixed Large Language Models](https://arxiv.org/abs/2502.18482) ![](https://img.shields.io/badge/abs-2025.02-red)\n- [Closer Look at Efficient Inference Methods: A Survey of Speculative Decoding](https://arxiv.org/abs/2411.13157) ![](https://img.shields.io/badge/abs-2024.11-red)\n- [EAGLE-2: Faster Inference of Language Models with Dynamic Draft Trees](https://arxiv.org/abs/2406.16858) ![](https://img.shields.io/badge/abs-2024.06-red)\n- [RouteLLM: Learning to Route LLMs with Preference Data](https://arxiv.org/abs/2406.18665) ![](https://img.shields.io/badge/abs-2024.06-red)\n- [LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding](https://arxiv.org/abs/2404.16710) ![](https://img.shields.io/badge/abs-2024.04-red)\n- [EAGLE: Speculative Sampling Requires Rethinking Feature Uncertainty](https://arxiv.org/abs/2401.15077) ![](https://img.shields.io/badge/abs-2024.01-red)\n- [Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads](https://arxiv.org/abs/2401.10774) ![](https://img.shields.io/badge/abs-2024.01-red)\n- [Routing to the Expert: Efficient Reward-guided Ensemble of Large Language Models](https://arxiv.org/abs/2311.08692) ![](https://img.shields.io/badge/abs-2023.11-red)\n- [Speculative Decoding with Big Little Decoder](https://arxiv.org/abs/2302.07863) ![](https://img.shields.io/badge/abs-2023.02-red)\n\n#### Model Merge\n\n- [Unlocking efficient long-to-short llm reasoning with model merging](https://arxiv.org/abs/2503.20641) ![](https://img.shields.io/badge/abs-2025.03-red)\n\n#### Parallel Search\n\n- [Sampling-Efficient Test-Time Scaling: Self-Estimating the Best-of-N Sampling in Early Decoding](https://arxiv.org/abs/2503.01422) ![](https://img.shields.io/badge/abs-2025.03-red)\n- [Efficient Test-Time Scaling via 
Self-Calibration](https://arxiv.org/abs/2503.00031) ![](https://img.shields.io/badge/abs-2025.03-red)\n- [Meta-Reasoner: Dynamic Guidance for Optimized Inference-time Reasoning in Large Language Models](https://arxiv.org/abs/2502.19918) ![](https://img.shields.io/badge/abs-2025.02-red)\n- [Test-Time Preference Optimization: On-the-Fly Alignment via Iterative Textual Feedback](https://arxiv.org/abs/2501.12895) ![](https://img.shields.io/badge/abs-2025.01-red)\n- [Fast Best-of-N Decoding via Speculative Rejection](https://arxiv.org/abs/2410.20290) ![](https://img.shields.io/badge/abs-2024.10-red)\n- [TreeBoN: Enhancing Inference-Time Alignment with Speculative Tree-Search and Best-of-N Sampling](https://arxiv.org/abs/2410.16033) ![](https://img.shields.io/badge/abs-2024.10-red)\n- [Scaling llm test-time compute optimally can be more effective than scaling model parameters](https://arxiv.org/abs/2408.03314) ![](https://img.shields.io/badge/abs-2024.08-red)\n\n### 💫 Efficient Reasoning with SFT\n\n#### Reasoning Chain Compression\n\n- [Self-Training Elicits Concise Reasoning in Large Language Models](https://arxiv.org/abs/2502.20122) ![](https://img.shields.io/badge/abs-2025.02-red)\n- [TokenSkip: Controllable Chain-of-Thought Compression in LLMs](https://arxiv.org/abs/2502.12067) ![](https://img.shields.io/badge/abs-2025.02-red)\n- [Stepwise Perplexity-Guided Refinement for Efficient Chain-of-Thought Reasoning in Large Language Models](https://arxiv.org/abs/2502.13260) ![](https://img.shields.io/badge/abs-2025.02-red)\n- [C3oT: Generating Shorter Chain-of-Thought without Compromising Effectiveness](https://arxiv.org/abs/2412.11664) ![](https://img.shields.io/badge/abs-2024.12-red)\n- [Can Language Models Learn to Skip Steps?](https://arxiv.org/abs/2411.01855) ![](https://img.shields.io/badge/abs-2024.11-red)\n- [Distilling System 2 into System 1](https://arxiv.org/abs/2407.06023) ![](https://img.shields.io/badge/abs-2024.07-red)\n\n\n#### Latent-Space SFT\n- [From 
Explicit CoT to Implicit CoT: Learning to Internalize CoT Step by Step](https://arxiv.org/abs/2405.14838) ![](https://img.shields.io/badge/abs-2024.05-red)\n- [CODI: Compressing Chain-of-Thought into Continuous Space via Self-Distillation](https://arxiv.org/abs/2502.21074) ![](https://img.shields.io/badge/abs-2025.02-red)\n- [Token Assorted: Mixing Latent and Text Tokens for Improved Language Model Reasoning](https://arxiv.org/abs/2502.03275) ![](https://img.shields.io/badge/abs-2025.02-red)\n- [SoftCoT: Soft Chain-of-Thought for Efficient Reasoning with LLMs](https://arxiv.org/abs/2502.12134) ![](https://img.shields.io/badge/abs-2025.02-red)\n- [LightThinker: Thinking Step-by-Step Compression](https://arxiv.org/abs/2502.15589) ![](https://img.shields.io/badge/abs-2025.02-red)\n- [Efficient Reasoning with Hidden Thinking](https://arxiv.org/abs/2501.19201) ![](https://img.shields.io/badge/abs-2025.01-red)\n- [Training Large Language Models to Reason in a Continuous Latent Space](https://arxiv.org/abs/2412.06769) ![](https://img.shields.io/badge/abs-2024.12-red)\n- [Compressed Chain of Thought: Efficient Reasoning Through Dense Representations](https://arxiv.org/abs/2412.13171) ![](https://img.shields.io/badge/abs-2024.12-red)\n\n  \n### 🧩 Efficient Reasoning with Reinforcement Learning\n\n#### Efficient Reinforcement Learning with Length Reward\n\n- [HAWKEYE: Efficient Reasoning with Model Collaboration](https://arxiv.org/pdf/2504.00424v1) ![](https://img.shields.io/badge/abs-2025.04-red)\n- [ThinkPrune: Pruning Long Chain-of-Thought of LLMs via Reinforcement Learning](https://arxiv.org/abs/2504.01296) ![](https://img.shields.io/badge/abs-2025.04-red)\n- [Think When You Need: Self-Adaptive Chain-of-Thought Learning](https://arxiv.org/abs/2504.03234) ![](https://img.shields.io/badge/abs-2025.04-red)\n- [DAST: Difficulty-Adaptive Slow-Thinking for Large Reasoning Models](https://arxiv.org/abs/2503.04472) ![](https://img.shields.io/badge/abs-2025.03-red)\n- [L1: 
Controlling How Long A Reasoning Model Thinks With Reinforcement Learning](https://www.arxiv.org/abs/2503.04697) ![](https://img.shields.io/badge/abs-2025.03-red)\n- [Demystifying Long Chain-of-Thought Reasoning in LLMs](https://arxiv.org/abs/2502.03373) ![](https://img.shields.io/badge/abs-2025.02-red)\n- [Training Language Models to Reason Efficiently](https://arxiv.org/abs/2502.04463) ![](https://img.shields.io/badge/abs-2025.02-red)\n- [O1-Pruner: Length-Harmonizing Fine-Tuning for O1-Like Reasoning Pruning](https://arxiv.org/abs/2501.12570) ![](https://img.shields.io/badge/abs-2025.01-red)\n- [Kimi k1.5: Scaling Reinforcement Learning with LLMs](https://arxiv.org/abs/2501.12599) ![](https://img.shields.io/badge/abs-2025.01-red)\n  \n#### Efficient Reinforcement Learning without Length Reward\n- [Optimizing Test-Time Compute via Meta Reinforcement Fine-Tuning](https://arxiv.org/abs/2503.07572) ![](https://img.shields.io/badge/abs-2025.03-red)\n- [Think Smarter not Harder: Adaptive Reasoning with Inference Aware Optimization](https://arxiv.org/abs/2501.17974) ![](https://img.shields.io/badge/abs-2025.01-red)\n- [Do NOT Think That Much for 2+3=? 
On the Overthinking of o1-Like LLMs](https://arxiv.org/abs/2412.21187) ![](https://img.shields.io/badge/abs-2024.12-red)\n\n### 💬 Efficient Reasoning during Pre-training\n\n#### Pretraining with Latent Space\n\n- [LLM Pretraining with Continuous Concepts](https://arxiv.org/abs/2502.08524) ![](https://img.shields.io/badge/abs-2025.02-red)\n- [Scalable Language Models with Posterior Inference of Latent Thought Vectors](https://arxiv.org/abs/2502.01567) ![](https://img.shields.io/badge/abs-2025.02-red)\n- [Byte latent transformer: Patches scale better than tokens](https://arxiv.org/abs/2412.09871) ![](https://img.shields.io/badge/abs-2024.12-red)\n- [Large Concept Models: Language Modeling in a Sentence Representation Space](https://arxiv.org/abs/2412.08821) ![](https://img.shields.io/badge/abs-2024.12-red)\n\n#### Subquadratic Attention\n\n- [RWKV-7 \"Goose\" with Expressive Dynamic State Evolution](https://arxiv.org/abs/2503.14456) ![](https://img.shields.io/badge/abs-2025.03-red)\n- [LASP-2: Rethinking Sequence Parallelism for Linear Attention and Its Hybrid](https://arxiv.org/abs/2502.07563) ![](https://img.shields.io/badge/abs-2025.02-red)\n- [Native sparse attention: Hardware-aligned and natively trainable sparse attention](https://arxiv.org/abs/2502.11089) ![](https://img.shields.io/badge/abs-2025.02-red)\n- [MoBA: Mixture of Block Attention for Long-Context LLMs](https://arxiv.org/abs/2502.13189) ![](https://img.shields.io/badge/abs-2025.02-red)\n- [MoM: Linear Sequence Modeling with Mixture-of-Memories](https://www.arxiv.org/abs/2502.13685) ![](https://img.shields.io/badge/abs-2025.02-red)\n- [Gated Delta Networks: Improving Mamba2 with Delta Rule](https://arxiv.org/abs/2412.06464) ![](https://img.shields.io/badge/abs-2024.12-red)\n- [Transformers are SSMs: Generalized models and efficient algorithms through structured state space duality](https://arxiv.org/abs/2405.21060) ![](https://img.shields.io/badge/abs-2024.05-red)\n- [Various Lengths, Constant Speed: 
Efficient Language Modeling with Lightning Attention](https://arxiv.org/abs/2405.17381) ![](https://img.shields.io/badge/abs-2024.05-red)\n- [Gated linear attention transformers with hardware-efficient training](https://arxiv.org/abs/2312.06635) ![](https://img.shields.io/badge/abs-2023.12-red)\n\n#### Linearization\n\n- [Liger: Linearizing Large Language Models to Gated Recurrent Structures](https://arxiv.org/abs/2503.01496) ![](https://img.shields.io/badge/abs-2025.03-red)\n- [Llamba: Scaling Distilled Recurrent Models for Efficient Language Processing](https://arxiv.org/abs/2502.14458) ![](https://img.shields.io/badge/abs-2025.02-red)\n- [LoLCATs: On Low-Rank Linearizing of Large Language Models](https://arxiv.org/abs/2410.10254) ![](https://img.shields.io/badge/abs-2024.10-red)\n- [The Mamba in the Llama: Distilling and Accelerating Hybrid Models](https://arxiv.org/abs/2408.15237) ![](https://img.shields.io/badge/abs-2024.08-red)\n\n#### Efficient Reasoning with Subquadratic Attention\n\n- [Nemotron-H: A Family of Accurate and Efficient Hybrid Mamba-Transformer Models](https://arxiv.org/abs/2504.03624v1) ![](https://img.shields.io/badge/abs-2025.04-red)\n- [Compositional Reasoning with Transformers, RNNs, and Chain of Thought](https://arxiv.org/abs/2503.01544) ![](https://img.shields.io/badge/abs-2025.03-red)\n- [Cosmos-Reason1: From Physical Common Sense To Embodied Reasoning](https://arxiv.org/abs/2503.15558v1) ![](https://img.shields.io/badge/abs-2025.03-red)\n- [Thinking Slow, Fast: Scaling Inference Compute with Distilled Reasoners](https://arxiv.org/abs/2502.20339) ![](https://img.shields.io/badge/abs-2025.02-red)\n\n\n### 🔖 Future Directions\n\n#### Efficient Multimodal Reasoning and Video Reasoning\n\n- [Can Atomic Step Decomposition Enhance the Self-structured Reasoning of Multimodal Large Models?](https://arxiv.org/abs/2503.06252) ![](https://img.shields.io/badge/abs-2025.03-red)\n- [Skywork R1V: Pioneering Multimodal Reasoning with 
Chain-of-Thought](https://huggingface.co/Skywork/Skywork-R1V-38B)\n  \n#### Efficient Test-time Scaling and Infinity Thinking\n\n- [Efficient Test-Time Scaling via Self-Calibration](https://arxiv.org/abs/2503.00031) ![](https://img.shields.io/badge/abs-2025.03-red)\n- [Dynamic self-consistency: Leveraging reasoning paths for efficient llm sampling](https://arxiv.org/abs/2408.17017) ![](https://img.shields.io/badge/abs-2024.08-red)\n\n#### Efficient and Trustworthy Reasoning\n\n- [X-Boundary: Establishing Exact Safety Boundary to Shield LLMs from Multi-Turn Jailbreaks without Compromising Usability](https://arxiv.org/abs/2502.09990) ![](https://img.shields.io/badge/abs-2025.02-red)\n- [Deliberative alignment: Reasoning enables safer language models](https://arxiv.org/abs/2412.16339) ![](https://img.shields.io/badge/abs-2024.12-red)\n\n#### Building Efficient Reasoning Applications\n\n- [The Danger of Overthinking: Examining the Reasoning-Action Dilemma in Agentic Tasks](https://arxiv.org/abs/2502.08235) ![](https://img.shields.io/badge/abs-2025.02-red)\n- [Chain-of-Retrieval Augmented Generation](https://arxiv.org/abs/2501.14342) ![](https://img.shields.io/badge/abs-2025.01-red)\n\n#### Evaluation and Benchmark\n\n- [DNA Bench: When Silence is Smarter -- Benchmarking Over-Reasoning in Reasoning LLMs](https://arxiv.org/abs/2503.15793) ![](https://img.shields.io/badge/abs-2025.03-red)\n- [Do NOT Think That Much for 2+3=? 
On the Overthinking of o1-Like LLMs](https://arxiv.org/abs/2412.21187) ![](https://img.shields.io/badge/abs-2024.12-red)\n\n\n---\n\n\n\n## Resources\n\n**Reading lists related to Efficient Reasoning**\n\n- [hemingkx/Awesome-Efficient-Reasoning](https://github.com/hemingkx/Awesome-Efficient-Reasoning)\n- [Eclipsess/Awesome-Efficient-Reasoning-LLMs](https://github.com/Eclipsess/Awesome-Efficient-Reasoning-LLMs)\n- [Hongcheng-Gao/Awesome-Long2short-on-LRMs](https://github.com/Hongcheng-Gao/Awesome-Long2short-on-LRMs)\n- [DevoAllen/Awesome-Reasoning-Economy-Papers](https://github.com/DevoAllen/Awesome-Reasoning-Economy-Papers)\n- [Blueyee/Efficient-CoT-LRMs](https://github.com/Blueyee/Efficient-CoT-LRMs)\n- [EIT-NLP/Awesome-Latent-CoT](https://github.com/EIT-NLP/Awesome-Latent-CoT)\n- [yzhangchuck/awesome-llm-reasoning-long2short-papers](https://github.com/yzhangchuck/awesome-llm-reasoning-long2short-papers)\n\n\n## 🎉 Contribution\n\n### Contributing to this paper list\n\n⭐ **Join us in improving this repository!** If you know of any important work we've missed, please contribute. Your efforts are highly valued!\n\n### Contributors\n\n\u003ca href=\"https://github.com/XiaoYee/Awesome_Efficient_LRM_Reasoning/graphs/contributors\"\u003e\n  \u003cimg src=\"https://contrib.rocks/image?repo=XiaoYee/Awesome_Efficient_LRM_Reasoning\" /\u003e\n\u003c/a\u003e\n\n---\n\n\u003c!-- ## ⭐️ Star History\n\n[![Star History Chart](https://api.star-history.com/svg?repos=XiaoYee/Awesome_Efficient_LRM_Reasoning\u0026type=Date)](https://star-history.com/#XiaoYee/Awesome_Efficient_LRM_Reasoning\u0026Date) --\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FXiaoYee%2FAwesome_Efficient_LRM_Reasoning","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FXiaoYee%2FAwesome_Efficient_LRM_Reasoning","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FXiaoYee%2FAwesome_Efficient_LRM_Reasoning/lists"}