https://github.com/masamasa59/ai-agent-papers
A collection of AI Agents papers (Updated biweekly)
agents llm paper-list planning reasoning survey
Last synced: 3 months ago
- Host: GitHub
- URL: https://github.com/masamasa59/ai-agent-papers
- Owner: masamasa59
- Created: 2024-11-11T01:48:48.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2026-02-24T01:42:37.000Z (3 months ago)
- Last Synced: 2026-02-24T08:15:38.827Z (3 months ago)
- Topics: agents, llm, paper-list, planning, reasoning, survey
- Homepage:
- Size: 11.9 MB
- Stars: 1,086
- Watchers: 44
- Forks: 80
- Open Issues: 1
Metadata Files:
- Readme: README.md
- Changelog: newsletters/feb_2026/ideation_trends.md
Awesome Lists containing this project
- awesome-agentic-machine-learning - ai-agent-papers - updated collection of AI agent research papers. (Related Resources / Foundation Models for ML)
- Awesome-Prompt-Engineering - masamasa59/ai-agent-papers
- awesome-autoresearch - masamasa59/ai-agent-papers - AI agent research papers updated biweekly via automated arXiv search with curated selection. (Related resources)
README
# AI Agents Papers
This repository curates recent research papers on the applications and architectures of AI agents. We run weekly arXiv searches with specific keywords and select only the papers we find particularly interesting. Rather than aiming for comprehensiveness, we add a paper when it introduces a distinctly new approach or concept that stands out from existing methods.
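The keyword search described above could look roughly like the following sketch against the public arXiv API. The repository does not publish its search script, so the keyword list, the function names (`build_query_url`, `parse_feed`), and the choice of title/abstract fields are all assumptions made for illustration.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

ARXIV_API = "http://export.arxiv.org/api/query"
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}


def build_query_url(keywords, max_results=50):
    """Build an arXiv API URL matching any keyword in title or abstract, newest first."""
    # Quote each keyword so multi-word phrases are matched as phrases.
    clauses = [f'ti:"{kw}" OR abs:"{kw}"' for kw in keywords]
    params = {
        "search_query": " OR ".join(clauses),
        "sortBy": "submittedDate",
        "sortOrder": "descending",
        "start": 0,
        "max_results": max_results,
    }
    return f"{ARXIV_API}?{urlencode(params)}"


def parse_feed(atom_xml):
    """Extract (title, abstract-page URL) pairs from an arXiv Atom response."""
    root = ET.fromstring(atom_xml)
    papers = []
    for entry in root.findall("atom:entry", ATOM_NS):
        raw_title = entry.findtext("atom:title", default="", namespaces=ATOM_NS)
        title = " ".join(raw_title.split())  # collapse line breaks inside titles
        link = entry.findtext("atom:id", default="", namespaces=ATOM_NS).strip()
        papers.append((title, link))
    return papers


# Hypothetical keywords; the list actually used by the maintainers is not stated.
url = build_query_url(["LLM agent", "self-evolving agent"], max_results=20)
```

Fetching `url` (for example with `urllib.request.urlopen`) and passing the response body to `parse_feed` yields candidate papers. The manual selection step, which is what distinguishes this list, is not something the sketch attempts to automate.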
## AI Agent
An AI Agent is an autonomous system powered by large language models that can perceive its environment, reason through complex tasks, and use tools to take actions in pursuit of specific goals. It combines reasoning, planning, memory, and tool-use capabilities to operate independently or as part of a multi-agent system.
AI Agent Workflows
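As a rough illustration of the definition above, the following sketch wires reasoning, tool use, and memory into a single loop. It is a toy under stated assumptions: `Agent`, `toy_policy`, and `calculator` are hypothetical names, and the hand-written policy stands in for the LLM call a real agent would make.

```python
import operator
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Agent:
    tools: Dict[str, Callable[[str], str]]
    # policy(goal, memory) -> (action name, argument); stands in for an LLM.
    policy: Callable[[str, List[str]], Tuple[str, str]]
    memory: List[str] = field(default_factory=list)

    def run(self, goal: str, max_steps: int = 5) -> str:
        for _ in range(max_steps):
            action, argument = self.policy(goal, self.memory)  # reason and plan
            if action == "finish":
                return argument
            observation = self.tools[action](argument)  # act through a tool
            self.memory.append(f"{action}({argument}) -> {observation}")  # remember
        return "stopped: step limit reached"


def calculator(expression: str) -> str:
    """Tiny tool: evaluates 'a + b' or 'a * b' for integers."""
    a, op, b = expression.split()
    ops = {"+": operator.add, "*": operator.mul}
    return str(ops[op](int(a), int(b)))


def toy_policy(goal: str, memory: List[str]) -> Tuple[str, str]:
    """Hand-written stand-in for LLM reasoning: call the tool once, then finish."""
    if not memory:
        return "calculator", goal
    return "finish", memory[-1].split(" -> ")[-1]


agent = Agent(tools={"calculator": calculator}, policy=toy_policy)
answer = agent.run("2 + 3")
```

Replacing `toy_policy` with a model call and adding more entries to `tools` gives the single-agent pattern; several such agents exchanging messages would correspond to the multi-agent case.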
## Paper Categories
🔥: Recommended papers
📖: Survey papers
⚖️: Benchmark papers
- **Agent Capabilities**
  - [Environment](capability-papers/environment.md)
  - [Ideation](capability-papers/ideation.md)
  - [Planning](capability-papers/planning.md)
  - [Reasoning](capability-papers/reasoning.md)
  - [Profile](capability-papers/profile.md)
  - [Perception](capability-papers/perception.md)
  - [Tool Use & Skills](capability-papers/tool-use.md)
  - [Self-Correction](capability-papers/self-correction.md)
  - [Search](capability-papers/search.md)
  - [Memory](capability-papers/memory.md)
  - [Self-Evolution](capability-papers/self-evolution.md/#self-evolution-self-improvement)
  - [Safety](capability-papers/safety.md)
  - [Agent Tuning](capability-papers/learning.md)
  - [Agent Evaluation](capability-papers/evaluation.md)
- **AI Agents Architecture**
  - [Single-Agent](agent-frameworks/agent-framework.md#single-agents)
  - [Multi-Agent](agent-frameworks/agent-framework.md#multi-agents)
  - [Agent-Ops](agent-frameworks/agent-framework.md#agent-ops--ux)
- **AI Agents Applications**
  - [Embodied Agents](application-papers/embodied-agents.md)
  - [Digital Agents](application-papers/digital-agents.md)
    - [GUI Agents](application-papers/digital-agents.md/#computer-controlled-app-based-agents)
    - [Web Agents](application-papers/digital-agents.md/#web-based-agents)
    - [Mobile Agents](application-papers/digital-agents.md/#mobile-based-agents)
  - [Software Agents](application-papers/software-agents.md)
  - [Data Agents](application-papers/data-agents.md)
  - [Research Agents](application-papers/research-agents.md)
  - [API Agents](application-papers/api-agents.md)
  - [Deep Research Agents](application-papers/deep-research-agents.md)
  - [Agentic AI Systems](application-papers/agentic-ai-system.md)
  - [Enterprise Agents](application-papers/enterprise-agents.md)
  - [Financial Agents](application-papers/finance-agents.md)
  - [Multi-Agents](application-papers/multi-agent.md)
    - [MAD](application-papers/multi-agent.md#mad)
    - [Problem Solving](application-papers/multi-agent.md#problem-solving)
    - [World Simulation](application-papers/multi-agent.md#world-simulation)
- **GenAI Agents Presentations**
  - [Tutorial & Lecture](lectures/tutorial-lecture.md)
## References
- [LLM Agents Papers](https://github.com/zjunlp/LLMAgentPapers)
- [Awesome LLM-Powered Agent](https://github.com/hyp1231/awesome-llm-powered-agent/)
- [Awesome LLM agents](https://github.com/kaushikb11/awesome-llm-agents)
## Feb/24 Highlights
### Self-Evolving Agents
* **"Self-Consolidation for Self-Evolving Agents"** [[paper](https://arxiv.org/abs/2602.01966)]
* **"Live-Evo: Online Evolution of Agentic Memory from Continuous Feedback"** [[paper](https://arxiv.org/abs/2602.02369)]
* **"MemSkill: Learning and Evolving Memory Skills for Self-Evolving Agents"** [[paper](https://arxiv.org/abs/2602.02474)]
* ⚖️ **"AGENTRX: Diagnosing AI Agent Failures from Execution Trajectories"** [[paper](https://arxiv.org/abs/2602.02475)]
* **"Empirical-MCTS: Continuous Agent Evolution via Dual-Experience Monte Carlo Tree Search"** [[paper](https://arxiv.org/abs/2602.04248)]
* **"AdaptEvolve: Improving Efficiency of Evolutionary AI Agents through Adaptive Model Selection"** [[paper](https://arxiv.org/abs/2602.11931)]
* **"AORCHESTRA: Automating Sub-Agent Creation for Agentic Orchestration"** [[paper](https://arxiv.org/abs/2602.03786)]
### Scientific Discovery
* ⚖️ **"FIRE-Bench: Evaluating Agents on the Rediscovery of Scientific Insights"** [[paper](https://arxiv.org/abs/2602.02905)]
* **"DeltaEvolve: Accelerating Scientific Discovery through Momentum-Driven Evolution"** [[paper](https://arxiv.org/abs/2602.02919)]
* **"Accelerating Scientific Research with Gemini: Case Studies and Common Techniques"** [[paper](https://arxiv.org/abs/2602.03837)]
* 📖 **"Towards a Science of Collective AI: LLM-based Multi-Agent Systems Need a Transition from Blind Trial-and-Error to Rigorous Science"** [[paper](https://arxiv.org/abs/2602.05289)]
* ⚖️ **"AIRS-Bench: a Suite of Tasks for Frontier AI Research Science Agents"** [[paper](https://arxiv.org/abs/2602.06855)]
* **"IV Co-Scientist: Multi-Agent LLM Framework for Causal Instrumental Variable Discovery"** [[paper](https://arxiv.org/abs/2602.07943)]
## Jan/30 Highlights
- [Self-Evolution Trends Report (JA)](news-letters/jan_2026/self_evolution_trends.md)
- [Memory Trends Report (JA)](news-letters/jan_2026/memory_trends.md)
### Agentic Reasoning
* 📖 **"Agentic Reasoning for Large Language Models"** [[paper](https://arxiv.org/abs/2601.12538v1)]
* 📖 **"Toward Efficient Agents: Memory, Tool learning, and Planning"** [[paper](https://arxiv.org/abs/2601.14192v1)]
### Self-Evolving Agents
* **"JENIUS AGENT: Towards Experience-Driven Accuracy Optimization in Real-World Scenarios"** [[paper](https://arxiv.org/abs/2601.01857)]
* **"EvoRoute: Experience-Driven Self-Routing LLM Agent Systems"** [[paper](https://arxiv.org/abs/2601.02695)]
* **"MEMRL: Self-Evolving Agents via Runtime Reinforcement Learning on Episodic Memory"** [[paper](https://arxiv.org/abs/2601.03192)]
* **"PACEvolve: Enabling Long-Horizon Progress-Aware Consistent Evolution"** [[paper](https://arxiv.org/abs/2601.10657v1)]
* **"Beyond Static Tools: Test-Time Tool Evolution for Scientific Reasoning"** [[paper](https://arxiv.org/abs/2601.07641v1)]
* **"WISE-Flow: Workflow-Induced Structured Experience for Self-Evolving Conversational Service Agents"** [[paper](https://arxiv.org/abs/2601.08158v1)]
* **"To Retrieve or To Think? An Agentic Approach for Context Evolution"** [[paper](https://arxiv.org/abs/2601.08747v2)]
* **"Controlled Self-Evolution for Algorithmic Code Optimization"** [[paper](https://arxiv.org/abs/2601.07348v4)]
* **"Learn Like Humans: Use Meta-cognitive Reflection for Efficient Self-Improvement"** [[paper](https://arxiv.org/abs/2601.11974v1)]
* 📖 **"From Storage to Experience: A Survey on the Evolution of LLM Agent Memory Mechanisms"** [[paper](https://www.preprints.org/manuscript/202601.0618)]
* **"Inference-Time Scaling of Verification: Self-Evolving Deep Research Agents via Test-Time Rubric-Guided Verification"** [[paper](https://arxiv.org/abs/2601.15808)]
* **"Optimizing Agentic Workflows using Meta-tools"** [[paper](https://arxiv.org/abs/2601.22037v1)]
* **"Yunjue Agent Tech Report: A Fully Reproducible, Zero-Start In-Situ Self-Evolving Agent System for Open-Ended Tasks"** [[paper](https://arxiv.org/abs/2601.18226)]
* **"Large Language Model Agents Are Not Always Faithful Self-Evolvers"** [[paper](https://arxiv.org/abs/2601.22436)]
### Memory
* **"Agentic Memory: Learning Unified Long-Term and Short-Term Memory Management for Large Language Model Agents"** [[paper](https://arxiv.org/abs/2601.01885)]
* **"SimpleMem: Efficient Lifelong Memory for LLM Agents"** [[paper](https://arxiv.org/abs/2601.02553)]
* **"MEMRL: Self-Evolving Agents via Runtime Reinforcement Learning on Episodic Memory"** [[paper](https://arxiv.org/abs/2601.03192)]
* **"Memory Matters More: Event-Centric Memory as a Logic Map for Agent Searching and Reasoning"** [[paper](https://www.arxiv.org/abs/2601.04726)]
* **"Controllable Memory Usage: Balancing Anchoring and Innovation in Long-Term Human-Agent Interaction"** [[paper](https://arxiv.org/abs/2601.05107)]
* **"Inside Out: Evolving User-Centric Core Memory Trees for Long-Term Personalized Dialogue Systems"** [[paper](https://arxiv.org/abs/2601.05171)]
* **"MineNPC-Task: Task Suite for Memory-Aware Minecraft Agents"** [[paper](https://arxiv.org/abs/2601.05215)]
* **"PACEvolve: Enabling Long-Horizon Progress-Aware Consistent Evolution"** [[paper](https://arxiv.org/abs/2601.10657v1)]
* **"The AI Hippocampus: How Far are We From Human Memory?"** [[paper](https://arxiv.org/abs/2601.09113v1)]
* **"MemoBrain: Executive Memory as an Agentic Brain for Reasoning"** [[paper](https://arxiv.org/abs/2601.08079v1)]
* **"AtomMem : Learnable Dynamic Agentic Memory with Atomic Memory Operation"** [[paper](https://arxiv.org/abs/2601.08323v1)]
* **"Fine-Mem: Fine-Grained Feedback Alignment for Long-Horizon Memory Management"** [[paper](https://arxiv.org/abs/2601.08435v1)]
* **"Structured Episodic Event Memory"** [[paper](https://arxiv.org/abs/2601.06411v1)]
* **"Active Context Compression: Autonomous Memory Management in LLM Agents"** [[paper](https://arxiv.org/abs/2601.07190v1)]
* 📖 **"From Storage to Experience: A Survey on the Evolution of LLM Agent Memory Mechanisms"** [[paper](https://www.preprints.org/manuscript/202601.0618)]
* **"AutoRefine: From Trajectories to Reusable Expertise for Continual LLM Agent Refinement"** [[paper](https://arxiv.org/abs/2601.22758)]
### Creative Task
* **"Progressive Ideation using an Agentic AI Framework for Human-AI Co-Creation"** [[paper](https://arxiv.org/abs/2601.00475)]
* **"OpenNovelty: An LLM-powered Agentic System for Verifiable Scholarly Novelty Assessment"** [[paper](https://www.arxiv.org/abs/2601.01576)]
* **"Sci-Reasoning: A Dataset Decoding AI Innovation Patterns"** [[paper](https://arxiv.org/abs/2601.04577v1)]
* **"SuS: Strategy-aware Surprise for Intrinsic Exploration"** [[paper](https://arxiv.org/abs/2601.10349v1)]
* **"Proof of Time: A Benchmark for Evaluating Scientific Idea Judgments"** [[paper](https://arxiv.org/abs/2601.07606v1)]
* **"LLM Review: Enhancing Creative Writing via Blind Peer Review Feedback"** [[paper](https://arxiv.org/abs/2601.08003v1)]
* **"Agentic AI and Machine Learning for Accelerated Materials Discovery and Applications"** [[paper](https://arxiv.org/abs/2601.09027)]
* **"Who Owns Creativity and Who Does the Work? Trade-offs in LLM-Supported Research Ideation"** [[paper](https://arxiv.org/abs/2601.12152v1)]
* **"Improved Bug Localization with AI Agents Leveraging Hypothesis and Dynamic Cognition"** [[paper](https://arxiv.org/abs/2601.12522v1)]
* **"Rethinking the AI Scientist: Interactive Multi-Agent Workflows for Scientific Discovery"** [[paper](https://www.arxiv.org/abs/2601.12542)]
* **"Learning to Discover at Test Time"** [[paper](https://arxiv.org/abs/2601.16175)]
* **"Insight Agents: An LLM-Based Multi-Agent System for Data Insights"** [[paper](https://arxiv.org/abs/2601.20048)]
* **"Probing the Future of Meta-Analysis: Eliciting Design Principles via an Agentic Research IDE"** [[paper](https://arxiv.org/abs/2601.18239)]
* **"Generating Literature-Driven Scientific Theories at Scale"** [[paper](https://arxiv.org/abs/2601.16282)]
### Coding Agents
* **"Improved Bug Localization with AI Agents Leveraging Hypothesis and Dynamic Cognition"** [[paper](https://arxiv.org/abs/2601.12522v1)]
* **"LLM-in-Sandbox Elicits General Agentic Intelligence"** [[paper](https://arxiv.org/abs/2601.16206v1)]
* **"SERA: Soft-Verified Efficient Repository Agents"** [[paper](https://arxiv.org/abs/2601.20789v1)]
* **"Who Writes the Docs in SE 3.0? Agent vs. Human Documentation Pull Requests"** [[paper](https://arxiv.org/abs/2601.20171)]
* **"How do Agents Refactor: An Empirical Study"** [[paper](https://arxiv.org/abs/2601.20160)]
* **"Beyond Bug Fixes: An Empirical Investigation of Post-Merge Code Quality Issues in Agent-Generated Pull Requests"** [[paper](https://arxiv.org/abs/2601.20109)]
* **"Are We All Using Agents the Same Way? An Empirical Study of Core and Peripheral Developers' Use of Coding Agents"** [[paper](https://arxiv.org/abs/2601.20106)]
## Dec/25 Highlights (Updated 30 Dec)
### Self-Evolving Agents
* **"Strategic Self-Improvement for Competitive Agents in AI Labour Markets"** [[paper](https://arxiv.org/abs/2512.04988v1)]
* **"Guided Self-Evolving LLMs with Minimal Human Supervision"** [[paper](https://arxiv.org/abs/2512.02472v1)]
* **"Evolving Excellence: Automated Optimization of LLM-based Agents"** [[paper](https://arxiv.org/abs/2512.09108v1)]
* **"Remember Me, Refine Me: A Dynamic Procedural Memory Framework for Experience-Driven Agent Evolution"** [[paper](https://arxiv.org/abs/2512.10696)]
* **"Beyond Training: Enabling Self-Evolution of Agents with MOBIMEM"** [[paper](https://arxiv.org/abs/2512.15784v1)]
* **"SCOPE: Prompt Evolution for Enhancing Agent Effectiveness"** [[paper](https://arxiv.org/abs/2512.15374v1)]
* **"Reinforcement Learning for Self-Improving Agent with Skill Library"** [[paper](https://arxiv.org/abs/2512.17102v1)]
* **"MemEvolve: Meta-Evolution of Agent Memory Systems"** [[paper](https://arxiv.org/abs/2512.18746v1)]
### Hot Topics
* 📖 **"Memory in the Age of AI Agents: A Survey Forms, Functions and Dynamics"** [[paper](https://arxiv.org/abs/2512.13564v1)]
* 📖 **"Adaptation of Agentic AI"** [[paper](https://arxiv.org/abs/2512.16301v1)]
* 📖 **"Deep Research: A Systematic Survey"** [[paper](https://arxiv.org/abs/2512.02038v1)]
* 🔥 **"Measuring Agents in Production"** [[paper](https://arxiv.org/abs/2512.04123v1)]
* 🔥 **"Towards a Science of Scaling Agent Systems"** [[paper](https://arxiv.org/abs/2512.08296v1)]
* ⚖️ **"Evaluating Large Language Models in Scientific Discovery"** [[paper](https://arxiv.org/abs/2512.15567v1)]
* 🔥 **"How Far Are We from Genuinely Useful Deep Research Agents?"** [[paper](https://arxiv.org/abs/2512.01948v1)]
* **"Can Agentic AI Match the Performance of Human Data Scientists?"** [[paper](https://arxiv.org/abs/2512.20959v1)]
## 2025 Highlights
04/25 ~ 12/25 [[link](lectures/2025_trend.md)]