awesome-rl-for-agents

A curated list of resources on reinforcement learning (RL) for agents.
https://github.com/necolizer/awesome-rl-for-agents

🕹 Benchmarks
- Others
  - [Preprint'25(1) - AI/Marco-Search-Agent)
  - [Platform - ai/computer-agent-arena)
  - [Blog - d6f7-442e-9508-59515c65e35d/browsecomp.pdf) [[Code]](https://github.com/openai/simple-evals)
  - [Preprint'25
  - [Preprint'25 - plus)
  - [Preprint'25
  - [Preprint'25 - ZH)
  - [Paper - Pro-GUI-Grounding)
  - [NeurIPS'24 - ai/OSWorld)
  - [ACL'24
  - [Preprint'25 - BrowseComp)
- Reinforcement Learning Scaling
  - [Paper - Pro-GUI-Grounding)
  - [NeurIPS'24 - ai/OSWorld)
  - [ACL'24
🧪 Demos & Projects
- RL-based LLM agent tuning
  - [Code
  - [Blog - AI/SkyRL)
  - [Code
  - [Code
  - [Code
  - [Code
  - [Code
  - [Code
  - [Code
- MCP Agents
  - [Code
  - [Code
  - [Code
- RL-based LLM tuning
  - [Preprint'25 - Reasoner-Zero/Open-Reasoner-Zero)
  - [Code
  - [Preprint'25 - Reasoner-Zero/Open-Reasoner-Zero)
  - [Code
📚 Papers & Research
- Survey & Review
  - [Preprint'25 - Search-Agent-Papers)
  - [Preprint'25 - agents-2030/awesome-deep-research-agent)
  - [Preprint'25 - AgenticLLM-RL-Papers)
- RL for Research Agents
  - [Preprint'25
  - [Preprint'25
  - [Preprint'25 - Reasoning)
  - [Preprint'25 - Lab/multimodal-search-r1)
  - [Blog
  - [Preprint'25 - Search)
  - [Preprint'25 - Searcher-plus)
  - [Preprint'25 - Searcher)
  - [Blog
  - [Preprint'25 - Pro)
  - [Preprint'25
  - [Preprint'25 - nlp/ZeroSearch)
  - [Preprint'25 - NLP/DeepResearcher)
  - [Preprint'25 - RL/ReCall)
  - [Preprint'25 - r1)
  - [Preprint'25 - NLP/WebAgent)
  - [Preprint'25 - NLP/WebAgent)
  - [Preprint'25
  - [Blog - NLP/DeepResearch)
- RL for Computer-using Agents
  - [Preprint'25 - R1)
  - [Preprint'25 - R1)
  - [Preprint'25 - agent/digiq)
  - [Preprint'25 - agent/digiq)
  - [Preprint'25 - research/ARPO)
  - [Preprint'25 - Labs/InfiGUI-R1)
  - [Preprint'25
  - [KDD'24
  - [Preprint'25 - ai/OpenCUA)
  - [Preprint'25 - TARS)
- Reinforcement Learning Scaling
  - [Preprint'25 - R1V2-38B)
  - [Blog - OR1)
  - [Preprint'25
  - [Blog
  - [Preprint'25 - SIA/DAPO)
  - [Preprint'25 - NLP/LIMR)
  - [Preprint'25
  - [Preprint'25 - SIA/DAPO)
  - [Preprint'25 - NLP/LIMR)
  - [Preprint'25
  - [Preprint'25
  - [Preprint'25
  - [Preprint'25
  - [Preprint'25
- RL for Tool-using Problem Solver
  - [Preprint'25
  - [Preprint'25 - AI-Lab/verl-tool)
  - [Preprint'25
  - [Preprint'25
  - [Preprint'25
  - [Preprint'25 - BJTU/AutoCoA)
  - [Preprint'25 - NLP/ToRL)
  - [Preprint'25 - BJTU/AutoCoA)
- RL for Agent Memory
  - [Preprint'25
  - [Preprint'25 - SIA/MemAgent)
- Others
  - [Preprint'25 - feedback)
  - [Preprint'25
  - [Preprint'25
- RL for Agent Planning
  - [Preprint'25
- Self-Playing Agent with RL
  - [Preprint'25 - Quark/SSP)
🔗 Related Awesome Lists
- MCP Agents
  - [List - covering search agent papers
  - [List - covering deep research agents and benchmark results
  - [List - covering RL for research agents
  - [List - covering RL and agents before 2023
  - [List - covering RL for research agents
  - [List - covering RL and agents before 2023
  - [List - covering Agentic RL papers in both agentic capabilities and applications
📄 Tutorials & Blog Posts
- MCP Agents
  - [Github
  - [Blog
  - [Blog
🧰 Toolkits & Frameworks
- MCP Agents
  - [Code
  - [Code
  - [Code
  - [Code
