https://github.com/necolizer/awesome-rl-for-agents
A curated list of reinforcement learning (RL) for agents.
List: awesome-rl-for-agents
agents ai-agent awesome awesome-list reinforcement-learning
Last synced: about 2 months ago
- Host: GitHub
- URL: https://github.com/necolizer/awesome-rl-for-agents
- Owner: Necolizer
- License: cc0-1.0
- Created: 2025-04-07T10:42:36.000Z (about 2 months ago)
- Default Branch: main
- Last Pushed: 2025-04-08T07:00:27.000Z (about 2 months ago)
- Last Synced: 2025-04-08T08:21:34.133Z (about 2 months ago)
- Topics: agents, ai-agent, awesome, awesome-list, reinforcement-learning
- Homepage:
- Size: 5.86 KB
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- ultimate-awesome - awesome-rl-for-agents - A curated list of reinforcement learning (RL) for agents. (Other Lists / Julia Lists)
README
# Awesome RL for Agents
A curated list of reinforcement learning (RL) for agents.
> This list collects papers, tools, and demos that show how reinforcement learning can be applied to train or tune agents, with a primary focus on computer-using agents (e.g. GUI, web, and MCP agents) and supplementary coverage of related topics within the broader scope of RL for agents.
---
## Table of Contents
- [Papers & Research](#papers--research)
- [Benchmarks](#benchmarks)
- [Demos & Projects](#demos--projects)
- [Toolkits & Frameworks](#toolkits--frameworks)
- [Tutorials & Blog Posts](#tutorials--blog-posts)
- [Related Awesome Lists](#related-awesome-lists)
- [Contributing](#contributing)

---
## Papers & Research
### RL for Computer-using Agents
- **UI-R1**: Enhancing Action Prediction of GUI Agents by Reinforcement Learning [[Preprint'25]](https://arxiv.org/abs/2503.21620) [[Code]](https://github.com/lll6gg/UI-R1)
- **Digi-Q**: Learning Q-Value Functions for Training Device-Control Agents [[Preprint'25]](https://arxiv.org/abs/2502.15760) [[Code]](https://github.com/DigiRL-agent/digiq)
### RL for Tool-using Problem Solver
- **Agent models**: Internalizing Chain-of-Action Generation into Reasoning models [[Preprint'25]](https://arxiv.org/abs/2503.06580) [[Code]](https://github.com/ADaM-BJTU/AutoCoA)
- **ToRL**: Scaling Tool-Integrated RL [[Preprint'25]](https://arxiv.org/pdf/2503.23383) [[Code]](https://github.com/GAIR-NLP/ToRL)
### RL for Agent Planning
- **MPO**: Boosting LLM Agents with Meta Plan Optimization [[Preprint'25]](https://arxiv.org/abs/2503.02682) [[Code]](https://github.com/WeiminXiong/MPO)
### Reinforcement Learning Scaling
- **VAPO**: Efficient and Reliable Reinforcement Learning for Advanced Reasoning Tasks [[Preprint'25]](https://arxiv.org/abs/2504.05118)
- **DAPO**: An Open-Source LLM Reinforcement Learning System at Scale [[Preprint'25]](https://arxiv.org/abs/2503.14476v1) [[Code]](https://github.com/BytedTsinghua-SIA/DAPO)
- **LIMR**: Less is More for RL Scaling [[Preprint'25]](https://arxiv.org/abs/2502.11886) [[Code]](https://github.com/GAIR-NLP/LIMR)
- **DeepSeek-R1**: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning [[Preprint'25]](https://arxiv.org/abs/2501.12948)
- **Kimi k1.5**: Scaling Reinforcement Learning with LLMs [[Preprint'25]](https://arxiv.org/abs/2501.12599)
## Benchmarks
- **ScreenSpot-Pro**: GUI Grounding for Professional High-Resolution Computer Use [[Paper]](https://likaixin2000.github.io/papers/ScreenSpot_Pro.pdf) [[Code]](https://github.com/likaixin2000/ScreenSpot-Pro-GUI-Grounding)
- **OSWorld**: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments [[NeurIPS'24]](https://proceedings.neurips.cc/paper_files/paper/2024/hash/5d413e48f84dc61244b6be550f1cd8f5-Abstract-Datasets_and_Benchmarks_Track.html) [[Code]](https://github.com/xlang-ai/OSWorld)
- **SeeClick**: Harnessing GUI Grounding for Advanced Visual GUI Agents [[ACL'24]](https://aclanthology.org/2024.acl-long.505.pdf) [[Code]](https://github.com/njucckevin/SeeClick)
## Demos & Projects
### RL-based LLM agent tuning
- **OpenManus-RL** [[Code]](https://github.com/OpenManus/OpenManus-RL) & **OpenManus** [[Code]](https://github.com/mannaandpoem/OpenManus)
- **RAGEN**: Training Agents by Reinforcing Reasoning [[Code]](https://github.com/ZihanWang314/ragen)
### RL-based LLM tuning
- **Open-Reasoner-Zero**: An Open Source Approach to Scaling Up Reinforcement Learning on the Base Model [[Preprint'25]](https://arxiv.org/abs/2503.24290) [[Code]](https://github.com/Open-Reasoner-Zero/Open-Reasoner-Zero)
- **simple_GRPO** [[Code]](https://github.com/lsdefine/simple_GRPO)
### MCP Agents
- **mcp-agent** [[Code]](https://github.com/lastmile-ai/mcp-agent)
## Toolkits & Frameworks
- **verl**: Volcano Engine Reinforcement Learning for LLM [[Code]](https://github.com/volcengine/verl)
## Tutorials & Blog Posts
> (Coming soon...)

## Related Awesome Lists
- **Awesome-Agent-RL** [[List]](https://github.com/0russwest0/Awesome-Agent-RL) - covering RL for research agents
- **awesome-ml-agents** [[List]](https://github.com/tokarev-i-v/awesome-llm-rl-agents) - covering RL and agents before 2023
## Contributing
Contributions are warmly welcome!
If you know a paper, tool, environment, or demo relevant to **RL for Agents**, feel free to open a pull request.
### Guidelines:
- Make sure the resource is publicly accessible and active.
- Use the same format as existing entries: `- **Name**: Title [Paper](link) [Code](link) - short description (optional).`
- Add entries under the most appropriate section.
- Avoid duplicates or resources that are already well-covered elsewhere.

We aim to keep this list high-quality, practical, and focused. Thank you for helping improve it!
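
For contributors who script their additions, the entry format from the guidelines above can be sketched as a small Python helper. This is only an illustration: `format_entry` and its parameters are hypothetical names, not part of this repository. It assumes links are rendered in the double-bracket style the existing entries use (`[[Code]](link)`) and that an optional description is appended after ` - `.

```python
from typing import Optional, Sequence, Tuple


def format_entry(
    name: str,
    title: Optional[str] = None,
    links: Sequence[Tuple[str, str]] = (),
    description: Optional[str] = None,
) -> str:
    """Render one list entry as a single Markdown bullet (hypothetical helper)."""
    # Bold name, optionally followed by ": Title".
    line = f"- **{name}**"
    if title:
        line += f": {title}"
    # Each (label, url) pair becomes [[Label]](url), as in the existing entries.
    for label, url in links:
        line += f" [[{label}]]({url})"
    # The short description is optional.
    if description:
        line += f" - {description}"
    return line


# Reproduces the existing verl entry from the Toolkits & Frameworks section.
print(format_entry(
    "verl",
    "Volcano Engine Reinforcement Learning for LLM",
    links=[("Code", "https://github.com/volcengine/verl")],
))
```

Generating entries this way keeps spacing and bracket style consistent; choosing the right section and checking for duplicates still has to be done by hand.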