https://github.com/necolizer/awesome-rl-for-agents

A curated list of reinforcement learning (RL) for agents.

List: awesome-rl-for-agents



Host: GitHub
URL: https://github.com/necolizer/awesome-rl-for-agents
Owner: Necolizer
License: cc0-1.0
Created: 2025-04-07T10:42:36.000Z (2 months ago)
Default Branch: main
Last Pushed: 2025-04-08T07:00:27.000Z (2 months ago)
Last Synced: 2025-04-08T08:21:34.133Z (2 months ago)
Topics: agents, ai-agent, awesome, awesome-list, reinforcement-learning
Homepage:
Size: 5.86 KB
Stars: 2
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

Awesome Lists containing this project

ultimate-awesome - awesome-rl-for-agents - A curated list of reinforcement learning (RL) for agents. (Other Lists / Julia Lists)

README

# Awesome RL for Agents [![Awesome](https://awesome.re/badge.svg)](https://awesome.re)

A curated list of reinforcement learning (RL) for agents.

> This list collects papers, tools, and demos that show how reinforcement learning can be applied to train or tune agents. The primary focus is computer-using agents (e.g., GUI, web, and MCP agents), with supplementary coverage of related topics within the broader scope of RL for agents.

---

## Table of Contents

- [📚 Papers & Research](#-papers--research)

- [🕹️ Benchmarks](#-benchmarks)

- [🧪 Demos & Projects](#-demos--projects)

- [🧰 Toolkits & Frameworks](#-toolkits--frameworks)

- [📄 Tutorials & Blog Posts](#-tutorials--blog-posts)

- [🔗 Related Awesome Lists](#-related-awesome-lists)

- [🤝 Contributing](#-contributing)

---

## 📚 Papers & Research

### RL for Computer-using Agents

- **UI-R1**: Enhancing Action Prediction of GUI Agents by Reinforcement Learning [[Preprint'25]](https://arxiv.org/abs/2503.21620) [[Code]](https://github.com/lll6gg/UI-R1)

- **Digi-Q**: Learning Q-Value Functions for Training Device-Control Agents [[Preprint'25]](https://arxiv.org/abs/2502.15760) [[Code]](https://github.com/DigiRL-agent/digiq)

### RL for Tool-using Problem Solvers

- **Agent models**: Internalizing Chain-of-Action Generation into Reasoning models [[Preprint'25]](https://arxiv.org/abs/2503.06580) [[Code]](https://github.com/ADaM-BJTU/AutoCoA)

- **ToRL**: Scaling Tool-Integrated RL [[Preprint'25]](https://arxiv.org/abs/2503.23383) [[Code]](https://github.com/GAIR-NLP/ToRL)

### RL for Agent Planning

- **MPO**: Boosting LLM Agents with Meta Plan Optimization [[Preprint'25]](https://arxiv.org/abs/2503.02682) [[Code]](https://github.com/WeiminXiong/MPO)

### Reinforcement Learning Scaling

- **VAPO**: Efficient and Reliable Reinforcement Learning for Advanced Reasoning Tasks [[Preprint'25]](https://arxiv.org/abs/2504.05118)

- **DAPO**: An Open-Source LLM Reinforcement Learning System at Scale [[Preprint'25]](https://arxiv.org/abs/2503.14476v1) [[Code]](https://github.com/BytedTsinghua-SIA/DAPO)

- **LIMR**: Less is More for RL Scaling [[Preprint'25]](https://arxiv.org/abs/2502.11886) [[Code]](https://github.com/GAIR-NLP/LIMR)

- **DeepSeek-R1**: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning [[Preprint'25]](https://arxiv.org/abs/2501.12948)

- **Kimi k1.5**: Scaling Reinforcement Learning with LLMs [[Preprint'25]](https://arxiv.org/abs/2501.12599)

## 🕹 Benchmarks

- **ScreenSpot-Pro**: GUI Grounding for Professional High-Resolution Computer Use [[Paper]](https://likaixin2000.github.io/papers/ScreenSpot_Pro.pdf) [[Code]](https://github.com/likaixin2000/ScreenSpot-Pro-GUI-Grounding)

- **OSWorld**: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments [[NeurIPS'24]](https://proceedings.neurips.cc/paper_files/paper/2024/hash/5d413e48f84dc61244b6be550f1cd8f5-Abstract-Datasets_and_Benchmarks_Track.html) [[Code]](https://github.com/xlang-ai/OSWorld)

- **SeeClick**: Harnessing GUI Grounding for Advanced Visual GUI Agents [[ACL'24]](https://aclanthology.org/2024.acl-long.505.pdf) [[Code]](https://github.com/njucckevin/SeeClick)

## 🧪 Demos & Projects

### RL-based LLM agent tuning

- **OpenManus-RL** [[Code]](https://github.com/OpenManus/OpenManus-RL) & **OpenManus** [[Code]](https://github.com/mannaandpoem/OpenManus)

- **RAGEN**: Training Agents by Reinforcing Reasoning [[Code]](https://github.com/ZihanWang314/ragen)

### RL-based LLM tuning

- **Open-Reasoner-Zero**: An Open Source Approach to Scaling Up Reinforcement Learning on the Base Model [[Preprint'25]](https://arxiv.org/abs/2503.24290) [[Code]](https://github.com/Open-Reasoner-Zero/Open-Reasoner-Zero)

- **simple_GRPO** [[Code]](https://github.com/lsdefine/simple_GRPO)

### MCP Agents

- **mcp-agent** [[Code]](https://github.com/lastmile-ai/mcp-agent)

## 🧰 Toolkits & Frameworks

- **verl**: Volcano Engine Reinforcement Learning for LLM [[Code]](https://github.com/volcengine/verl)

## 📄 Tutorials & Blog Posts

> (Coming soon...)

## 🔗 Related Awesome Lists

- **Awesome-Agent-RL** [[List]](https://github.com/0russwest0/Awesome-Agent-RL) - covering RL for research agents

- **awesome-ml-agents** [[List]](https://github.com/tokarev-i-v/awesome-llm-rl-agents) - covering RL and agents before 2023

## 🤝 Contributing

Contributions are warmly welcome!

If you know of a paper, tool, environment, or demo relevant to **RL for Agents**, feel free to open a pull request.

### Guidelines:

- Make sure the resource is publicly accessible and active.

- Use the same format as existing entries: `- **Name**: Title [[Paper]](link) [[Code]](link) - short description (optional).`

- Add entries under the most appropriate section.

- Avoid duplicates or resources that are already well-covered elsewhere.
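
As a rough illustration of the entry format described in the guidelines, the sketch below builds one list line from its parts. The helper name `format_entry` and its parameters are hypothetical and not part of this repository. It assumes the double-bracket link style (`[[Label]](url)`) used by the existing entries and a plain ` - ` separator before the optional description.

```python
def format_entry(name, title=None, links=None, description=None):
    """Build one Markdown list entry in the style of the existing entries.

    Hypothetical helper for illustration only; `links` is a list of
    (label, url) pairs such as ("Code", "https://github.com/...").
    """
    links = links or []
    # Bold name, optionally followed by ": Title".
    head = f"**{name}**" + (f": {title}" if title else "")
    # Each link is rendered as [[Label]](url), matching entries in this list.
    parts = [head] + [f"[[{label}]]({url})" for label, url in links]
    line = "- " + " ".join(parts)
    # Optional short description, appended after " - " (an assumed separator).
    if description:
        line += f" - {description}"
    return line


if __name__ == "__main__":
    print(format_entry(
        "Digi-Q",
        "Learning Q-Value Functions for Training Device-Control Agents",
        [("Preprint'25", "https://arxiv.org/abs/2502.15760"),
         ("Code", "https://github.com/DigiRL-agent/digiq")],
    ))
```

Running the script prints a line in the same shape as the Digi-Q entry in the Papers section, which can then be pasted under the most appropriate heading.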

We aim to keep this list high-quality, practical, and focused. Thank you for helping improve it! ✨
