awesome-rl-for-agents
A curated list of resources on reinforcement learning (RL) for agents.
https://github.com/necolizer/awesome-rl-for-agents
Benchmarks
Others
- [Preprint'25(1) - AI/Marco-Search-Agent)
- [Platform - ai/computer-agent-arena)
- [Blog - d6f7-442e-9508-59515c65e35d/browsecomp.pdf) [[Code]](https://github.com/openai/simple-evals)
- [Preprint'25
- [Preprint'25 - plus)
- [Preprint'25
- [Preprint'25 - ZH)
- [Paper - Pro-GUI-Grounding)
- [NeurIPS'24 - ai/OSWorld)
- [ACL'24
- [Preprint'25 - BrowseComp)
Demos & Projects
Papers & Research
Survey & Review
- [Preprint'25 - Search-Agent-Papers)
- [Preprint'25 - agents-2030/awesome-deep-research-agent)
- [Preprint'25 - AgenticLLM-RL-Papers)
RL for Research Agents
- [Preprint'25
- [Preprint'25
- [Preprint'25 - Reasoning)
- [Preprint'25 - Lab/multimodal-search-r1)
- [Blog
- [Preprint'25 - Search)
- [Preprint'25 - Searcher-plus)
- [Preprint'25 - Searcher)
- [Blog
- [Preprint'25 - Pro)
- [Preprint'25
- [Preprint'25 - nlp/ZeroSearch)
- [Preprint'25 - NLP/DeepResearcher)
- [Preprint'25 - RL/ReCall)
- [Preprint'25 - r1)
- [Preprint'25 - NLP/WebAgent)
- [Preprint'25 - NLP/WebAgent)
- [Preprint'25
- [Blog - NLP/DeepResearch)
RL for Computer-using Agents
- [Preprint'25 - R1)
- [Preprint'25 - R1)
- [Preprint'25 - agent/digiq)
- [Preprint'25 - agent/digiq)
- [Preprint'25 - research/ARPO)
- [Preprint'25 - Labs/InfiGUI-R1)
- [Preprint'25
- [KDD'24
- [Preprint'25 - ai/OpenCUA)
- [Preprint'25 - TARS)
Reinforcement Learning Scaling
- [Preprint'25 - R1V2-38B)
- [Blog - OR1)
- [Preprint'25
- [Blog
- [Preprint'25 - SIA/DAPO)
- [Preprint'25 - NLP/LIMR)
- [Preprint'25
- [Preprint'25 - SIA/DAPO)
- [Preprint'25 - NLP/LIMR)
- [Preprint'25
- [Preprint'25
- [Preprint'25
- [Preprint'25
- [Preprint'25
RL for Tool-using Problem Solver
- [Preprint'25
- [Preprint'25 - AI-Lab/verl-tool)
- [Preprint'25
- [Preprint'25
- [Preprint'25
- [Preprint'25 - BJTU/AutoCoA)
- [Preprint'25 - NLP/ToRL)
- [Preprint'25 - BJTU/AutoCoA)
RL for Agent Memory
- [Preprint'25
- [Preprint'25 - SIA/MemAgent)
Others
- [Preprint'25 - feedback)
- [Preprint'25
- [Preprint'25
RL for Agent Planning
Self-Playing Agent with RL
- [Preprint'25 - Quark/SSP)
Related Awesome Lists
MCP Agents
- [List - covering search agent papers
- [List - covering deep research agents and benchmark results
- [List - covering RL for research agents
- [List - covering RL and agents before 2023
- [List - covering RL for research agents
- [List - covering RL and agents before 2023
- [List - covering Agentic RL papers in both agentic capabilities and applications
Tutorials & Blog Posts
Toolkits & Frameworks