An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with rl

A curated list of projects in awesome lists tagged with rl.

https://github.com/LlamaChinese/Llama-Chinese

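Llama Chinese community: a real-time roundup of the latest Llama learning resources, building the best open-source ecosystem for Chinese Llama large models; fully open source and available for commercial use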

agent llama llama4 llm pretraining rl

Last synced: 06 Apr 2026

https://github.com/llamafamily/llama-chinese

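Llama Chinese community: a real-time roundup of the latest Llama learning resources, building the best open-source ecosystem for Chinese Llama large models; fully open source and available for commercial use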

agent llama llama4 llm pretraining rl

Last synced: 14 May 2025

https://github.com/google/dopamine

Dopamine is a research framework for fast prototyping of reinforcement learning algorithms.

ai google ml rl tensorflow

Last synced: 12 May 2025

https://google.github.io/dopamine/

Dopamine is a research framework for fast prototyping of reinforcement learning algorithms.

ai google ml rl tensorflow

Last synced: 01 May 2025

https://github.com/thu-ml/tianshou

An elegant PyTorch deep reinforcement learning library.

a2c atari bcq cql ddpg double-dqn dqn drl imitation-learning mujoco npg policy-gradient ppo pytorch rl sac td3 transferlab trpo

Last synced: 13 May 2025

https://github.com/areal-project/AReaL

The RL Bridge for LLM-based Agent Applications. Made Simple & Flexible.

agent llm llm-agent llm-reasoning machine-learning-systems mlsys reinforcement-learning rl

Last synced: 16 May 2026

https://github.com/inclusionai/areal

Lightning-Fast RL for LLM Reasoning and Agents. Made Simple & Flexible.

agent llm llm-agent llm-reasoning machine-learning-systems mlsys reinforcement-learning rl

Last synced: 04 Mar 2026

https://github.com/junxiaosong/AlphaZero_Gomoku

An implementation of the AlphaZero algorithm for Gomoku (also called Gobang or Five in a Row)

alphago alphago-zero alphazero board-game gobang gomoku mcts monte-carlo-tree-search pytorch reinforcement-learning rl self-learning tensorflow

Last synced: 12 May 2025

https://github.com/junxiaosong/alphazero_gomoku

An implementation of the AlphaZero algorithm for Gomoku (also called Gobang or Five in a Row)

alphago alphago-zero alphazero board-game gobang gomoku mcts monte-carlo-tree-search pytorch reinforcement-learning rl self-learning tensorflow

Last synced: 13 Apr 2025

https://github.com/pytorch/ELF

ELF: a platform for game research with AlphaGoZero/AlphaZero reimplementation

alpha-zero alphago-zero go reinforcement-learning rl rl-environment

Last synced: 01 Apr 2025

https://github.com/pytorch/elf

ELF: a platform for game research with AlphaGoZero/AlphaZero reimplementation

alpha-zero alphago-zero go reinforcement-learning rl rl-environment

Last synced: 30 Sep 2025

https://github.com/OpenPipe/ART

Agent Reinforcement Trainer: train multi-step agents for real-world tasks using GRPO. Give your agents on-the-job training. Reinforcement learning for Qwen2.5, Qwen3, Llama, Kimi, and more!

agent agentic-ai grpo kimi-ai llms lora qwen qwen3 reinforcement-learning rl

Last synced: 28 Jul 2025

https://github.com/dlr-rm/rl-baselines3-zoo

A training framework for Stable Baselines3 reinforcement learning agents, with hyperparameter optimization and pre-trained agents included.

deep-reinforcement-learning gym hyperparameter-optimization hyperparameter-search hyperparameter-tuning lab openai optimization pybullet pybullet-environments pytorch reinforcement-learning rl robotics sde stable-baselines tuning-hyperparameters

Last synced: 23 Apr 2025

https://github.com/intellabs/coach

Reinforcement Learning Coach by Intel AI Lab enables easy experimentation with state-of-the-art reinforcement learning algorithms

carla coach deep-learning distributed-reinforcement-learning hierarchical-reinforcement-learning imitation-learning mujoco mxnet onnx openai-gym reinforcement-learning rl roboschool starcraft starcraft2 starcraft2-ai tensorflow

Last synced: 01 Oct 2025

https://github.com/IntelLabs/coach

Reinforcement Learning Coach by Intel AI Lab enables easy experimentation with state-of-the-art reinforcement learning algorithms

carla coach deep-learning distributed-reinforcement-learning hierarchical-reinforcement-learning imitation-learning mujoco mxnet onnx openai-gym reinforcement-learning rl roboschool starcraft starcraft2 starcraft2-ai tensorflow

Last synced: 28 Mar 2025

https://github.com/DLR-RM/rl-baselines3-zoo

A training framework for Stable Baselines3 reinforcement learning agents, with hyperparameter optimization and pre-trained agents included.

deep-reinforcement-learning gym hyperparameter-optimization hyperparameter-search hyperparameter-tuning lab openai optimization pybullet pybullet-environments pytorch reinforcement-learning rl robotics sde stable-baselines tuning-hyperparameters

Last synced: 01 May 2025

https://github.com/pathak22/noreward-rl

[ICML 2017] TensorFlow code for Curiosity-driven Exploration for Deep Reinforcement Learning

curiosity deep-learning deep-neural-networks deep-reinforcement-learning doom exploration mario openai-gym rl self-supervised tensorflow

Last synced: 16 May 2025

https://github.com/araffin/rl-baselines-zoo

A collection of 100+ pre-trained RL agents using Stable Baselines, training and hyperparameter optimization included.

gym hyperparameter-optimization hyperparameter-search hyperparameter-tuning hyperparameters openai openai-gym optimization pybullet reinforcement-learning rl stable-baselines zoo

Last synced: 08 Apr 2025

https://github.com/erlerobot/gym-gazebo

Refer to https://github.com/AcutronicRobotics/gym-gazebo2 for the new version

deep-reinforcement-learning drl gazebo openai-gym reinforcement-learning rl robotics ros

Last synced: 05 May 2025

https://github.com/google-research/seed_rl

SEED RL: Scalable and Efficient Deep-RL with Accelerated Central Inference. Implements IMPALA and R2D2 algorithms in TF2 with SEED's architecture.

atari deepmind-lab gcp google-research-football impala r2d2 rl tf2

Last synced: 18 Mar 2025

https://github.com/fareedkhan-dev/all-rl-algorithms

Implementation of all RL algorithms in a simpler way

agent llm openai python reinforcement-learning rl

Last synced: 07 May 2025

https://github.com/google-research/rliable

[NeurIPS'21 Outstanding Paper] Library for reliable evaluation on RL and ML benchmarks, even with only a handful of seeds.

benchmarking evaluation-metrics google machine-learning reinforcement-learning rl

Last synced: 08 Apr 2025

https://github.com/araffin/rl-tutorial-jnrr19

Stable-Baselines tutorial for Journées Nationales de la Recherche en Robotique 2019

colab-notebook notebook python reinforcement-learning rl stable-baselines tutorial

Last synced: 04 Apr 2025

https://github.com/ashishpatel26/real-time-ml-project

A curated list of applied machine learning and data science notebooks and libraries across different industries.

application deep-learning deeplearning dl keras machine-learning machine-learning-algorithms machinelearning ml ml-application project pytorch real-time real-time-data rl tensorflow theano

Last synced: 13 Apr 2025

https://github.com/ashishpatel26/Real-time-ML-Project

A curated list of applied machine learning and data science notebooks and libraries across different industries.

application deep-learning deeplearning dl keras machine-learning machine-learning-algorithms machinelearning ml ml-application project pytorch real-time real-time-data rl tensorflow theano

Last synced: 29 Mar 2025

https://github.com/Toni-SM/skrl

Modular reinforcement learning library (on PyTorch and JAX) with support for NVIDIA Isaac Gym, Omniverse Isaac Gym and Isaac Lab

deep-learning deepmind gym gymnasium isaac-gym isaac-lab isaac-orbit isaac-sim isaaclab jax machine-learning nvidia-omniverse openai-gym python pytorch reinforcement-learning rl robosuite robotics skrl

Last synced: 02 Apr 2025

https://github.com/facebookresearch/meta-agents-research-environments

Meta Agents Research Environments is a comprehensive platform designed to evaluate AI agents in dynamic, realistic scenarios. Unlike static benchmarks, this platform introduces evolving environments where agents must adapt their strategies as new information becomes available, mirroring real-world challenges.

agents ai autonomous-agents benchmark evaluation large-language-models llm meta multi-agent-systems natural-language-processing reinforcement-learning rl simulation

Last synced: 16 Jun 2026

https://github.com/AcutronicRobotics/gym-gazebo2

gym-gazebo2 is a toolkit for developing and comparing reinforcement learning algorithms using ROS 2 and Gazebo

deep-reinforcement-learning drl gazebo gym reinforcement-learning rl robotics ros ros2

Last synced: 01 Apr 2025

https://github.com/google-deepmind/spriteworld

Spriteworld: a flexible, configurable python-based reinforcement learning environment

deepmind environment generative procedural rl sprites

Last synced: 05 Jul 2025

https://github.com/lucasalegre/morl-baselines

Implementations of multi-objective reinforcement learning algorithms.

gym gymnasium mo-gymnasium morl multi-objective multi-objective-rl pytorch reinforcement-learning rl rl-algorithms

Last synced: 15 May 2025

https://github.com/modelscope/awesome-deep-reasoning

A collection of every awesome work about R1!

collection deepseek grpo o1 qwen r1 reasoning rl

Last synced: 14 Jun 2025

https://github.com/princeton-nlp/webshop

[NeurIPS 2022] 🛒WebShop: Towards Scalable Real-World Web Interaction with Grounded Language Agents

decision-making language language-grounding ml nlp rl rl-environment shopping sim-to-real web-based

Last synced: 05 Apr 2025

https://github.com/gbionics/amp-rsl-rl

🔁 AMP-RSL-RL: Adversarial Motion Priors for robotic RL (PPO + motion imitation)

amp isaaclab pytorch rl rls-rl

Last synced: 14 May 2026

https://github.com/stable-baselines-team/stable-baselines

Mirror of Stable-Baselines: a fork of OpenAI Baselines, implementations of reinforcement learning algorithms

baselines gym machine-learning openai openai-gym python reinforcement-learning rl stable-baselines toolbox

Last synced: 07 Oct 2025

https://github.com/araffin/learning-to-drive-in-5-minutes

Implementation of a reinforcement learning approach to make a car learn to drive smoothly in minutes

donkey-car gym openai reinforcement-learning rl sac self-driving-car simulator soft-actor-critic srl stable-baselines state-representation-learning unity vae

Last synced: 19 Jul 2025

https://github.com/tsinghuac3i/awesome-rl-reasoning-recipes

Awesome RL Reasoning Recipes ("Triple R")

awesome-list deepseek-r1 llm open-source reasoning rl

Last synced: 10 Apr 2025

https://github.com/princeton-nlp/WebShop

[NeurIPS 2022] 🛒WebShop: Towards Scalable Real-World Web Interaction with Grounded Language Agents

decision-making language language-grounding ml nlp rl rl-environment shopping sim-to-real web-based

Last synced: 22 Apr 2025

https://github.com/bakkesmodorg/bakkesmodsdk

The current BakkesModSDK (Unofficial SDK for Rocket League)

api game league mod modding plugin plugins rl rocket rocket-league sdk unofficial

Last synced: 12 Apr 2025

https://github.com/instadeepai/flashbax

⚡ Flashbax: Accelerated Replay Buffers in JAX

buffers hpc jax machine-learning off-policy reinforcement-learning rl

Last synced: 03 Nov 2025

https://github.com/mihirp1998/VADER

Video Diffusion Alignment via Reward Gradients. We improve a variety of video diffusion models such as VideoCrafter, OpenSora, ModelScope and StableVideoDiffusion by finetuning them using various reward models such as HPS, PickScore, VideoMAE, VJEPA, YOLO, Aesthetics etc.

alignment diffusion reinforcement-learning reinforcement-learning-human-feedback rl rlhf vader video-diffusion video-diffusion-alignment

Last synced: 28 Mar 2025

https://github.com/learnables/cherry

A PyTorch Library for Reinforcement Learning Research

learning pytorch reinforcement reinforcement-learning rl

Last synced: 05 Apr 2025

https://github.com/internlm/oreal

Exploring the Limit of Outcome Reward for Learning Mathematical Reasoning

llm mathematics o1 reasoning rl

Last synced: 26 Jun 2025

https://github.com/lasgroup/SDPO

Reinforcement Learning via Self-Distillation (SDPO)

distillation llm reasoning rl

Last synced: 02 Jun 2026

https://github.com/axon-rl/gem

A Gym for Agentic LLMs

gym llm rl

Last synced: 05 Oct 2025

https://github.com/Draichi/T-1000

⚡⚡ Deep RL Algotrading with Ray API

algotrading bot ray reinforcement-learning-bot rl rllib trading trading-bot

Last synced: 29 Mar 2025

https://github.com/utilforever/baba-is-auto

Baba Is You simulator using C++ with some reinforcement learning

baba-is-you babaisyou cplusplus cpp cpp17 python-api reinforcement-learning rl rl-environment simulator-game

Last synced: 03 Nov 2025

https://github.com/gordicaleksa/pytorch-learn-reinforcement-learning

A collection of various RL algorithms like policy gradients, DQN and PPO. The goal of this repo will be to make it a go-to resource for learning about RL. How to visualize, debug and solve RL problems. I've additionally included playground.py for learning more about OpenAI gym, etc.

deep-learning deep-q-network dqn jupyter policy-gradient ppo python pytorch pytorch-dqn pytorch-implementation pytorch-policy-gradient pytorch-ppo reinforcement-learning reinforcement-learning-algorithms rl

Last synced: 12 Sep 2025

https://github.com/tiger-ai-lab/general-reasoner

General Reasoner: Advancing LLM Reasoning Across All Domains

llm reasoning rl

Last synced: 13 Jun 2025

https://github.com/opendilab/generativerl

Python library for solving reinforcement learning (RL) problems using generative models (e.g. Diffusion Models).

diffusion diffusion-models diffusion-policy flow-model generative-ai generative-model offline-rl reinforcement-learning rl

Last synced: 30 Oct 2025

https://github.com/CJReinforce/PURE

Official code for the paper, "Stop Summation: Min-Form Credit Assignment Is All Process Reward Model Needs for Reasoning"

llm mathematics o1 r1 reasoning reinforcement-finetuning reinforcement-learning rl

Last synced: 08 May 2025

https://github.com/ami-iit/amp-rsl-rl

🔁 AMP-RSL-RL: Adversarial Motion Priors for robotic RL (PPO + motion imitation)

amp isaaclab pytorch rl rls-rl

Last synced: 25 Jun 2025

https://github.com/mit-acl/rl_collision_avoidance

Training code for the GA3C-CADRL algorithm (collision avoidance with deep RL)

collision-avoidance deep-reinforcement-learning multiagent rl robotics

Last synced: 21 Sep 2025

https://github.com/shuaibinli/rl_carla

Train auto_car in the CARLA simulator with RL algorithms (SAC).

carla rl

Last synced: 11 Jun 2025

https://github.com/cy69855522/ai-paper-drawer

A roundup of key points from artificial intelligence papers. This project aims to collect key points of AI papers.

ai-papers cv deep-learning gans gcn gnn graph nlp rl

Last synced: 21 Aug 2025

https://github.com/bakkesmodorg/bakkesmod2-plugins

Default plugins for BakkesMod 2 (A Rocket League training framework)

api bakkes bakkesmod framework league mod modding plugin plugins rl rocket rocket-league sdk training

Last synced: 11 Aug 2025

https://github.com/woooodyy/llm-reverse-curriculum-rl

Implementation of the ICML 2024 paper "Training Large Language Models for Reasoning through Reverse Curriculum Reinforcement Learning" presented by Zhiheng Xi et al.

llm reasoning rl

Last synced: 12 Aug 2025

https://github.com/rkinas/rlhf_thinking_model

This repository serves as a collection of research notes and resources on training large language models (LLMs) and Reinforcement Learning from Human Feedback (RLHF). It focuses on the latest research, methodologies, and techniques for fine-tuning language models.

llm rl rlhf

Last synced: 09 Apr 2025

https://github.com/chendrag/mujoco-benchmark

Provides a full reinforcement learning benchmark on MuJoCo environments, including DDPG, SAC, TD3, PG, A2C, PPO, library

baseline benchmark ddpg drl mujoco performance ppo pytorch results rl sac tianshou

Last synced: 17 Mar 2025

https://github.com/mymusise/trading-gym

A trading environment based on Gym

drl gym python3 reinforcement-learning rl trading

Last synced: 15 Apr 2025

https://github.com/stillonearth/bevy_rl

Reinforcement Learning environments with Bevy

bevy gym rl

Last synced: 30 Oct 2025

https://github.com/maluuba/hra

Hybrid Reward Architecture

reinforcement-learning rl

Last synced: 14 Apr 2025

https://github.com/mathiswellmann/gym-rs

OpenAI's Gym written in pure Rust for blazingly fast performance

ai ml openai-gym reinforcement-learning rl

Last synced: 19 Sep 2025

https://github.com/princeton-nlp/calm-textgame

[EMNLP 2020] Keep CALM and Explore: Language Models for Action Generation in Text-based Games

calm gpt n-gram nlp rl text-based-game

Last synced: 27 Apr 2025

https://github.com/MathisWellmann/gym-rs

OpenAI's Gym written in pure Rust for blazingly fast performance

ai ml openai-gym reinforcement-learning rl

Last synced: 31 Mar 2025

https://github.com/gair-nlp/octothinker

Revisiting Mid-training in the Era of RL Scaling

llama llm mid-training post-training pre-training qwen reasoning rl verl

Last synced: 30 Jun 2025

https://github.com/fareedkhan-dev/rag-with-rl

Maximizing the Performance of a Simple RAG using RL

llm openai python rag reinforcement-learning rl

Last synced: 04 Apr 2026

https://github.com/aim-uofa/active-o3

ACTIVE-O3: Empowering Multimodal Large Language Models with Active Perception via GRPO

active-perception active-vision grpo mllms o3 rl thinking-with-image

Last synced: 28 Jun 2025

https://github.com/aim-uofa/omni-r1

Official Repo of Omni-R1: Reinforcement Learning for Omnimodal Reasoning via Two-System Collaboration

grpo mllms omnimodal rl

Last synced: 28 Jun 2025

https://github.com/traffic-alpha/illm-tsc

This repository contains the code for the paper “iLLM-TSC: Integration reinforcement learning and large language model for traffic signal control policy improvement”

llm reinforcement-learning rl tsc

Last synced: 22 Aug 2025

https://github.com/benbaarber/rl

A Rust reinforcement learning library

burn deep-learning machine-learning ml reinforcement-learning rl rust

Last synced: 25 Dec 2025

https://github.com/blackhc/mdp

Make it easy to specify simple MDPs that are compatible with the OpenAI Gym.

mdp openai-gym rl

Last synced: 28 Apr 2025

https://github.com/crumblyliquid/bakkeslinux

Guide for running BakkesMod on Linux

bakkes bakkesmod league linux mod rl rocket rocket-league training

Last synced: 23 Oct 2025