Projects in Awesome Lists tagged with ppo

https://github.com/datawhalechina/easy-rl

Chinese-language reinforcement learning tutorial (the "Mushroom Book" 🍄); read online at: https://datawhalechina.github.io/easy-rl/

a3c ddpg deep-reinforcement-learning double-dqn dqn dueling-dqn easy-rl imitation-learning policy-gradient ppo q-learning reinforcement-learning sarsa td3

Last synced: 30 Sep 2024

https://github.com/morvanzhou/reinforcement-learning-with-tensorflow

Simple reinforcement learning tutorials, 莫烦Python (Chinese-language AI teaching)

a3c actor-critic asynchronous-advantage-actor-critic ddpg deep-deterministic-policy-gradient deep-q-network double-dqn dqn dueling-dqn machine-learning policy-gradient ppo prioritized-replay proximal-policy-optimization q-learning reinforcement-learning sarsa sarsa-lambda tensorflow-tutorials tutorial

Last synced: 26 Sep 2024

https://github.com/MorvanZhou/Reinforcement-learning-with-tensorflow

Simple reinforcement learning tutorials, 莫烦Python (Chinese-language AI teaching)

a3c actor-critic asynchronous-advantage-actor-critic ddpg deep-deterministic-policy-gradient deep-q-network double-dqn dqn dueling-dqn machine-learning policy-gradient ppo prioritized-replay proximal-policy-optimization q-learning reinforcement-learning sarsa sarsa-lambda tensorflow-tutorials tutorial

Last synced: 01 Aug 2024

https://github.com/thu-ml/tianshou

An elegant PyTorch deep reinforcement learning library.

a2c atari bcq cql ddpg double-dqn dqn drl imitation-learning mujoco npg policy-gradient ppo pytorch rl sac td3 transferlab trpo

Last synced: 29 Sep 2024

https://github.com/vwxyzjn/cleanrl

High-quality single file implementation of Deep Reinforcement Learning algorithms with research-friendly features (PPO, DQN, C51, DDPG, TD3, SAC, PPG)

a2c actor-critic advantage-actor-critic ale atari deep-learning deep-reinforcement-learning gym machine-learning phasic-policy-gradient ppo proximal-policy-optimization python pytorch reinforcement-learning wandb

Last synced: 27 Sep 2024

https://github.com/udacity/deep-reinforcement-learning

Repo for the Deep Reinforcement Learning Nanodegree program

cross-entropy ddpg deep-reinforcement-learning dqn dynamic-programming hill-climbing ml-agents neural-networks openai-gym openai-gym-solutions ppo pytorch pytorch-rl reinforcement-learning reinforcement-learning-algorithms rl-algorithms

Last synced: 07 Aug 2024

https://github.com/andri27-ts/reinforcement-learning

Learn Deep Reinforcement Learning in 60 days! Lectures & Code in Python. Reinforcement Learning + Deep Learning

a2c artificial-intelligence deep-learning deep-reinforcement-learning deepmind dqn evolution-strategies machine-learning policy-gradients ppo qlearning reinforcement-learning

Last synced: 30 Sep 2024

https://github.com/andri27-ts/Reinforcement-Learning

Learn Deep Reinforcement Learning in 60 days! Lectures & Code in Python. Reinforcement Learning + Deep Learning

a2c artificial-intelligence deep-learning deep-reinforcement-learning deepmind dqn evolution-strategies machine-learning policy-gradients ppo qlearning reinforcement-learning

Last synced: 31 Jul 2024

https://github.com/sweetice/deep-reinforcement-learning-with-pytorch

PyTorch implementation of DQN, AC, ACER, A2C, A3C, PG, DDPG, TRPO, PPO, SAC, TD3 and ....

a2c a3c actor-critic actor-critic-algorithm algorithm alphago deep-learning deep-reinforcement-learning dqn policy-gradient ppo pytorch reinforce resnet sac sarsa td3 trpo

Last synced: 30 Sep 2024

https://github.com/simoninithomas/deep_reinforcement_learning_course

Implementations from the free course Deep Reinforcement Learning with Tensorflow and PyTorch

a2c actor-critic deep-learning deep-q-learning deep-q-network deep-reinforcement-learning ppo pytorch qlearning tensorflow tensorflow-tutorials unity

Last synced: 26 Sep 2024

https://github.com/simoninithomas/Deep_reinforcement_learning_Course

Implementations from the free course Deep Reinforcement Learning with Tensorflow and PyTorch

a2c actor-critic deep-learning deep-q-learning deep-q-network deep-reinforcement-learning ppo pytorch qlearning tensorflow tensorflow-tutorials unity

Last synced: 07 Aug 2024

https://github.com/sweetice/Deep-reinforcement-learning-with-pytorch

PyTorch implementation of DQN, AC, ACER, A2C, A3C, PG, DDPG, TRPO, PPO, SAC, TD3 and ....

a2c a3c actor-critic actor-critic-algorithm algorithm alphago deep-learning deep-reinforcement-learning dqn policy-gradient ppo pytorch reinforce resnet sac sarsa td3 trpo

Last synced: 02 Aug 2024

https://github.com/ai4finance-foundation/elegantrl

Massively Parallel Deep Reinforcement Learning. 🔥

a2c bipedalwalkerhardcore ddpg dqn drl-pytorch efficient gae lightweight model-free-rl multiple-gpu per ppo pytorch reinforcement-learning sac stable td3

Last synced: 30 Sep 2024

https://github.com/AI4Finance-Foundation/ElegantRL

Massively Parallel Deep Reinforcement Learning. 🔥

a2c bipedalwalkerhardcore ddpg dqn drl-pytorch efficient gae lightweight model-free-rl multiple-gpu per ppo pytorch reinforcement-learning sac stable td3

Last synced: 01 Aug 2024

https://github.com/ikostrikov/pytorch-a2c-ppo-acktr-gail

PyTorch implementation of Advantage Actor Critic (A2C), Proximal Policy Optimization (PPO), Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation (ACKTR) and Generative Adversarial Imitation Learning (GAIL).

a2c acktr actor-critic advantage-actor-critic ale atari continuous-control deep-learning deep-reinforcement-learning hessian kfac kronecker-factored-approximation mujoco natural-gradients ppo proximal-policy-optimization pytorch reinforcement-learning roboschool second-order

Last synced: 30 Sep 2024

https://github.com/shangtongzhang/deeprl

Modularized Implementation of Deep RL Algorithms in PyTorch

a2c categorical-dqn ddpg deep-reinforcement-learning deeprl double-dqn dqn dueling-network-architecture option-critic option-critic-architecture ppo prioritized-experience-replay pytorch quantile-regression rainbow td3

Last synced: 30 Sep 2024

https://github.com/ShangtongZhang/DeepRL

Modularized Implementation of Deep RL Algorithms in PyTorch

a2c categorical-dqn ddpg deep-reinforcement-learning deeprl double-dqn dqn dueling-network-architecture option-critic option-critic-architecture ppo prioritized-experience-replay pytorch quantile-regression rainbow td3

Last synced: 01 Aug 2024

https://github.com/seungeunrho/minimalrl

Implementations of basic RL algorithms in minimal lines of code! (PyTorch-based)

a2c a3c acer ddpg deep-learning deep-reinforcement-learning dqn machine-learning policy-gradients ppo pytorch reinforce reinforcement-learning sac simple

Last synced: 30 Sep 2024

https://github.com/seungeunrho/minimalRL

Implementations of basic RL algorithms in minimal lines of code! (PyTorch-based)

a2c a3c acer ddpg deep-learning deep-reinforcement-learning dqn machine-learning policy-gradients ppo pytorch reinforce reinforcement-learning sac simple

Last synced: 01 Aug 2024

https://github.com/AI4Finance-Foundation/FinRL-Trading

For trading. Please star.

a2c-algorithm automated-stock-trading ddpg deep-reinforcement-learning ensemble-strategy openai-gym ppo sharpe-ratio stock-trading stock-trading-strategy

Last synced: 02 Aug 2024

https://github.com/ai4finance-foundation/finrl-trading

For trading. Please star.

a2c-algorithm automated-stock-trading ddpg deep-reinforcement-learning ensemble-strategy openai-gym ppo sharpe-ratio stock-trading stock-trading-strategy

Last synced: 30 Sep 2024

https://github.com/nikhilbarhate99/ppo-pytorch

Minimal implementation of clipped objective Proximal Policy Optimization (PPO) in PyTorch

deep-learning deep-reinforcement-learning policy-gradient ppo ppo-pytorch proximal-policy-optimization pytorch pytorch-implmention pytorch-tutorial reinforcement-learning reinforcement-learning-algorithms

Last synced: 30 Sep 2024

https://github.com/nikhilbarhate99/PPO-PyTorch

Minimal implementation of clipped objective Proximal Policy Optimization (PPO) in PyTorch

deep-learning deep-reinforcement-learning policy-gradient ppo ppo-pytorch proximal-policy-optimization pytorch pytorch-implmention pytorch-tutorial reinforcement-learning reinforcement-learning-algorithms

Last synced: 02 Aug 2024

https://github.com/marlbenchmark/on-policy

This is the official implementation of Multi-Agent PPO (MAPPO).

algorithms hanabi mappo mpes multi-agent ppo smac starcraftii

Last synced: 30 Sep 2024

https://github.com/kengz/slm-lab

Modular Deep Reinforcement Learning framework in PyTorch. Companion library of the book "Foundations of Deep Reinforcement Learning".

a2c a3c benchmark deep-reinforcement-learning dqn policy-gradient ppo pytorch reinforcement-learning sac

Last synced: 25 Sep 2024

https://github.com/kengz/SLM-Lab

Modular Deep Reinforcement Learning framework in PyTorch. Companion library of the book "Foundations of Deep Reinforcement Learning".

a2c a3c benchmark deep-reinforcement-learning dqn policy-gradient ppo pytorch reinforcement-learning sac

Last synced: 01 Aug 2024

https://github.com/khrylx/pytorch-rl

PyTorch implementation of Deep Reinforcement Learning: Policy Gradient methods (TRPO, PPO, A2C) and Generative Adversarial Imitation Learning (GAIL). Fast Fisher vector product TRPO.

a2c deep-reinforcement-learning fisher-vectors generative-adversarial-network policy-gradient ppo proximal-policy-optimization pytorch pytorch-rl reinforcement-learning trpo

Last synced: 30 Sep 2024

https://github.com/uvipen/super-mario-bros-ppo-pytorch

Proximal Policy Optimization (PPO) algorithm for Super Mario Bros

ai deep-learning gym mario openai openai-gym ppo ppo2 proximal-policy-optimization python python3 pytorch reinforcement-learning super-mario-bros

Last synced: 30 Sep 2024

https://github.com/Khrylx/PyTorch-RL

PyTorch implementation of Deep Reinforcement Learning: Policy Gradient methods (TRPO, PPO, A2C) and Generative Adversarial Imitation Learning (GAIL). Fast Fisher vector product TRPO.

a2c deep-reinforcement-learning fisher-vectors generative-adversarial-network policy-gradient ppo proximal-policy-optimization pytorch pytorch-rl reinforcement-learning trpo

Last synced: 02 Aug 2024

https://github.com/uvipen/Super-mario-bros-PPO-pytorch

Proximal Policy Optimization (PPO) algorithm for Super Mario Bros

ai deep-learning gym mario openai openai-gym ppo ppo2 proximal-policy-optimization python python3 pytorch reinforcement-learning super-mario-bros

Last synced: 09 Aug 2024

https://github.com/qfettes/deeprl-tutorials

Contains high quality implementations of Deep Reinforcement Learning algorithms written in PyTorch

a2c actor-critic advantage-actor-critic categorical-dqn deep-q-network deep-recurrent-q-network deep-reinforcement-learning deeprl-tutorials double-dqn dueling-dqn gae multi-step-learning noisy-networks ppo prioritized-experience-replay python3 pytorch quantile-regression rainbow reinforcement-learning

Last synced: 30 Sep 2024

https://github.com/qfettes/DeepRL-Tutorials

Contains high quality implementations of Deep Reinforcement Learning algorithms written in PyTorch

a2c actor-critic advantage-actor-critic categorical-dqn deep-q-network deep-recurrent-q-network deep-reinforcement-learning deeprl-tutorials double-dqn dueling-dqn gae multi-step-learning noisy-networks ppo prioritized-experience-replay python3 pytorch quantile-regression rainbow reinforcement-learning

Last synced: 02 Aug 2024

https://github.com/sudharsan13296/Hands-On-Reinforcement-Learning-With-Python

Master Reinforcement and Deep Reinforcement Learning using OpenAI Gym and TensorFlow

asynchronous-advantage-actor-critic deep-deterministic-policy-gradient deep-learning-algorithms deep-q-network deep-recurrent-q-network deep-reinforcement-learning double-dqn drqn dueling-dqn hindsight-experience-replay markov-decision-processes monte-carlo openai-gym policy-gradient policy-gradients ppo q-learning reinforcement-learning sarsa trpo

Last synced: 01 Aug 2024

https://github.com/TianhongDai/reinforcement-learning-algorithms

This repository contains PyTorch implementations of most classic deep reinforcement learning algorithms, including DQN, DDQN, Dueling Network, DDPG, SAC, A2C, PPO, and TRPO. (More algorithms are still in progress.)

a2c actor-critic algorithm atari2600 ddpg deep-learning deep-reinforcement-learning dqn dueling-dqn flappy-bird ppo proximal-policy-optimization pytorch sac soft-actor-critic trpo trust-region-policy-optimization

Last synced: 07 Aug 2024

https://github.com/cpnota/autonomous-learning-library

A PyTorch library for building deep reinforcement learning agents.

a2c advantage-actor-critic ddpg deep-deterministic-policy-gradient deep-q-learning deep-reinforcement-learning dqn dqn-pytorch ppo proximal-policy-optimization reinforcement-learning reinforcement-learning-algorithms sac soft-actor-critic

Last synced: 01 Aug 2024

https://github.com/luchris429/purejaxrl

Really Fast End-to-End Jax RL Implementations

deep-reinforcement-learning jax ppo reinforcement-learning reinforcement-learning-algorithms

Last synced: 31 Jul 2024

https://github.com/lcswillems/rl-starter-files

RL starter files in order to immediately train, visualize and evaluate an agent without writing any line of code

a2c a3c minigrid multi-process ppo preprocessed-observations pytorch reward-shaping

Last synced: 07 Aug 2024

https://github.com/jianzhnie/llamatuner

Easy and efficient fine-tuning of LLMs (supports LLama, LLama2, LLama3, Qwen, Baichuan, GLM, Falcon). Efficient quantized training and deployment of large models.

chatgpt dpo llama llama3 mixtral ppo qlora qwen rlhf

Last synced: 01 Oct 2024

https://github.com/jianzhnie/LLamaTuner

Easy and efficient fine-tuning of LLMs (supports LLama, LLama2, LLama3, Qwen, Baichuan, GLM, Falcon). Efficient quantized training and deployment of large models.

chatgpt dpo llama llama3 mixtral ppo qlora qwen rlhf

Last synced: 02 Aug 2024

https://github.com/dongminlee94/deep_rl

PyTorch implementation of deep reinforcement learning algorithms

a2c ddpg ddqn deep-reinforcement-learning dqn model-free-rl npg ppo pytorch sac sac-aea td3 trpo vpg

Last synced: 02 Aug 2024

https://github.com/iffix/machin

Reinforcement learning library (framework) designed for PyTorch, implements DQN, DDPG, A2C, PPO, SAC, MADDPG, A3C, APEX, IMPALA ...

a3c-pytorch ddpg deep-learning distributed dqn ppo prioritized-experience-replay python pytorch pytorch-lightning pytorch-reinforcement-learning reinforcement-learning sac td3

Last synced: 02 Oct 2024

https://github.com/zuoxingdong/lagom

lagom: A PyTorch infrastructure for rapid prototyping of reinforcement learning algorithms.

artificial-intelligence cem cmaes ddpg deep-deterministic-policy-gradient deep-learning deep-reinforcement-learning evolution-strategies machine-learning mujoco policy-gradient ppo proximal-policy-optimization python pytorch reinforcement-learning research sac soft-actor-critic td3

Last synced: 03 Aug 2024

https://github.com/rlgraph/rlgraph

RLgraph: Modular computation graphs for deep reinforcement learning

deep-learning deep-reinforcement-learning dqn machine-learning neural-networks ppo pytorch reinforcement-learning tensorflow

Last synced: 01 Aug 2024

https://github.com/huawei-noah/xingtian

xingtian is a componentized library for the development and verification of reinforcement learning algorithms

dqn impala muzero ppo qmix reinforcement-learning-algorithms

Last synced: 08 Aug 2024

https://github.com/idreesshaikh/Autonomous-Driving-in-Carla-using-Deep-Reinforcement-Learning

Deep Reinforcement Learning (PPO) in Autonomous Driving (Carla) [from scratch]

autonomous-driving carla-driving-simulator carla-environment carla-simulator ddqn deep-learning deep-learning-algorithms deep-reinforcement-learning openai ppo proximal-policy-optimization pytorch reinforcement-learning self-driving self-driving-car self-driving-car-simulation self-driving-cars

Last synced: 31 Jul 2024

https://github.com/miroblog/tf_deep_rl_trader

Trading Environment (OpenAI Gym) + PPO (TensorForce)

ppo proximal-policy-optimization stock-market tensorflow tensorforce trading

Last synced: 31 Jul 2024

https://github.com/lcswillems/torch-ac

Recurrent and multi-process PyTorch implementation of deep reinforcement Actor-Critic algorithms A2C and PPO

a2c a3c actor-critic advantage-actor-critic deep-reinforcement-learning minigrid multi-process ppo proximal-policy-optimization pytorch recurrent recurrent-neural-networks reinforcement-learning reward-shaping

Last synced: 07 Aug 2024

https://github.com/jianzhnie/open-chatgpt

The open-source implementation of ChatGPT, Alpaca, Vicuna and RLHF Pipeline. Implementing a ChatGPT from scratch.

chatgpt gpt llama llm lora peft ppo rlhf stanford-alpaca

Last synced: 31 Jul 2024

https://github.com/yongzhuo/chatglm-maths

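chatglm-6b fine-tuning/LoRA/PPO/inference; the samples are automatically generated integer and decimal arithmetic problems (addition, subtraction, multiplication, division); runs on GPU or CPU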

chatglm chatgpt fine-tuning maths maths-problem ppo

Last synced: 03 Aug 2024

https://github.com/adik993/ppo-pytorch

Proximal Policy Optimization (PPO) with Intrinsic Curiosity Module (ICM)

cartpole-v1 deep-learning generalized-advantage-estimation icm intrinsic-curiosity-module mountaincar-v0 pendulum-v0 ppo proximal-policy-optimization pytorch reinforcement-learning

Last synced: 07 Aug 2024

https://github.com/lorenmt/minimal-isaac-gym

A Minimal Example of Isaac Gym with DQN and PPO.

dqn isaac-gym ppo pytorch

Last synced: 01 Aug 2024

https://github.com/jcwleo/mario_rl

a2c actor-critic curiosity-driven deep-learning icm ppo pytorch reinforcement-learning supermariobros

Last synced: 01 Aug 2024

https://github.com/CN-UPB/DeepCoMP

Dynamic multi-cell selection for cooperative multipoint (CoMP) using (multi-agent) deep reinforcement learning

cell-selection cellular comp mobile multi-agent-reinforcement-learning ppo python ray reinforcement-learning rllib simulation wireless

Last synced: 04 Aug 2024

https://github.com/guichristmann/thormang3-gogoro-PPO

Steering-based control of a two-wheeled vehicle using RL-PPO and NVIDIA Isaac Gym.

isaac-gym ppo pytorch

Last synced: 01 Aug 2024

https://github.com/bilalkabas/DRL-Nav

Deep Reinforcement Learning based autonomous navigation in realistic simulation environments.

airsim deep-reinforcement-learning gym ppo unreal-engine

Last synced: 29 Jul 2024

https://github.com/sharif1093/digideep

A Deep Reinforcement Learning (DeepRL) package for RL algorithm developers.

ddpg deep-reinforcement-learning dm-control openai-gym ppo pytorch reinforcement-learning sac

Last synced: 07 Aug 2024

https://github.com/ajaysub110/rlin200lines

PyTorch implementations of Reinforcement Learning algorithms in less than 200 lines

deep-reinforcement-learning dqn machine-learning policy-gradient ppo pytorch-implementations reinforcement-learning reinforcement-learning-algorithms soft-actor-critic

Last synced: 02 Oct 2024

https://github.com/rsgoksel/mechopter

PyGame-based quadcopter simulator & Reinforcement Learning Project

a2c dqn ipynb ppo pygame python quadcopter reinforcement-learning

Last synced: 27 Sep 2024

https://github.com/qdsang/rocket_landing_simulation

Rocket landing simulation using RL PPO

landing ppo rl rocket spacex

Last synced: 26 Sep 2024

https://github.com/indutny/haggling_rl

RL learning model for Hola's Haggling Challenge

a2c haggle ppo reinforcement-learning

Last synced: 01 Oct 2024