Projects in Awesome Lists tagged with policy-optimization
A curated list of projects in awesome lists tagged with policy-optimization.
https://github.com/chauncygu/multi-agent-constrained-policy-optimisation
Multi-Agent Constrained Policy Optimisation (MACPO; MAPPO-L).
multi-agent-reinforcement-learning policy-optimization safe-reinforcement-learning
Last synced: 04 Mar 2026
https://github.com/lucidrains/dmpo
Implementation and explorations into MPO / DMPO
artificial-intelligence deep-learning policy-optimization reinforcement-learning
Last synced: 06 Jun 2026
https://github.com/shaheennabi/reinforcement-or-deep-reinforcement-learning-practices-and-mini-projects
A hands-on guide to implementing reinforcement learning (RL) algorithms, from Markov Decision Processes (MDPs) to advanced methods like PPO and DDPG. Build agents, learn the math behind policies, and experiment with real-world applications.
actor-critic-algorithm agent markov-decision-processes model-based-rl model-free-rl monte-carlo policy-gradient policy-optimization proximal-policy-optimization reinforcement-learning research temporal-differencing-learning
Last synced: 11 Oct 2025
https://github.com/shaheennabi/rlvr_grpo-experiment-with-math500
A small experiment repository comparing a base reasoning model against RLVR-GRPO checkpoints on the Math500 dataset. It includes evaluation results, short-form observations, and a local temp_clone of the full open-posttraining-system codebase for reference.
evaluating-models grpo-checkpoint math500 open-posttraining-system policy-optimization post-training reasoning-models reinforcement-learning rlvr-grpo sparse-rewards
Last synced: 18 Jun 2026
https://github.com/sahel13/particle-pomdp
Code accompanying the NeurIPS 2025 paper "Sequential Monte Carlo for Policy Optimization in Continuous POMDPs".
policy-optimization pomdps reinforcement-learning sequential-monte-carlo
Last synced: 15 May 2026
https://github.com/mehdishahbazi/reinforce-cart-pole-gymnasium
This repo implements the REINFORCE algorithm for solving the CartPole-v1 environment of the Gymnasium library using Python 3.8 and PyTorch 2.0.1.
cart cart-pole cart-pole-balancing cart-pole-v1 deep-learning deep-reinforcement-learning drl drl-pytorch gym gymnasium pendulum policy policy-based policy-gradient policy-optimization python pytorch reinforce reinforce-algorithm reinforcement-learning
Last synced: 03 May 2026