https://github.com/stratismarkou/continuous-psrl
- Host: GitHub
- URL: https://github.com/stratismarkou/continuous-psrl
- Owner: stratisMarkou
- License: mit
- Created: 2021-06-08T15:19:45.000Z (almost 5 years ago)
- Default Branch: main
- Last Pushed: 2021-07-22T18:54:43.000Z (almost 5 years ago)
- Last Synced: 2025-02-01T14:11:09.714Z (over 1 year ago)
- Language: Python
- Size: 568 KB
- Stars: 0
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Continuous PSRL using GPs
## Infrastructure notes
Things we would like to add to the codebase:
* Add Pendulum and Cartpole environments
* Add a working version of policy gradients using the exact models
* Unit tests for initial distribution
* Add LBFGS optimisation
* Add rng to various classes: models, agents, policy
* Add oracle agent for low-dimensional environments
* Agent snapshot saving and loading
* Consider adding stateful optimiser
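
As a rough illustration of the rng and snapshot items above, here is a minimal sketch, not code from this repository. The `RandomAgent` class, its `act`, `save` and `load` methods, and the use of `pickle` for snapshots are all hypothetical choices made for the example. It assumes NumPy's `Generator` API is an acceptable source of randomness.

```python
import pickle

import numpy as np


class RandomAgent:
    """Hypothetical agent that owns its own random number generator."""

    def __init__(self, action_dim, rng=None):
        # default_rng accepts None, an integer seed, or an existing Generator
        self.rng = np.random.default_rng(rng)
        self.action_dim = action_dim

    def act(self, state):
        # Uniform random action in [-1, 1]^action_dim; the state is ignored
        return self.rng.uniform(-1.0, 1.0, size=self.action_dim)

    def save(self, path):
        # Snapshot the whole agent, including the current rng state
        with open(path, "wb") as f:
            pickle.dump(self, f)

    @staticmethod
    def load(path):
        # Restore an agent previously written by save
        with open(path, "rb") as f:
            return pickle.load(f)
```

Passing the generator in through the constructor keeps runs reproducible from a single seed, and because the generator is stored on the agent, a restored snapshot continues the same random stream.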
## Fixes to get the agent working
Modifications, sanity checks and fixes to get the basic agent working:
* Ensure models and policy train properly:
  - Train GP models via LBFGS
  - Give lots of data to the models, try to optimise policy
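
One way the "train GP models via LBFGS" check could look is sketched below. It uses NumPy and SciPy rather than this repository's own model classes. The kernel, the toy data, the log-parameterisation and the bounds are all assumptions made for illustration. It fits the hyperparameters of an exact GP by minimising the negative log marginal likelihood with SciPy's `L-BFGS-B` method.

```python
import numpy as np
from scipy.optimize import minimize


def rbf_kernel(x1, x2, log_lengthscale, log_variance):
    # Squared-exponential kernel on inputs of shape (N, D)
    sq_dist = np.sum((x1[:, None, :] - x2[None, :, :]) ** 2, axis=-1)
    scaled = sq_dist / np.exp(2.0 * log_lengthscale)
    return np.exp(log_variance) * np.exp(-0.5 * scaled)


def negative_log_marginal_likelihood(params, x, y):
    # params holds log lengthscale, log signal variance, log noise variance
    log_lengthscale, log_variance, log_noise = params
    n = x.shape[0]
    k = rbf_kernel(x, x, log_lengthscale, log_variance)
    k = k + (np.exp(log_noise) + 1e-6) * np.eye(n)  # noise plus jitter
    chol = np.linalg.cholesky(k)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y))
    # 0.5 * y^T K^-1 y + 0.5 * log|K| + 0.5 * n * log(2 * pi)
    return (
        0.5 * y @ alpha
        + np.sum(np.log(np.diag(chol)))
        + 0.5 * n * np.log(2.0 * np.pi)
    )


# Toy 1D regression data (assumed for the example)
rng = np.random.default_rng(0)
x = rng.uniform(-3.0, 3.0, size=(50, 1))
y = np.sin(x[:, 0]) + 0.1 * rng.standard_normal(50)

init = np.zeros(3)
result = minimize(
    negative_log_marginal_likelihood,
    init,
    args=(x, y),
    method="L-BFGS-B",
    bounds=[(-5.0, 5.0)] * 3,  # assumed bounds, keeps the Cholesky stable
)
```

Gradients here come from SciPy's finite differences. An autodiff framework would supply exact gradients through the `jac` argument. Comparing `result.fun` against the objective at `init` is a quick sanity check that the optimiser is improving the fit.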
## General ideas
* Updating the models using continual learning
* Multi-output GPs, correlated output models