Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/kohlerhector/tree-mbpo
Study Model-Based Policy Optimization by varying the model estimator classes (e.g., Decision Trees vs. MLPs)
decision-tree mbpo mbrl mlp rl sac scikit-learn stable-baselines3
- Host: GitHub
- URL: https://github.com/kohlerhector/tree-mbpo
- Owner: KohlerHECTOR
- Created: 2024-02-03T15:09:41.000Z (9 months ago)
- Default Branch: main
- Last Pushed: 2024-02-07T10:40:46.000Z (9 months ago)
- Last Synced: 2024-02-14T11:28:33.694Z (9 months ago)
- Topics: decision-tree, mbpo, mbrl, mlp, rl, sac, scikit-learn, stable-baselines3
- Language: Python
- Homepage:
- Size: 3.09 MB
- Stars: 2
- Watchers: 2
- Forks: 0
- Open Issues: 3
Metadata Files:
- Readme: readme.md
README
###### For Tree-Based-Exploration see: https://github.com/KohlerHECTOR/TREX-Tree-Reward-EXploration
## Only Continuous actions
Install scikit-learn and SB3:
```pip3 install -r requirements.txt```
![trees-mlp](https://github.com/KohlerHECTOR/MBPO-Scikit-Stable/blob/main/mbpo_schematics_rdme/evals.png?raw=true)
![trees-mlp-times](https://github.com/KohlerHECTOR/MBPO-Scikit-Stable/blob/main/mbpo_schematics_rdme/times.png?raw=true)
![trees](https://github.com/KohlerHECTOR/MBPO-Scikit-Stable/blob/main/mbpo_schematics_rdme/evals-gsteps.png?raw=true)
### Available Models are Decision Trees, best CV Trees, and MLPs
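The following is a minimal sketch of how these three estimator classes could be set up in scikit-learn, assuming the model maps (state, action) to (next state, reward). The toy data, the `max_depth` grid, the use of `GridSearchCV` for the "best CV Trees" variant, and all variable names are illustrative assumptions, not taken from this repository's code.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.neural_network import MLPRegressor
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

# Toy transitions standing in for a replay buffer (made-up data).
states = rng.normal(size=(500, 4))
actions = rng.uniform(-1.0, 1.0, size=(500, 1))
next_states = states + 0.1 * actions   # made-up linear dynamics
rewards = -np.abs(states[:, :1])       # made-up reward

X = np.hstack([states, actions])       # model input: (s, a)
Y = np.hstack([next_states, rewards])  # model target: (s', r)

models = {
    # Plain decision tree regressor (supports multi-output targets).
    "tree": DecisionTreeRegressor(random_state=0),
    # Tree whose depth is picked by cross-validation (assumed meaning of "best CV Trees").
    "cv-tree": GridSearchCV(
        DecisionTreeRegressor(random_state=0),
        param_grid={"max_depth": [2, 4, 8, None]},
        cv=3,
    ),
    # MLP with two hidden layers of 64 units each (2x64).
    "mlp": MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=500, random_state=0),
}

for name, model in models.items():
    model.fit(X, Y)
    pred = model.predict(X[:5])
    print(name, pred.shape)
```

All three estimators share the same `fit`/`predict` interface, which is what makes swapping the model class straightforward.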
### Available Policy Optim Algos are SAC and TD3
Launch MBPO for 100 iterations on InvertedPendulum with Decision Trees as model estimators and SAC as the policy optimizer.
Results are saved in 'Experience_Results/pendul-tree-sac/':
```python3 experience.py InvertedPendulum-v4 tree sac 100 pendul-tree-sac```
Launch MBPO for 100 iterations on InvertedPendulum with a 2x64 MLP as the model estimator and SAC as the policy optimizer.
Results are saved in 'Experience_Results/pendul-mlp-sac/':
```python3 experience.py InvertedPendulum-v4 mlp sac 100 pendul-mlp-sac```
Save plots of comparisons in 'Experience_Results/Comparison-date-time/':
```python3 compare_experiences.py pendul-tree-sac pendul-mlp-sac```
Save Plots of results in 'Experience_Results/pendul-tree-sac/':
```python3 plot_experience.py pendul-tree-sac```
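The commands above can be assembled programmatically when queuing several runs. This sketch only builds and prints the command lines and does not execute them. The positional argument order (environment, model, algorithm, iterations, run name) is inferred from the two example commands, and the helper name `experience_command` is hypothetical.

```python
import shlex


def experience_command(env_id, model, algo, iterations, run_name):
    """Build the argv list for one experience.py run (argument order inferred from the README)."""
    return ["python3", "experience.py", env_id, model, algo, str(iterations), run_name]


# The two runs shown in the README: (model estimator, run name).
runs = [("tree", "pendul-tree-sac"), ("mlp", "pendul-mlp-sac")]

commands = [
    experience_command("InvertedPendulum-v4", model, "sac", 100, name)
    for model, name in runs
]
# Comparison step over the run names used above.
commands.append(["python3", "compare_experiences.py"] + [name for _, name in runs])

for cmd in commands:
    print(shlex.join(cmd))
```

Each list can be passed to `subprocess.run(cmd, check=True)` from the repository root to launch the runs one after another.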
MBPO: https://arxiv.org/abs/1906.08253
![MBPO-structure](https://github.com/KohlerHECTOR/MBPO-Scikit-Stable/blob/main/mbpo_schematics_rdme/mbpo-structure.png?raw=true)
![MBPO-rollout](https://github.com/KohlerHECTOR/MBPO-Scikit-Stable/blob/main/mbpo_schematics_rdme/mbpo-rollout.png?raw=true)
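The rollout step of MBPO can be sketched as follows: start states are sampled from real environment data, and short rollouts are generated with the learned model and the current policy, then stored for policy optimization. This is a simplified illustration under stated assumptions, not this repository's implementation. The function names, the scalar toy state, and the toy model and policy are all hypothetical, and episode termination is ignored.

```python
import random


def model_rollouts(env_buffer, model_step, policy, n_starts, horizon, rng):
    """Generate short synthetic rollouts branched from real states."""
    model_buffer = []
    # Branch from states that were actually visited in the real environment.
    starts = [rng.choice(env_buffer) for _ in range(n_starts)]
    for state in starts:
        for _ in range(horizon):
            action = policy(state)
            # The learned model replaces the real environment here.
            next_state, reward = model_step(state, action)
            model_buffer.append((state, action, reward, next_state))
            state = next_state
    return model_buffer


def toy_policy(state):
    # Hypothetical policy: push the scalar state toward zero.
    return -0.5 * state


def toy_model(state, action):
    # Hypothetical learned model: made-up dynamics and reward.
    next_state = state + 0.1 * action
    return next_state, -abs(next_state)


rng = random.Random(0)
env_buffer = [rng.uniform(-1.0, 1.0) for _ in range(20)]  # stand-in for real states
synthetic = model_rollouts(env_buffer, toy_model, toy_policy, n_starts=8, horizon=3, rng=rng)
print(len(synthetic))  # 8 starts x 3 steps = 24 transitions
```

In the full algorithm, the policy optimizer (SAC or TD3 here) would then be trained on the synthetic transitions, and the model would be refit as new real data arrives.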