https://github.com/mikelma/componet
Source code of the ICML24 paper "Self-Composing Policies for Scalable Continual Reinforcement Learning" (selected for oral presentation)
continual-learning continual-reinforcement-learning deep-learning icml icml-2024 pytorch reinforcement-learning
- Host: GitHub
- URL: https://github.com/mikelma/componet
- Owner: mikelma
- License: gpl-3.0
- Created: 2024-05-06T10:09:08.000Z (6 months ago)
- Default Branch: main
- Last Pushed: 2024-07-20T09:53:07.000Z (4 months ago)
- Last Synced: 2024-07-20T10:58:59.010Z (4 months ago)
- Topics: continual-learning, continual-reinforcement-learning, deep-learning, icml, icml-2024, pytorch, reinforcement-learning
- Language: Python
- Homepage:
- Size: 73.1 MB
- Stars: 7
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Self-Composing Policies for Scalable Continual Reinforcement Learning
This repository is part of the supplementary material of the paper [*Self-Composing Policies for Scalable Continual Reinforcement Learning*](https://openreview.net/pdf?id=f5gtX2VWSB), published in [ICML 2024](https://icml.cc/virtual/2024/poster/33472) and selected for [oral presentation](https://icml.cc/virtual/2024/oral/35492).
To cite this project in publications:
```bibtex
@inproceedings{malagon2024selfcomp,
title={Self-Composing Policies for Scalable Continual Reinforcement Learning},
author={Malagon, Mikel and Ceberio, Josu and Lozano, Jose A},
booktitle={International Conference on Machine Learning (ICML)},
year={2024}
}
```

## Structure of the repo
The repository is organized into three main parts: `componet`, which
holds the implementation of the method proposed in the paper;
`experiments/atari`, where the experiments of the SpaceInvaders and
Freeway sequences are located; and `experiments/meta-world`, which
contains the experiments of the Meta-World sequence.

The full structure of the repo is shown below.
```bash
├── componet/                             # The implementation of the proposed CompoNet architecture
│
├── experiments/
│   ├── atari/                            # Contains all the code related to the SpaceInvaders and Freeway sequences
│   │   ├── data.tar.xz                   # Contains the compressed CSV files used for the figures
│   │   ├── models/                       # Implements PPO agents for all of the considered methods
│   │   ├── process_results.py            # Processes the runs generating the metrics and plots
│   │   ├── run_experiments.py            # Utility script to call `run_ppo.py` for multiple settings
│   │   ├── run_ppo.py                    # Main script to run the PPO experiments
│   │   ├── task_utils.py                 # Implements several task-related utils
│   │   ├── test_agent.py                 # Main script to evaluate trained agents
│   │   ├── plot_ablation_input_head.py   # Plots input attention head ablation results
│   │   ├── plot_ablation_output_head.py  # Plots output attention head ablation results
│   │   ├── plot_arch_val.py              # Plots architecture validation results
│   │   ├── plot_dino_vs_cnn.py           # Plots results of the comparison between DINO and CNN-based agents
│   │   ├── transfer_matrix.py            # Computes and plots the transfer matrices of SpaceInvaders and Freeway
│   │   └── requirements.txt              # Requirements file for these experiments
│   │
│   └── meta-world/                       # Contains all the experiments in the Meta-World tasks
│       ├── data.tar.xz                   # Contains the compressed CSV files used for the figures
│       ├── benchmarking.py               # Benchmarks CompoNet and ProgNet and plots the results
│       ├── models/                       # Contains the implementations of the SAC agents
│       ├── process_results.py            # Processes the runs generating the metrics and plots
│       ├── run_experiments.py            # Utility script for running experiments
│       ├── run_sac.py                    # Main script to run SAC experiments
│       ├── tasks.py                      # Contains the definitions of the tasks
│       ├── test_agent.py                 # Main script used to test trained agents
│       ├── transferer_matrix.py          # Computes and plots the transfer matrix of Meta-World
│       └── requirements.txt              # Requirements file for these experiments
│
├── utils/                                # Contains utilities used across multiple files
├── LICENSE                               # Text file with the license of the repo
└── README.md
```

Note that all of the CLI options available in the training and
visualization (`plot_*`) scripts can be seen using `--help`.

Finally, all PPO and SAC scripts are based on the excellent
[CleanRL](https://github.com/vwxyzjn/cleanrl) project, which provides
high-quality implementations of many RL algorithms.

## Requirements
Like the experiments, the requirements are divided into two
sets, one for each group of experiments:
`experiments/atari/requirements.txt` and
`experiments/meta-world/requirements.txt`.

To install the requirements:
```setup
pip install -r experiments/atari/requirements.txt
```

or,

```setup
pip install -r experiments/meta-world/requirements.txt
```

Note that the `atari` experiments use the `ALE` environments from the
[gymnasium](https://gymnasium.farama.org/) project, while `meta-world`
employs [meta-world](https://github.com/Farama-Foundation/Metaworld).

## Reproducing the results

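Reproducing a result comes down to running one of the training scripts listed in the repository structure (`run_ppo.py` for the Atari sequences, `run_sac.py` for Meta-World). The sketch below only assembles such a command from Python. It is a minimal illustration, not part of the repository: the helper `build_command` and the dictionary `TRAINING_SCRIPTS` are hypothetical names, and the only flag used is `--help`, since the environment and task options should be read from the scripts themselves.

```python
import sys
from pathlib import Path

# Training scripts named in the repository structure. The dictionary keys
# are hypothetical labels chosen for this sketch.
TRAINING_SCRIPTS = {
    "atari": Path("experiments/atari/run_ppo.py"),
    "meta-world": Path("experiments/meta-world/run_sac.py"),
}


def build_command(group, extra_args=()):
    """Return the argv list that runs the training script of `group`.

    With no extra arguments the script falls back to its default CLI
    options. This helper is an assumption of this sketch, not repo code.
    """
    script = TRAINING_SCRIPTS[group]
    return [sys.executable, script.as_posix(), *extra_args]


if __name__ == "__main__":
    # Print the commands instead of running them, so the sketch has no
    # side effects; pass a list to subprocess.run(...) to execute it.
    for group in TRAINING_SCRIPTS:
        print(" ".join(build_command(group, ["--help"])))
```

Passing one of these lists to `subprocess.run(..., check=True)` from the repository root would execute it; with `--help` the script prints its available options instead of training.
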
To reproduce any of the results that appear in the paper, call the
corresponding training script with the default CLI options, changing
only the environment and task options if needed.

All of the CLI options default to the values used in the paper.

## License
This repository is distributed under the terms of the GPLv3
license. See the [LICENSE](./LICENSE) file for more details, or visit
the [GPLv3 homepage](https://www.gnu.org/licenses/gpl-3.0.en.html).