{"id":19053912,"url":"https://github.com/mikelma/componet","last_synced_at":"2025-04-24T03:08:49.989Z","repository":{"id":238472606,"uuid":"796624169","full_name":"mikelma/componet","owner":"mikelma","description":"Source code of the ICML24 paper \"Self-Composing Policies for Scalable Continual Reinforcement Learning\" (selected for oral presentation)","archived":false,"fork":false,"pushed_at":"2024-07-20T09:53:07.000Z","size":76695,"stargazers_count":20,"open_issues_count":1,"forks_count":3,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-04-24T03:08:41.572Z","etag":null,"topics":["continual-learning","continual-reinforcement-learning","deep-learning","icml","icml-2024","pytorch","reinforcement-learning"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mikelma.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-05-06T10:09:08.000Z","updated_at":"2025-03-11T13:02:17.000Z","dependencies_parsed_at":"2024-07-20T11:04:19.188Z","dependency_job_id":null,"html_url":"https://github.com/mikelma/componet","commit_stats":null,"previous_names":["mikelma/componet"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mikelma%2Fcomponet","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mikelma%2Fcomponet/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mikelma%2Fcomponet/releases","manifests_url":"https://repos.ecosyste.ms/api/v1
/hosts/GitHub/repositories/mikelma%2Fcomponet/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mikelma","download_url":"https://codeload.github.com/mikelma/componet/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":250552076,"owners_count":21449165,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["continual-learning","continual-reinforcement-learning","deep-learning","icml","icml-2024","pytorch","reinforcement-learning"],"created_at":"2024-11-08T23:35:42.382Z","updated_at":"2025-04-24T03:08:49.945Z","avatar_url":"https://github.com/mikelma.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Self-Composing Policies for Scalable Continual Reinforcement Learning\n\nThis repository is part of the supplementary material of the paper [*Self-Composing Policies for Scalable Continual Reinforcement Learning*](https://openreview.net/pdf?id=f5gtX2VWSB), published in [ICML 2024](https://icml.cc/virtual/2024/poster/33472) and selected for [oral presentation](https://icml.cc/virtual/2024/oral/35492).\n\n\u003cbr\u003e\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"./componet.png\" alt=\"CompoNet\" width=\"700\" align=\"center\"\u003e\n\u003c/p\u003e\n\u003cbr\u003e\n\nTo cite this project in publications:\n\n```bibtex\n@inproceedings{malagon2024selfcomp,\n  title={Self-Composing Policies for Scalable Continual Reinforcement Learning},\n  author={Malagon, Mikel and Ceberio, Josu and Lozano, Jose A},\n  booktitle={International Conference on Machine Learning 
(ICML)},\n  year={2024}\n}\n```\n\n## Structure of the repo 🌳\n\nThe repository is organized into three main parts: `componet`, that\nholds the implementation of the proposal of the paper;\n`experiments/atari`, where the experiments of the SpaceInvaders and\nFreeway sequences are located; and `experiments/meta-world`, that\ncontains the experiments of the Meta-World sequence.\n\n\u003cdetails\u003e\n\n\u003csummary\u003eClick here to unfold the structure 🌳 of the repo.\u003c/summary\u003e\n\n```bash\n├── componet/ # The implementation of the proposed CompoNet architecture\n│\n├── experiments/\n│   ├── atari/        # Contains all the code related to the SpaceInvaders and Freeway sequences\n│   │   ├── data.tar.xz # Contains the compressed CSV files used for the figures\n│   │   ├── models/   # Implements PPO agents for all of the considered methods\n│   │   ├── process_results.py  # Processes the runs generating the metrics and plots\n│   │   ├── run_experiments.py  # Utility script to call `run_ppo.py` for multiple settings\n│   │   ├── run_ppo.py          # Main script to run the PPO experiments\n│   │   ├── task_utils.py       # Implements several task-related utils\n│   │   ├── test_agent.py       # Main script to evaluate trained agents\n│   │   ├── plot_ablation_input_head.py  # Plots input attention head ablation results\n│   │   ├── plot_ablation_output_head.py # Plots output attention head ablation results\n│   │   ├── plot_arch_val.py      # Plots architecture validation results\n│   │   ├── plot_dino_vs_cnn.py   # Plots results of the comparison between DINO and CNN-based agents\n│   │   ├── transfer_matrix.py    # Computes and plots the transfer matrices of SpaceInvaders and Freeway\n│   │   └── requirements.txt      # Requirements file for these experiments\n│   │\n│   └── meta-world/          # Contains all the experiments in the Meta-World tasks\n│       ├── data.tar.xz      # Contains the compressed CSV files used for the figures\n│       ├── 
benchmarking.py  # Benchmarks CompoNet and ProgNet and plots the results\n│       ├── models/          # Contains the implementations of the SAC agents\n│       ├── process_results.py    # Processes the runs generating the metrics and plots\n│       ├── run_experiments.py    # Utility script for running experiments\n│       ├── run_sac.py            # Main script to run SAC experiments\n│       ├── tasks.py              # Contains the definitions of the tasks\n│       ├── test_agent.py         # Main script used to test trained agents\n│       ├── transferer_matrix.py  # Computes and plots the transfer matrix of Meta-World\n│       └── requirements.txt      # Requirements file for these experiments\n│\n├── utils/    # Contains utilities used across multiple files\n├── LICENSE   # Text file with the license of the repo\n└── README.md\n```\n\u003c/details\u003e\n\nNote that all of the CLI options available in the training and\nvisualization (`plot_*`) scripts can be listed using `--help`.\n\nFinally, all PPO and SAC scripts are based on the excellent\n[CleanRL](https://github.com/vwxyzjn/cleanrl) project, which provides\nhigh-quality implementations of many RL algorithms.\n\n## Requirements 📋\n\nLike the experiments, the requirements are divided into two\nsets, one for each group of\nexperiments: `experiments/atari/requirements.txt` and\n`experiments/meta-world/requirements.txt`.\n\nTo install the requirements:\n\n```setup\npip install -r experiments/atari/requirements.txt\n```\n\nor,\n\n```setup\npip install -r experiments/meta-world/requirements.txt\n```\n\nNote that the `atari` experiments use the `ALE` environments from the\n[gymnasium](https://gymnasium.farama.org/) project, while `meta-world`\nemploys [meta-world](https://github.com/Farama-Foundation/Metaworld).\n\n\n## Reproducing the results 🔄\n\nIf you want to reproduce any of the results that appear in the paper,\njust call the corresponding training script with the 
default CLI\noptions (just change the environment and task options if needed).\n\nEvery CLI option defaults to the value that was used in the\npaper ☺️.\n\n## License 🐃\n\nThis repository is distributed under the terms of the GPLv3\nlicense. See the [LICENSE](./LICENSE) file for more details, or visit\nthe [GPLv3 homepage](https://www.gnu.org/licenses/gpl-3.0.en.html).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmikelma%2Fcomponet","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmikelma%2Fcomponet","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmikelma%2Fcomponet/lists"}