{"id":22771793,"url":"https://github.com/stonet2000/rl-boilerplate","last_synced_at":"2025-10-25T00:04:01.429Z","repository":{"id":106468669,"uuid":"469628905","full_name":"StoneT2000/rl-boilerplate","owner":"StoneT2000","description":"a repo that I copy for new rl projects","archived":false,"fork":false,"pushed_at":"2025-03-20T22:52:11.000Z","size":3,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-20T23:31:14.239Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/StoneT2000.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-03-14T07:44:55.000Z","updated_at":"2025-03-20T22:52:15.000Z","dependencies_parsed_at":null,"dependency_job_id":"3ef4db4f-b7cb-45b4-ab35-ef9b2d892c4e","html_url":"https://github.com/StoneT2000/rl-boilerplate","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/StoneT2000%2Frl-boilerplate","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/StoneT2000%2Frl-boilerplate/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/StoneT2000%2Frl-boilerplate/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/StoneT2000%2Frl-boilerplate/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/StoneT2000","download_url":"https://codeload.g
ithub.com/StoneT2000/rl-boilerplate/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246314149,"owners_count":20757463,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-12-11T16:17:21.592Z","updated_at":"2025-10-25T00:04:01.423Z","avatar_url":"https://github.com/StoneT2000.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# rl-boilerplate\n\nA repo that I copy for any project that uses some RL, with some focus on robotics/continuous control.\n\nIt is fairly minimal (perhaps a mid-point between CleanRL and StableBaselines). It is also nearly completely typed, runs quite fast, and is torch based.\n\nWhat's included:\n- [Tyro](https://github.com/brentyi/tyro) based CLI control with dataclass based python configuration (no yamls!)\n- PPO and SAC algorithms (with torch.compile/cudagraphs support)\n- Basic models (CNNs, MLPs etc.) configurable with python dataclasses\n- Loggers for tensorboard and wandb\n- A general purpose `make_env_from_config` function to replace `gym.make` that handles all dependency and wrapper madness for a bunch of continuous control environments.\n\nGeneral practices:\n- All code is typed when possible\n- All data is batched whenever possible, even if it's just some sample input to figure out shapes. Batched is the default\n- Dictionaries of torch tensors are moved to tensordict when possible\n\nWhile some repos like to make things as modular as possible, this repo does not do that. 
The only objects that are intended to be imported from the library in `rl` are environment configurations and a function to create environments from configs, neural network models and a function to build models from configs, and replay buffer designs. These are things that are generally standardizable or rarely changed in RL experiments (e.g. neural net architectures).\n\nAlgorithm-specific code is in the `scripts/\u003calgo\u003e` folder and is organized on a per-algorithm basis. Typically each algorithm has at least a `scripts/\u003calgo\u003e/config.py` file with some default configs and configs for the algorithm itself. There is also a `scripts/\u003calgo\u003e/README.md` file which contains:\n- A high-level overview of the algorithm\n- Problems with the algorithm\n- Tricks/modifications used in the code base (for better performance, faster training, etc.)\n- Citation bibtex\n\n## To Use\n\nCopy the contents of this repo somewhere into your project.\n\nMerge your setup.py file / update your dependencies to include the dependencies in this repo's setup.py file.\n\u003c!-- \nReplace / update the following files\n\n`environment.yml` - change the name and add/remove pkgs\n\n`pkgname` - rename folder to the actual project name\n\nThen `mamba create env` or `conda create env` --\u003e\n\nOr, to test this repo directly, clone it and run\n\n```bash\nmamba create -n \"rl\" \"python==3.11\"\nmamba activate rl\npip install -e . 
torch # pick your torch version\n```\n\nExample train script\n\n```bash\npython scripts/ppo/train.py --help\npython scripts/ppo/train.py ms3-state --env.env-id PickCube-v1 --seed 1 --logger.exp-name \"ppo-PickCube-v1-state-1\"\n\npython scripts/sac/train.py --help\npython scripts/sac/train.py ms3-state --env.env-id PickCube-v1 --seed 1 --logger.exp-name \"sac-PickCube-v1-state-1\"\n\npython scripts/sac/train.py ms3-state --env.env-id PegInsertionSide-v1 --seed 1 --logger.exp-name \"sac-PegInsertionSide-v1-state-1\" --num_eval_steps 100 \\\n  --total_timesteps 30_000_000 \\\n  --buffer_size 1_000_000 \\\n  --learning_starts 32768 \\\n  --batch_size 4096 \\\n  --grad_steps_per_iteration 20 \\\n  --sac.gamma 0.99 \\\n  --sac.tau 5e-3 \\\n  --sac.policy_frequency 5 \\\n  --env.ignore_terminations\n\npython scripts/sac/train.py ms3-rgb --env.env-id PickCube-v1 --seed 1 \\\n  --logger.exp-name \"sac-PickCube-v1-rgb-1\"\n```\n\nWith tyro CLI configuration, you first specify a default config (train.py and config.py define ms3-state and ms3-rgb for now) and then override everything else as needed.\n\nIf you want to make your own changes or add an algorithm, I recommend reading how ppo/train.py and ppo/config.py are written for a fairly clean example of fully typed configs and easy experimentation:\n- Define a TrainConfig dataclass that contains all other dataclass configs (e.g. PPO hyperparameters, env configs, network configs, logger configs etc.)\n- Save a pickle file of the python config (which can include non-JSON-serializable objects such as gym wrapper classes)\n- Save a JSON readable version of the config\n- Save evaluation videos of the agent\n- Log things to wandb\n\n## Organization\n\n`rl/` the directory for all commonly re-used code. Code that is standardizable (e.g. replay buffers) because there is an obvious goal (e.g. 
max memory efficiency / read/write speeds for replay buffers) and many algos use it, or code that is often re-used and is rarely ever heavily experimented with outside some common choices (e.g. neural net models), are candidates for being placed here.\n\n`rl/logger` - code for logging utilities\n\n`rl/models` - contains all neural network models\n\n`rl/envs` - folder for any custom environments and utilities to create environments from standardized environment configs\n\n`scripts/\u003calgo\u003e` - all files related to running an algorithm\n\n`scripts/\u003calgo\u003e/config.py` - all relevant configs as python files\n\n## Citation\n\nIf you use this codebase, there is no need to cite it (though you are welcome to cite the GitHub repo); just make sure to cite the algorithm(s) you are using.","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fstonet2000%2Frl-boilerplate","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fstonet2000%2Frl-boilerplate","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fstonet2000%2Frl-boilerplate/lists"}