{"id":14959610,"url":"https://github.com/0xangelo/raylab","last_synced_at":"2025-05-02T12:31:38.942Z","repository":{"id":37185471,"uuid":"208062430","full_name":"0xangelo/raylab","owner":"0xangelo","description":"Reinforcement learning algorithms in RLlib","archived":false,"fork":false,"pushed_at":"2024-05-03T20:24:54.000Z","size":4737,"stargazers_count":59,"open_issues_count":13,"forks_count":10,"subscribers_count":4,"default_branch":"master","last_synced_at":"2025-04-23T00:35:26.911Z","etag":null,"topics":["bokeh","deep-learning","generative-models","machine-learning","model-based-rl","normalizing-flows","pytorch","reinforcement-learning","rllib","streamlit"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/0xangelo.png","metadata":{"files":{"readme":"README.rst","changelog":null,"contributing":"CONTRIBUTING.rst","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":"AUTHORS.rst","dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-09-12T13:50:53.000Z","updated_at":"2025-04-16T00:19:17.000Z","dependencies_parsed_at":"2024-05-03T21:52:29.767Z","dependency_job_id":null,"html_url":"https://github.com/0xangelo/raylab","commit_stats":{"total_commits":2247,"total_committers":8,"mean_commits":280.875,"dds":0.1370716510903427,"last_synced_commit":"2984b3d835d8135435bea0efed604a786c3b635f"},"previous_names":["angelolovatto/raylab"],"tags_count":124,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/0xangelo%2Fraylab","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/0xangelo%2Fraylab/tags","releases_url":"https://repos.ecosyste.ms/
api/v1/hosts/GitHub/repositories/0xangelo%2Fraylab/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/0xangelo%2Fraylab/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/0xangelo","download_url":"https://codeload.github.com/0xangelo/raylab/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":252038181,"owners_count":21684641,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bokeh","deep-learning","generative-models","machine-learning","model-based-rl","normalizing-flows","pytorch","reinforcement-learning","rllib","streamlit"],"created_at":"2024-09-24T13:20:13.089Z","updated_at":"2025-05-02T12:31:37.600Z","avatar_url":"https://github.com/0xangelo.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"======\nraylab\n======\n\n|PyPI| |Tests| |Dependabot| |License| |CodeStyle|\n\n.. |PyPI| image:: https://img.shields.io/pypi/v/raylab?logo=PyPi\u0026logoColor=white\u0026color=blue\n      :alt: PyPI\n\n.. |Tests| image:: https://img.shields.io/github/workflow/status/angelolovatto/raylab/Poetry%20package?label=tests\u0026logo=GitHub\n       :alt: GitHub Workflow Status\n\n.. |Dependabot| image:: https://api.dependabot.com/badges/status?host=github\u0026repo=angelolovatto/raylab\n        :target: https://dependabot.com\n\n.. |License| image:: https://img.shields.io/github/license/angelolovatto/raylab?color=blueviolet\u0026logo=github\n         :alt: GitHub\n\n.. 
|CodeStyle| image:: https://img.shields.io/badge/code%20style-black-000000.svg\n           :target: https://github.com/psf/black\n\n\nReinforcement learning algorithms in `RLlib \u003chttps://github.com/ray-project/ray/tree/master/rllib\u003e`_\nand `PyTorch \u003chttps://pytorch.org\u003e`_.\n\n\nInstallation\n------------\n\n.. code:: bash\n\n          pip install raylab\n\n\nQuickstart\n----------\n\nRaylab provides agents and environments to be used with a normal RLlib/Tune setup.\nYou can pass an agent's name (from the `Algorithms`_ section) to :code:`raylab info list` to list its top-level configurations:\n\n.. code-block:: zsh\n\n    raylab info list SoftAC\n\n.. code-block::\n\n    learning_starts: 0\n        Hold this number of timesteps before first training operation.\n    policy: {}\n        Sub-configurations for the policy class.\n    wandb: {}\n        Configs for integration with Weights \u0026 Biases.\n\n        Accepts arbitrary keyword arguments to pass to `wandb.init`.\n        The defaults for `wandb.init` are:\n        * name: `_name` property of the trainer.\n        * config: full `config` attribute of the trainer\n        * config_exclude_keys: `wandb` and `callbacks` configs\n        * reinit: True\n\n        Don't forget to:\n          * install `wandb` via pip\n          * login to W\u0026B with the appropriate API key for your\n            team/project.\n          * set the `wandb/project` name in the config dict\n\n        Check out the Quickstart for more information:\n        `https://docs.wandb.com/quickstart`\n\nYou can add the :code:`--rllib` flag to get the descriptions for all the options common to RLlib agents\n(or :code:`Trainer`\\s).\n\nLaunching experiments can be done via the command line using :code:`raylab experiment` passing a file path\nwith an agent's configuration through the :code:`--config` flag.\nThe following command uses the cartpole `example \u003cexamples/PG/cartpole_defaults.py\u003e`_ configuration file\nto launch 
an experiment using the vanilla Policy Gradient agent from the RLlib library.\n\n.. code-block:: zsh\n\n    raylab experiment PG --name PG -s training_iteration 10 --config examples/PG/cartpole_defaults.py\n\nYou can also launch an experiment from a Python script normally using Ray and Tune.\nThe following shows how you may use Raylab to perform an experiment comparing different\ntypes of exploration for the NAF agent.\n\n.. code-block:: python\n\n             import ray\n             from ray import tune\n             import raylab\n\n             def main():\n                 raylab.register_all_agents()\n                 raylab.register_all_environments()\n                 ray.init()\n                 tune.run(\n                     \"NAF\",\n                     local_dir=\"data/NAF\",\n                     stop={\"timesteps_total\": 100000},\n                     config={\n                         \"env\": \"CartPoleSwingUp-v0\",\n                         \"exploration_config\": {\n                             \"type\": tune.grid_search([\n                                 \"raylab.utils.exploration.GaussianNoise\",\n                                 \"raylab.utils.exploration.ParameterNoise\"\n                             ])\n                         }\n                     },\n                     num_samples=10,\n                 )\n\n             if __name__ == \"__main__\":\n                 main()\n\n\nOne can then visualize the results using :code:`raylab dashboard`, passing the :code:`local_dir` used in the\nexperiment. The dashboard lets you filter and group results in a quick way.\n\n.. code-block:: zsh\n\n    raylab dashboard data/NAF/\n\n\n.. image:: https://i.imgur.com/bVc6WC5.png\n        :align: center\n\n\nYou can find the best checkpoint according to a metric (:code:`episode_reward_mean` by default)\nusing :code:`raylab find-best`.\n\n.. 
code-block:: zsh\n\n    raylab find-best data/NAF/\n\nFinally, you can pass a checkpoint to :code:`raylab rollout` to see the returns collected by the agent and\nrender it if the environment supports a visual :code:`render()` method. For example, you\ncan use the output of the :code:`find-best` command to see the best agent in action.\n\n\n.. code-block:: zsh\n\n    raylab rollout $(raylab find-best data/NAF/) --agent NAF\n\n\nAlgorithms\n----------\n\n+--------------------------------------------------------+-------------------------+\n| Paper                                                  | Agent Name              |\n+--------------------------------------------------------+-------------------------+\n| `Actor Critic using Kronecker-factored Trust Region`_  | ACKTR                   |\n+--------------------------------------------------------+-------------------------+\n| `Trust Region Policy Optimization`_                    | TRPO                    |\n+--------------------------------------------------------+-------------------------+\n| `Normalized Advantage Function`_                       | NAF                     |\n+--------------------------------------------------------+-------------------------+\n| `Stochastic Value Gradients`_                          | SVG(inf)/SVG(1)/SoftSVG |\n+--------------------------------------------------------+-------------------------+\n| `Soft Actor-Critic`_                                   | SoftAC                  |\n+--------------------------------------------------------+-------------------------+\n| `Streamlined Off-Policy`_ (DDPG)                       | SOP                     |\n+--------------------------------------------------------+-------------------------+\n| `Model-Based Policy Optimization`_                     | MBPO                    |\n+--------------------------------------------------------+-------------------------+\n| `Model-based Action-Gradient-Estimator`_               | MAGE                  
  |\n+--------------------------------------------------------+-------------------------+\n\n\n.. _`Actor Critic using Kronecker-factored Trust Region`: https://arxiv.org/abs/1708.05144\n.. _`Trust Region Policy Optimization`: http://proceedings.mlr.press/v37/schulman15.html\n.. _`Normalized Advantage Function`: http://proceedings.mlr.press/v48/gu16.html\n.. _`Stochastic Value Gradients`: http://papers.nips.cc/paper/5796-learning-continuous-control-policies-by-stochastic-value-gradients\n.. _`Soft Actor-Critic`: http://proceedings.mlr.press/v80/haarnoja18b.html\n.. _`Model-Based Policy Optimization`: http://arxiv.org/abs/1906.08253\n.. _`Streamlined Off-Policy`: https://arxiv.org/abs/1910.02208\n.. _`Model-based Action-Gradient-Estimator`: https://arxiv.org/abs/2004.14309\n\n\nCommand-line interface\n----------------------\n\n.. role:: bash(code)\n   :language: bash\n\nFor a high-level description of the available utilities, run :bash:`raylab --help`\n\n.. code:: bash\n\n    Usage: raylab [OPTIONS] COMMAND [ARGS]...\n\n      RayLab: Reinforcement learning algorithms in RLlib.\n\n    Options:\n      --help  Show this message and exit.\n\n    Commands:\n      dashboard    Launch the experiment dashboard to monitor training progress.\n      episodes     Launch the episode dashboard to monitor state and action...\n      experiment   Launch a Tune experiment from a config file.\n      find-best    Find the best experiment checkpoint as measured by a metric.\n      info         View information about an agent's config parameters.\n      rollout      Wrap `rllib rollout` with customized options.\n      test-module  Launch dashboard to test generative models from a checkpoint.\n\n\nPackages\n--------\n\nThe project is structured as follows\n::\n\n    raylab\n    |-- agents            # Trainer and Policy classes\n    |-- cli               # Command line utilities\n    |-- envs              # Gym environment registry and utilities\n    |-- logger            # Tune loggers\n 
   |-- policy            # Extensions and customizations of RLlib's policy API\n    |   |-- losses        # RL loss functions\n    |   |-- modules       # PyTorch neural network modules for TorchPolicy\n    |-- pytorch           # PyTorch extensions\n    |-- utils             # miscellaneous utilities\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2F0xangelo%2Fraylab","html_url":"https://awesome.ecosyste.ms/projects/github.com%2F0xangelo%2Fraylab","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2F0xangelo%2Fraylab/lists"}