{"id":13689250,"url":"https://github.com/google-deepmind/bsuite","last_synced_at":"2025-02-22T20:31:00.421Z","repository":{"id":40894244,"uuid":"200259166","full_name":"google-deepmind/bsuite","owner":"google-deepmind","description":"bsuite is a collection of carefully-designed experiments that investigate core capabilities of a reinforcement learning (RL) agent","archived":true,"fork":false,"pushed_at":"2024-04-13T07:22:48.000Z","size":2059,"stargazers_count":1518,"open_issues_count":20,"forks_count":187,"subscribers_count":59,"default_branch":"main","last_synced_at":"2025-02-14T08:47:02.110Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/google-deepmind.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-08-02T15:36:05.000Z","updated_at":"2025-02-13T18:56:47.000Z","dependencies_parsed_at":"2023-09-07T20:34:45.364Z","dependency_job_id":"9656d465-5586-4840-b171-0807f8e35cf3","html_url":"https://github.com/google-deepmind/bsuite","commit_stats":{"total_commits":167,"total_committers":14,"mean_commits":"11.928571428571429","dds":0.437125748502994,"last_synced_commit":"6d8f64997ca256473c3d10be021431facc5a14d7"},"previous_names":["google-deepmind/bsuite","deepmind/bsuite"],"tags_count":8,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/google-deepmind%2Fbsuite","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/google-deepmind%2Fbsuite/tags",
"releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/google-deepmind%2Fbsuite/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/google-deepmind%2Fbsuite/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/google-deepmind","download_url":"https://codeload.github.com/google-deepmind/bsuite/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":240235415,"owners_count":19769558,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-02T15:01:40.049Z","updated_at":"2025-02-22T20:31:00.080Z","avatar_url":"https://github.com/google-deepmind.png","language":"Python","readme":"# Behaviour Suite for Reinforcement Learning (`bsuite`)\n\n![PyPI Python version](https://img.shields.io/pypi/pyversions/bsuite)\n![PyPI version](https://badge.fury.io/py/bsuite.svg)\n![pytest](https://github.com/deepmind/bsuite/workflows/pytest/badge.svg)\n\n![radar plot](reports/standalone/images/radar_plot.png)\n\n## Introduction\n\n`bsuite` is a collection of carefully-designed experiments that investigate core\ncapabilities of a reinforcement learning (RL) agent with two main objectives.\n\n1.  To collect clear, informative and scalable problems that capture key issues\n    in the design of efficient and general learning algorithms.\n2.  
To study agent behavior through performance on these shared\n    benchmarks.\n\nThis library automates evaluation and analysis of any agent on these benchmarks.\nIt serves to facilitate reproducible, and accessible, research on the core\nissues in RL, and ultimately the design of superior learning algorithms.\n\nGoing forward, we hope to incorporate more excellent experiments from the\nresearch community, and commit to a periodic review of the experiments from a\ncommittee of prominent researchers.\n\nFor a more comprehensive overview, see the accompanying [paper].\n\n## Technical overview\n\n`bsuite` is a collection of _experiments_, defined in the [`experiments`]\nsubdirectory. Each subdirectory corresponds to one experiment and contains:\n\n-   A file defining an RL environment, which may be configurable to provide\n    different levels of difficulty or different random seeds (for example).\n-   A sequence of keyword arguments for this environment, defined in the\n    `SETTINGS` variable found in the experiment's `sweep.py` file.\n-   A file `analysis.py` defining plots used in the provided Jupyter notebook.\n\n`bsuite` works by logging results from \"within\" each environment, when loading\nan environment via a\n[`load_and_record*` function](#loading-an-environment-with-logging-included).\nThis means any experiment will automatically output data in the correct format\nfor analysis using the notebook, without any constraints on the structure of\nagents or algorithms.\n\nWe collate all of the results and analysis in a pre-made Jupyter notebook [bit.ly/bsuite-colab](https://bit.ly/bsuite-colab).\n\n## Getting started\n\nIf you are new to `bsuite` you can get started in our\n[colab tutorial](https://colab.research.google.com/drive/1rU20zJ281sZuMD1DHbsODFr1DbASL0RH).\nThis Jupyter notebook is hosted with a free cloud server, so you can start\ncoding right away without installing anything on your machine. 
After this, you\ncan follow the instructions below to get `bsuite` running on your local machine.\n\n### Installation\n\nWe have tested `bsuite` on Python 3.6 \u0026 3.7. To install the dependencies:\n\n1.  **Optional**: We recommend using a\n    [Python virtual environment](https://docs.python.org/3/tutorial/venv.html)\n    to manage your dependencies, so as not to clobber your system installation:\n\n    ```bash\n    python3 -m venv bsuite\n    source bsuite/bin/activate\n    pip install --upgrade pip setuptools\n    ```\n\n1.  Install `bsuite` directly from [PyPI](https://pypi.org/project/bsuite):\n\n    ```bash\n    pip install bsuite\n    ```\n\n1.  **Optional**: To also install dependencies for the [`baselines`] examples\n    (excluding OpenAI and Dopamine examples), run:\n\n    ```bash\n    pip install bsuite[baselines]\n    ```\n\n## Environments\n\nComplete descriptions of each environment and their corresponding experiments\nare found in the [`analysis/results.ipynb`] Jupyter notebook.\n\nThese environments all have small observation sizes, allowing for reasonable performance with a small network on a CPU.\n\n### Loading an environment\n\nEnvironments are specified by a `bsuite_id` string, for example `\"deep_sea/7\"`.\nThis string denotes the experiment and the (index of the) environment settings\nto use, as described in the [technical overview section](#technical-overview).\n\nFor a full description of each environment and its corresponding experiment settings, see the [paper].\n\n```python\nimport bsuite\n\nenv = bsuite.load_from_id('catch/0')\n```\n\nThe sequence of `bsuite_id`s required to run all experiments can be accessed\nprogrammatically via:\n\n```python\nfrom bsuite import sweep\n\nsweep.SWEEP\n```\n\nThis module also contains `bsuite_id`s for each experiment individually via\nuppercase constants corresponding to the experiment name, for example:\n\n```python\nsweep.DEEP_SEA\nsweep.DISCOUNTING_CHAIN\n```\n\nIn addition, sequences of 
`bsuite_id`s with the same tag can be loaded via:\n\n```python\nfrom bsuite import sweep\n\nsweep.TAGS\n```\n\nThe `TAGS` variable groups `bsuite` environments together by their underlying\ntag, so all the `basic` tasks or `scale` tasks can be loaded with:\n\n```python\nsweep.TAGS['basic']\nsweep.TAGS['scale']\n```\n\n### Loading an environment with logging included\n\nWe include two implementations of automatic logging, available via:\n\n*   [`bsuite.load_and_record_to_csv`]. This outputs one CSV file per\n    `bsuite_id`, so is suitable for running a set of bsuite experiments split\n    over multiple machines. The implementation is in [`logging/csv_logging.py`]\n*   [`bsuite.load_and_record_to_sqlite`]. This outputs a single file, and is\n    best suited when running a set of bsuite experiments via multiple processes\n    on a single workstation. The implementation is in\n    [`logging/sqlite_logging.py`].\n\nWe also include a terminal logger in [`logging/terminal_logging.py`], exposed\nvia `bsuite.load_and_record_to_terminal`.\n\nIt is easy to write your own logging mechanism, if you need to save results to a\ndifferent storage system. 
See the CSV implementation for the simplest reference.\n\n### Interacting with an environment\n\nOur environments implement the Python interface defined in\n[`dm_env`](https://github.com/deepmind/dm_env).\n\nMore specifically, all our environments accept a discrete, zero-based integer\naction (or equivalently, a scalar numpy array with shape `()`).\n\nTo determine the number of actions for a specific environment, use\n\n```python\nnum_actions = env.action_spec().num_values\n```\n\nEach environment returns observations in the form of a numpy array.\n\nWe also expose a `bsuite_num_episodes` property for each environment in bsuite.\nThis allows users to run exactly the number of episodes required for bsuite's\nanalysis, which may vary between environments used in different experiments.\n\nExample run loop for a hypothetical agent with a `step()` method.\n\n```python\nfor _ in range(env.bsuite_num_episodes):\n  timestep = env.reset()\n  while not timestep.last():\n    action = agent.step(timestep)\n    timestep = env.step(action)\n  agent.step(timestep)\n```\n\n### Using `bsuite` in 'OpenAI Gym' format\n\nTo use `bsuite` with a codebase that uses the\n[OpenAI Gym](https://github.com/openai/gym) interface, use the `GymFromDMEnv`\nclass in [`utils/gym_wrapper.py`]:\n\n```python\nimport bsuite\nfrom bsuite.utils import gym_wrapper\n\nenv = bsuite.load_and_record_to_csv('catch/0', results_dir='/path/to/results')\ngym_env = gym_wrapper.GymFromDMEnv(env)\n```\n\nNote that `bsuite` does not include Gym in its default dependencies, so you may\nneed to pip install it separately.\n\n## Baseline agents\n\nWe include implementations of several common agents in the [`baselines/`]\nsubdirectory, along with a minimal run-loop.\n\nSee the [installation](#installation) section for how to include the required\ndependencies at install time. 
These\ndependencies are not installed by default, since `bsuite` does not require users\nto use any specific machine learning library.\n\n## Running the entire suite of experiments\n\nEach of the agents in the `baselines` folder contains a `run` script that\nserves as an example. It can run against a single environment or, by passing\nthe `--bsuite_id=SWEEP` flag, against the entire suite of experiments; this will\nstart a pool of processes with which to run as many experiments in parallel as\nthe host machine allows. On a 12-core machine, this will complete overnight for\nmost agents. Alternatively, it is possible to run on Google Cloud Platform\nusing `run_on_gcp.sh`, steps of which are outlined below.\n\n### Running experiments on Google Cloud Platform\n\n[`run_on_gcp.sh`](run_on_gcp.sh) does the following in order:\n\n1.  Creates an instance with specified specs (by default 64-core CPU optimized).\n1.  `git clone`s `bsuite` and installs it together with other dependencies.\n1.  Runs the specified agent (currently limited to `/baselines`) on a specified\n    environment.\n1.  Copies the resulting SQLite file to `/tmp/bsuite.db` from the remote\n    instance to your local machine.\n1.  Shuts down the created instance.\n\nIn order to run the script, you first need to create a billing account. Then\nfollow the instructions\n[here](https://cloud.google.com/sdk/docs/quickstart-debian-ubuntu) to set up and\ninitialize Cloud SDK. After completing `gcloud init`, you are ready to run\n`bsuite` on Google Cloud.\n\nTo do this, make [`run_on_gcp.sh`](run_on_gcp.sh) executable and run it:\n\n```bash\nchmod +x run_on_gcp.sh\n./run_on_gcp.sh\n```\n\nAfter the instance is created, the instance name will be printed. Then you can\nssh into the instance by selecting `Compute Engine -\u003e Instances` and clicking\n`SSH`. Note that this is not necessary, as the result will be copied to your\nlocal machine once it is ready. 
However, `ssh`ing might be convenient if you\nwant to make local changes to agents and environments. In this case, after\n`ssh`ing, run\n\n```bash\nsource ~/bsuite_env/bin/activate\n```\n\nto activate the virtual environment. Then you can run agents via\n\n```bash\npython ~/bsuite/bsuite/baselines/dqn/run.py --bsuite_id=SWEEP\n```\n\nfor instance.\n\n### Analysis\n\n`bsuite` comes with a ready-made analysis Jupyter notebook included in\n[`analysis/results.ipynb`]. This notebook loads and processes logged data, and\nproduces the scores and plots for each experiment. We recommend using this\nnotebook in conjunction with [Colaboratory](https://colab.research.google.com).\n\nWe provide an example of such a `bsuite` report\n[here](https://colab.research.google.com/drive/1RYWJaMEHVeN8yI83QtL35GOSFQBRgLaX).\n\n### `bsuite` Report\n\nYou can use `bsuite` to generate an automated 1-page appendix that summarizes\nthe core capabilities of your RL algorithm. This appendix is compatible with\nmost major ML conference formats. 
For example output, run:\n\n```bash\npdflatex bsuite/reports/neurips_2019/neurips_2019.tex\n```\n\nMore examples of bsuite reports can be found in the `reports/` subdirectory.\n\n## Citing\n\nIf you use `bsuite` in your work, please cite the accompanying [paper]:\n\n```bibtex\n@inproceedings{osband2020bsuite,\n    title={Behaviour Suite for Reinforcement Learning},\n    author={Osband, Ian and\n            Doron, Yotam and\n            Hessel, Matteo and\n            Aslanides, John and\n            Sezener, Eren and\n            Saraiva, Andre and\n            McKinney, Katrina and\n            Lattimore, Tor and\n            {Sz}epesv{\\'a}ri, Csaba and\n            Singh, Satinder and\n            Van Roy, Benjamin and\n            Sutton, Richard and\n            Silver, David and\n            van Hasselt, Hado},\n    booktitle={International Conference on Learning Representations},\n    year={2020},\n    url={https://openreview.net/forum?id=rygf-kSYwH}\n}\n```\n\n[`analysis/results.ipynb`]: bsuite/analysis/results.ipynb\n[`baselines`]:  bsuite/baselines/\n[`bsuite.load_and_record_to_csv`]: bsuite/bsuite.py\n[`bsuite.load_and_record_to_sqlite`]: bsuite/bsuite.py\n[`experiments`]:  bsuite/experiments/\n[`logging/csv_logging.py`]: bsuite/logging/csv_logging.py\n[`logging/sqlite_logging.py`]: bsuite/logging/sqlite_logging.py\n[`logging/terminal_logging.py`]: bsuite/logging/terminal_logging.py\n[`utils/gym_wrapper.py`]: bsuite/utils/gym_wrapper.py\n\n[paper]: https://openreview.net/forum?id=rygf-kSYwH\n","funding_links":[],"categories":["Papers","Python"],"sub_categories":["ICLR 2024"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgoogle-deepmind%2Fbsuite","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgoogle-deepmind%2Fbsuite","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgoogle-deepmind%2Fbsuite/lists"}