{"id":13492997,"url":"https://github.com/facebookresearch/torchbeast","last_synced_at":"2025-03-28T11:31:24.354Z","repository":{"id":49418434,"uuid":"205436943","full_name":"facebookresearch/torchbeast","owner":"facebookresearch","description":"A PyTorch Platform for Distributed RL","archived":true,"fork":false,"pushed_at":"2021-09-15T11:57:50.000Z","size":5838,"stargazers_count":746,"open_issues_count":17,"forks_count":115,"subscribers_count":16,"default_branch":"main","last_synced_at":"2025-02-23T00:14:20.320Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/facebookresearch.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-08-30T18:30:09.000Z","updated_at":"2025-02-22T17:07:00.000Z","dependencies_parsed_at":"2022-09-06T07:30:39.393Z","dependency_job_id":null,"html_url":"https://github.com/facebookresearch/torchbeast","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/facebookresearch%2Ftorchbeast","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/facebookresearch%2Ftorchbeast/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/facebookresearch%2Ftorchbeast/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/facebookresearch%2Ftorchbeast/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/facebookresearch","download_url":"https://codeload.github.com/facebookresearch/torchb
east/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246020903,"owners_count":20710839,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-07-31T19:01:11.180Z","updated_at":"2025-03-28T11:31:23.808Z","avatar_url":"https://github.com/facebookresearch.png","language":"Python","funding_links":[],"categories":["Uncategorized","Pytorch \u0026 related libraries｜Pytorch \u0026 相关库","Pytorch \u0026 related libraries","Python"],"sub_categories":["Uncategorized","Other libraries｜其他库:","Other libraries:"],"readme":"\n# TorchBeast\nA PyTorch implementation of [IMPALA: Scalable Distributed\nDeep-RL with Importance Weighted Actor-Learner Architectures\nby Espeholt, Soyer, Munos et al.](https://arxiv.org/abs/1802.01561)\n\nTorchBeast comes in two variants:\n[MonoBeast](#getting-started-monobeast) and\n[PolyBeast](#faster-version-polybeast). While\nPolyBeast is more powerful (e.g. allowing training across machines),\nit's somewhat harder to install. 
MonoBeast requires only Python and\nPyTorch (we suggest using PyTorch version 1.2 or newer).\n\nFor further details, see our [paper](https://arxiv.org/abs/1910.03552).\n\n\n## BibTeX\n\n```\n@article{torchbeast2019,\n  title={{TorchBeast: A PyTorch Platform for Distributed RL}},\n  author={Heinrich K\\\"{u}ttler and Nantas Nardelli and Thibaut Lavril and Marco Selvatici and Viswanath Sivakumar and Tim Rockt\\\"{a}schel and Edward Grefenstette},\n  year={2019},\n  journal={arXiv preprint arXiv:1910.03552},\n  url={https://github.com/facebookresearch/torchbeast},\n}\n```\n\n## Getting started: MonoBeast\n\nMonoBeast is a pure Python + PyTorch implementation of IMPALA.\n\nTo set it up, create a new conda environment and install MonoBeast's\nrequirements:\n\n```bash\n$ conda create -n torchbeast\n$ conda activate torchbeast\n$ conda install pytorch -c pytorch\n$ pip install -r requirements.txt\n```\n\nThen run MonoBeast, e.g. on the [Pong Atari\nenvironment](https://gym.openai.com/envs/Pong-v0/):\n\n```shell\n$ python -m torchbeast.monobeast --env PongNoFrameskip-v4\n```\n\nBy default, MonoBeast uses only a few actors (each with its own instance\nof the environment). 
Let's change the default settings (try this on a\nbeefy machine!):\n\n```shell\n$ python -m torchbeast.monobeast \\\n     --env PongNoFrameskip-v4 \\\n     --num_actors 45 \\\n     --total_steps 30000000 \\\n     --learning_rate 0.0004 \\\n     --epsilon 0.01 \\\n     --entropy_cost 0.01 \\\n     --batch_size 4 \\\n     --unroll_length 80 \\\n     --num_buffers 60 \\\n     --num_threads 4 \\\n     --xpid example\n```\n\nResults are logged to `~/logs/torchbeast/latest` and a checkpoint file is\nwritten to `~/logs/torchbeast/latest/model.tar`.\n\nOnce training has finished, we can test performance on a few episodes:\n\n```shell\n$ python -m torchbeast.monobeast \\\n     --env PongNoFrameskip-v4 \\\n     --mode test \\\n     --xpid example\n```\n\nMonoBeast is a simple, single-machine version of IMPALA.\nEach actor runs in a separate process with its own dedicated instance of\nthe environment and runs the PyTorch model on the CPU to generate\nactions. The resulting rollout trajectories\n(environment-agent interactions) are sent to the learner. 
In the main\nprocess, the learner consumes these rollouts and uses them to update\nthe model's weights.\n\n\n## Faster version: PolyBeast\n\nPolyBeast provides a faster and more scalable implementation of\nIMPALA.\n\nThe easiest way to build and install all of PolyBeast's dependencies\nand run it is to use Docker:\n\n```shell\n$ docker build -t torchbeast .\n$ docker run --name torchbeast torchbeast\n```\n\nTo run PolyBeast directly on Linux or macOS, follow the guide below.\n\n\n### Installing PolyBeast\n\n#### Linux\n\nCreate a new Conda environment, and install PolyBeast's requirements:\n\n```shell\n$ conda create -n torchbeast python=3.7\n$ conda activate torchbeast\n$ pip install -r requirements.txt\n```\n\nInstall PyTorch either [from\nsource](https://github.com/pytorch/pytorch#from-source) or as per its\n[website](https://pytorch.org/get-started/locally/) (select Conda).\n\nPolyBeast also requires gRPC and other third-party software, which can\nbe installed by running:\n\n```shell\n$ git submodule update --init --recursive\n```\n\nFinally, let's compile the C++ parts of PolyBeast:\n\n```\n$ pip install nest/\n$ python setup.py install\n```\n\n#### macOS\n\nCreate a new Conda environment, and install PolyBeast's requirements:\n\n```shell\n$ conda create -n torchbeast\n$ conda activate torchbeast\n$ pip install -r requirements.txt\n```\n\nPyTorch can be installed as per its\n[website](https://pytorch.org/get-started/locally/) (select Conda).\n\nPolyBeast also requires gRPC and other third-party software, which can\nbe installed by running:\n\n```shell\n$ git submodule update --init --recursive\n```\n\nFinally, let's compile the C++ parts of PolyBeast:\n\n```\n$ pip install nest/\n$ python setup.py install\n```\n\n### Running PolyBeast\n\nTo start both the environment servers and the learner process, run:\n\n```shell\n$ python -m torchbeast.polybeast\n```\n\nThe environment servers and the learner process can also be started separately:\n\n```shell\n$ python -m 
torchbeast.polybeast_env --num_servers 10\n```\n\nStart another terminal and run:\n\n```shell\n$ python3 -m torchbeast.polybeast_learner\n```\n\n\n## (Very rough) overview of the system\n\n```\n|-----------------|     |-----------------|                  |-----------------|\n|     ACTOR 1     |     |     ACTOR 2     |                  |     ACTOR n     |\n|-------|         |     |-------|         |                  |-------|         |\n|       |  .......|     |       |  .......|     .   .   .    |       |  .......|\n|  Env  |\u003c-.Model.|     |  Env  |\u003c-.Model.|                  |  Env  |\u003c-.Model.|\n|       |-\u003e.......|     |       |-\u003e.......|                  |       |-\u003e.......|\n|-----------------|     |-----------------|                  |-----------------|\n   ^     I                 ^     I                              ^     I\n   |     I                 |     I                              |     I Actors\n   |     I rollout         |     I rollout               weights|     I send\n   |     I                 |     I                     /--------/     I rollouts\n   |     I          weights|     I                     |              I (frames,\n   |     I                 |     I                     |              I  actions\n   |     I                 |     v                     |              I  etc)\n   |     L=======\u003e|--------------------------------------|\u003c===========J\n   |              |.........      LEARNER                |\n   \\--------------|..Model.. Consumes rollouts, updates  |\n     Learner      |.........       
model weights         |\n      sends       |--------------------------------------|\n     weights\n```\n\nThe system has two main components, actors and a learner.\n\nActors generate rollouts (tensors from a number of steps of\nenvironment-agent interactions, including environment frames, agent\nactions and policy logits, and other data).\n\nThe learner consumes that experience, computes a loss and updates the\nweights. The new weights are then propagated to the actors.\n\n\n## Learning curves on Atari\n\nWe ran TorchBeast on Atari, using the same hyperparameters and neural\nnetwork as in the [IMPALA\npaper](https://arxiv.org/abs/1802.01561). For comparison, we also ran\nthe [open source TensorFlow implementation of\nIMPALA](https://github.com/deepmind/scalable_agent), using the [same\nenvironment\npreprocessing](https://github.com/heiner/scalable_agent/releases/tag/gym). The\nresults are equivalent; see our paper for details.\n\n![deep_network](./plot.png)\n\n\n## Repository contents\n\n`libtorchbeast`: C++ library that allows efficient learner-actor\ncommunication via queueing and batching mechanisms. Some functions are\nexported to Python using pybind11. For PolyBeast only.\n\n`nest`: C++ library for manipulating complex\nnested structures. Some functions are exported to Python using\npybind11.\n\n`tests`: Collection of Python tests.\n\n`third_party`: Collection of third-party dependencies as Git\nsubmodules. Includes [gRPC](https://grpc.io/).\n\n`torchbeast`: Contains `monobeast.py`, `polybeast.py`,\n`polybeast_learner.py` and `polybeast_env.py`.\n\n\n## Hyperparameters\n\nBoth MonoBeast and PolyBeast have flags and hyperparameters. To\ndescribe a few of them:\n\n* `num_actors`: The number of actors (and environment instances). The\n  optimal number of actors depends on the capabilities of the machine\n  (e.g. you would not have 100 actors on your laptop). 
In default\n  PolyBeast this should match the number of servers started.\n* `batch_size`: Determines the size of the learner inputs.\n* `unroll_length`: Length of a rollout (i.e., number of steps that an\n  actor has to perform before sending its experience to the\n  learner). Note that every batch will have dimensions\n  `[unroll_length, batch_size, ...]`.\n\n\n## Contributing\n\nWe would love to have you contribute to TorchBeast or use it for your\nresearch. See the [CONTRIBUTING.md](CONTRIBUTING.md) file for how to help\nout.\n\n## License\n\nTorchBeast is released under the Apache 2.0 license.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffacebookresearch%2Ftorchbeast","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ffacebookresearch%2Ftorchbeast","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ffacebookresearch%2Ftorchbeast/lists"}