{"id":13472204,"url":"https://github.com/aidudezzz/deepbots","last_synced_at":"2025-03-26T15:31:36.870Z","repository":{"id":38628341,"uuid":"222138225","full_name":"aidudezzz/deepbots","owner":"aidudezzz","description":"A wrapper framework for Reinforcement Learning in the Webots robot simulator using Python 3.","archived":false,"fork":false,"pushed_at":"2023-09-30T14:35:31.000Z","size":1304,"stargazers_count":255,"open_issues_count":14,"forks_count":54,"subscribers_count":8,"default_branch":"dev","last_synced_at":"2025-03-01T00:58:06.630Z","etag":null,"topics":["openai-gym-environment","python","reinforcement-learning","robotics","webots"],"latest_commit_sha":null,"homepage":"https://deepbots.readthedocs.io/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/aidudezzz.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2019-11-16T17:55:31.000Z","updated_at":"2025-02-22T02:02:01.000Z","dependencies_parsed_at":"2024-01-16T07:22:16.616Z","dependency_job_id":"d9a71b41-adee-4254-9201-b21b7d03be72","html_url":"https://github.com/aidudezzz/deepbots","commit_stats":null,"previous_names":[],"tags_count":25,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aidudezzz%2Fdeepbots","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aidudezzz%2Fdeepbots/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aidudezzz%2Fdeepbots/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aidudezzz%2Fdeepbots/manifests","owner_url":"h
ttps://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/aidudezzz","download_url":"https://codeload.github.com/aidudezzz/deepbots/tar.gz/refs/heads/dev","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":245681352,"owners_count":20655176,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["openai-gym-environment","python","reinforcement-learning","robotics","webots"],"created_at":"2024-07-31T16:00:52.884Z","updated_at":"2025-03-26T15:31:36.531Z","avatar_url":"https://github.com/aidudezzz.png","language":"Python","funding_links":[],"categories":["Python","Integrations"],"sub_categories":[],"readme":"\u003cp align=\"left\"\u003e\n    \u003cimg src=\"https://raw.githubusercontent.com/aidudezzz/deepbots-swag/main/logo/deepbots_full.png\"\u003e\n\u003c/p\u003e\n\n[![Version](https://img.shields.io/pypi/v/deepbots?color=green)](https://pypi.org/project/deepbots/)\n[![Dev Version](https://img.shields.io/github/v/tag/aidudezzz/deepbots?include_prereleases\u0026label=test-pypi\u0026color=green)](https://test.pypi.org/project/deepbots/)\n[![Downloads](https://static.pepy.tech/personalized-badge/deepbots?period=total\u0026units=international_system\u0026left_color=grey\u0026right_color=green\u0026left_text=Downloads)](https://pepy.tech/project/deepbots)\n[![License](https://img.shields.io/github/license/aidudezzz/deepbots?color=green)](https://github.com/aidudezzz/deepbots/blob/dev/LICENSE)\n[![All Contributors](https://img.shields.io/badge/all_contributors-6-orange.svg?style=flat-square)](#contributors-)\n\nDeepbots is a simple framework which is used as 
\"middleware\" between the free\nand open-source [Cyberbotics' Webots](https://cyberbotics.com/) robot simulator\nand Reinforcement Learning algorithms. When it comes to Reinforcement Learning\nthe [OpenAI gym](https://gym.openai.com/) environment has been established as\nthe most used interface between the actual application and the RL algorithm.\nDeepbots is a framework which follows the OpenAI gym environment interface\nlogic in order to be used by Webots applications.\n\n## Installation\n\n### Prerequisites\n\n1. [Install Webots](https://cyberbotics.com/doc/guide/installing-webots)\n   - [Windows](https://cyberbotics.com/doc/guide/installation-procedure#installation-on-windows)\n   - [Linux](https://cyberbotics.com/doc/guide/installation-procedure#installation-on-linux)\n   - [macOS](https://cyberbotics.com/doc/guide/installation-procedure#installation-on-macos)\n2. [Install Python version 3.X](https://www.python.org/downloads/) (please\n   refer to\n   [Using Python](https://cyberbotics.com/doc/guide/using-python#introduction)\n   to select the proper Python version for your system)\n3. Follow the [Using Python](https://cyberbotics.com/doc/guide/using-python)\n   guide provided by Webots\n4. Webots provides a basic code editor, but if you want to use\n   [PyCharm](https://www.jetbrains.com/pycharm/) as your IDE refer to\n   [using PyCharm IDE](https://cyberbotics.com/doc/guide/using-your-ide#pycharm)\n   provided by Webots\n\nYou will probably also need a backend library to implement the neural networks,\nsuch as [PyTorch](https://pytorch.org/) or\n[TensorFlow](https://www.tensorflow.org/). 
Deepbots interfaces with RL agents\nusing the OpenAI gym logic, so it can work with any backend library you choose\nto implement the agent with and any agent that already works with gym.\n\n### Install deepbots\n\nDeepbots can be installed through the package installer\n[pip](https://pip.pypa.io/en/stable/) running the following command:\n\n`pip install deepbots`\n\n## Official resources\n\n- On\n  [the deepbots-tutorials repository](https://github.com/aidudezzz/deepbots-tutorials)\n  you can find the official tutorials for deepbots\n- On [the deepworlds repository](https://github.com/aidudezzz/deepworlds) you\n  can find examples of deepbots being used. \u003cbr\u003eFeel free to contribute your\n  own!\n\n## Citation\n\nConference paper (AIAI2020):\nhttps://link.springer.com/chapter/10.1007/978-3-030-49186-4_6\n\n```bibtex\n@InProceedings{10.1007/978-3-030-49186-4_6,\n    author=\"Kirtas, M.\n    and Tsampazis, K.\n    and Passalis, N.\n    and Tefas, A.\",\n    title=\"Deepbots: A Webots-Based Deep Reinforcement Learning Framework for Robotics\",\n    booktitle=\"Artificial Intelligence Applications and Innovations\",\n    year=\"2020\",\n    publisher=\"Springer International Publishing\",\n    address=\"Cham\",\n    pages=\"64--75\",\n    isbn=\"978-3-030-49186-4\"\n}\n\n```\n\n## How it works\n\nFirst of all let's set up a simple glossary:\n\n- `World`: Webots uses a tree structure to represent the different entities in\n  the scene. The World is the root entity which contains all the\n  entities/nodes. For example, the world contains the Supervisor and Robot\n  entities as well as other objects which might be included in the scene.\n\n- `Supervisor`: The Supervisor is an entity which has access to all other\n  entities of the world, while having no physical presence in it. For example,\n  the Supervisor knows the exact position of all the entities of the world and\n  can manipulate them. 
Additionally, the Supervisor has the Supervisor\n  Controller as one of its child nodes.\n\n- `Supervisor Controller`: The Supervisor Controller is a Python script which\n  is responsible for the Supervisor. For example, in the Supervisor Controller\n  script the distance between two entities in the world can be calculated.\n\n- `Robot`: The Robot is an entity that represents a robot in the world. It\n  might have sensors and other active components, like motors, etc. as child\n  entities. Also, one of its children is the Robot Controller. For example,\n  [epuck](https://cyberbotics.com/doc/guide/epuck) and\n  [TIAGo](https://cyberbotics.com/doc/guide/tiago-iron) are robots.\n\n- `Robot Controller`: The Robot Controller is a Python script which is\n  responsible for the Robot's movement and sensors. With the Robot Controller\n  it is possible to observe the world and act accordingly.\n- `Environment`: The Environment is the interface as described by the OpenAI\n  gym. The Environment interface has the following methods:\n\n  - `get_observations()`: Return the observations of the robot. For example,\n    metrics from sensors, a camera image, etc.\n\n  - `step(action)`: Each timestep, the agent chooses an action, and the\n    environment returns the observation, the reward and the state of the\n    problem (done or not).\n\n  - `get_reward(action)`: The reward the agent receives as a result of their\n    action.\n  - `is_done()`: Whether it’s time to reset the environment. Most (but not all)\n    tasks are divided up into well-defined episodes, and `done` being `True`\n    indicates the episode has terminated. For example, if a robot has the task\n    to reach a goal, then the done condition might happen when the robot\n    \"touches\" the goal.\n  - `reset()`: Used to reset the world to the initial state.\n\nIn order to set up a task in Deepbots it is necessary to understand the\nintention of the OpenAI gym environment. 
According to the OpenAI gym\ndocumentation, the framework follows the classic “agent-environment loop”.\n\"Each timestep, the agent chooses an `action`, and the environment returns an\n`observation` and a `reward`. The process gets started by calling `reset()`,\nwhich returns an initial `observation`.\"\n\n\u003cp align=\"center\"\u003e\n    \u003cimg src=\"https://raw.githubusercontent.com/aidudezzz/deepbots/dev/doc/img/agent_env_loop.svg\"\u003e\n\u003c/p\u003e\n\nDeepbots follows this exact agent-environment loop with the only difference\nbeing that the agent, which is responsible for choosing an action, runs on the\nSupervisor and the observations are acquired by the robot. The goal of the\ndeepbots framework is to hide this communication from the user, especially from\nthose who are familiar with the OpenAI gym environment. More specifically,\n`SupervisorEnv` is the interface which is used by the Reinforcement Learning\nalgorithms and follows the OpenAI gym environment logic. The deepbots framework\nprovides different levels of abstraction according to the user's needs.\nMoreover, a goal of the framework is to provide different wrappers for a wide\nrange of robots.\n\nDeepbots also provides a default implementation of the `reset()` method,\nleveraging Webots' built-in simulation reset functions, removing the need for\nthe user to implement reset procedures for simpler use-cases. It is always\npossible to override this method and implement any custom reset procedure, as\nneeded.\n\n#### Emitter - receiver scheme\n\nCurrently, the communication between the `Supervisor` and the `Robot` is\nachieved via an `emitter` and a `receiver`. By separating the `Supervisor` from\nthe `Robot`, deepbots can fit a variety of use-cases, e.g. 
multiple `Robots`\ncollecting experience and a `Supervisor` controlling them with a single agent.\nThe way Webots implements `emitter`/`receiver` communication requires messages\nto be packed and unpacked, which introduces an overhead that becomes\nprohibitive in use-cases where the observations are high-dimensional or long,\nsuch as camera images. Deepbots provides another partially abstract class that\ncombines the `Supervisor` and the `Robot` into one controller and circumvents\nthat issue, while being less flexible, which is discussed\n[later](#combined-robot-supervisor-scheme).\n\n\u003cp align=\"center\"\u003e\n    \u003cimg src=\"https://raw.githubusercontent.com/aidudezzz/deepbots/dev/doc/img/deepbots_overview.png\"\u003e\n\u003c/p\u003e\n\nOn one hand, the `emitter` is an entity provided by Webots that\nbroadcasts messages to the `World`. On the other hand, the `receiver` is an\nentity that is used to receive messages from the `World`. Consequently, the\nagent-environment loop is transformed accordingly. Firstly, the `Robot` uses\nits sensors to retrieve the observation from the `World` and in turn uses the\n`emitter` component to broadcast this observation. Secondly, the `Supervisor`\nreceives the observation via the `receiver` component and in turn, the agent\nuses it to choose an action. It should be noted that the observation the agent\nuses might be extended by the `Supervisor`. For example, a model might use\nLiDAR sensors installed on the `Robot`, but also the Euclidean distance between\nthe `Robot` and an object. 
As expected, the `Robot` does not know the\nEuclidean distance; only the `Supervisor` can calculate it, because it has\naccess to all entities in the `World`.\n\nYou can follow the \n[emitter-receiver scheme tutorial](https://github.com/aidudezzz/deepbots-tutorials/blob/master/emitterReceiverSchemeTutorial/README.md)\nto get started and work your way up from there.\n\n\u003cp align=\"center\"\u003e\n    \u003cimg src=\"https://raw.githubusercontent.com/aidudezzz/deepbots/dev/doc/img/workflow_diagram.png\"\u003e\n\u003c/p\u003e\n\n#### Combined Robot-Supervisor scheme\n\nAs mentioned earlier, in use-cases where the observation transmitted between\nthe `Robot` and the `Supervisor` is high-dimensional or long, e.g. high\nresolution images taken from a camera, a significant overhead is introduced.\nThis is circumvented by inheriting and implementing the partially abstract\n`RobotSupervisor` that combines the `Robot Controller` and the\n`Supervisor Controller` into one, forgoing all `emitter`/`receiver`\ncommunication. This new controller runs on the `Robot`, but requires\n`Supervisor` privileges and is limited to one `Robot`, one `Supervisor`.\n\nYou can follow the \n[robot-supervisor scheme tutorial](https://github.com/aidudezzz/deepbots-tutorials/tree/master/robotSupervisorSchemeTutorial)\nto get started and work your way up from there. We recommend this\ntutorial to get started with deepbots.\n\n### Abstraction Levels\n\nThe deepbots framework has been created mostly for educational purposes. The\naim of the framework is to enable people to use Reinforcement Learning in\nWebots. More specifically, we can consider deepbots as a wrapper of Webots\nexposing an OpenAI gym style interface. For this reason there are multiple\nlevels of abstraction. For example, a user can choose if they want to use CSV\n`emitter`/`receiver` or if they want to make an implementation from scratch. 
At\nthe top level of the abstraction hierarchy is the `SupervisorEnv` which is the\nOpenAI gym interface. Below that level there are partially implemented classes\nwith common functionality. These implementations aim to hide the communication\nbetween the `Supervisor` and the `Robot`, as described in the two different\nschemes earlier. Similarly, in the `emitter`/`receiver` scheme the `Robot` also\nhas different abstraction levels. According to their needs, users can choose\neither to process the messages received from the `Supervisor` themselves or use\nthe existing implementations.\n\n### Acknowledgments\n\nThis project has received funding from the European Union's Horizon 2020\nresearch and innovation programme under grant agreement No 871449 (OpenDR).\nThis publication reflects the authors’ views only. The European Commission is\nnot responsible for any use that may be made of the information it contains.\n\n## Contributors ✨\n\nThanks go to these wonderful people\n([emoji key](https://allcontributors.org/docs/en/emoji-key)):\n\n\u003c!-- ALL-CONTRIBUTORS-LIST:START - Do not remove or modify this section --\u003e\n\u003c!-- prettier-ignore-start --\u003e\n\u003c!-- markdownlint-disable --\u003e\n\u003ctable\u003e\n  \u003ctr\u003e\n    \u003ctd align=\"center\"\u003e\u003ca href=\"http://eakirtas.webpages.auth.gr/\"\u003e\u003cimg src=\"https://avatars.githubusercontent.com/u/10010230?v=4?s=100\" width=\"100px;\" alt=\"\"/\u003e\u003cbr /\u003e\u003csub\u003e\u003cb\u003eManos Kirtas\u003c/b\u003e\u003c/sub\u003e\u003c/a\u003e\u003cbr /\u003e\u003ca href=\"https://github.com/aidudezzz/deepbots/commits?author=ManosMagnus\" title=\"Code\"\u003e💻\u003c/a\u003e\u003c/td\u003e\n    \u003ctd align=\"center\"\u003e\u003ca href=\"https://github.com/tsampazk\"\u003e\u003cimg src=\"https://avatars.githubusercontent.com/u/27914645?v=4?s=100\" width=\"100px;\" alt=\"\"/\u003e\u003cbr /\u003e\u003csub\u003e\u003cb\u003eKostas 
Tsampazis\u003c/b\u003e\u003c/sub\u003e\u003c/a\u003e\u003cbr /\u003e\u003ca href=\"https://github.com/aidudezzz/deepbots/commits?author=tsampazk\" title=\"Code\"\u003e💻\u003c/a\u003e\u003c/td\u003e\n    \u003ctd align=\"center\"\u003e\u003ca href=\"https://www.linkedin.com/in/kelvin-yang-b7b508198/\"\u003e\u003cimg src=\"https://avatars.githubusercontent.com/u/49781698?v=4?s=100\" width=\"100px;\" alt=\"\"/\u003e\u003cbr /\u003e\u003csub\u003e\u003cb\u003eJiun Kai Yang\u003c/b\u003e\u003c/sub\u003e\u003c/a\u003e\u003cbr /\u003e\u003ca href=\"https://github.com/aidudezzz/deepbots/commits?author=KelvinYang0320\" title=\"Code\"\u003e💻\u003c/a\u003e\u003c/td\u003e\n    \u003ctd align=\"center\"\u003e\u003ca href=\"https://github.com/MentalGear\"\u003e\u003cimg src=\"https://avatars.githubusercontent.com/u/2837147?v=4?s=100\" width=\"100px;\" alt=\"\"/\u003e\u003cbr /\u003e\u003csub\u003e\u003cb\u003eMentalGear\u003c/b\u003e\u003c/sub\u003e\u003c/a\u003e\u003cbr /\u003e\u003ca href=\"#ideas-MentalGear\" title=\"Ideas, Planning, \u0026 Feedback\"\u003e🤔\u003c/a\u003e\u003c/td\u003e\n    \u003ctd align=\"center\"\u003e\u003ca href=\"https://github.com/DreamtaleCore\"\u003e\u003cimg src=\"https://avatars.githubusercontent.com/u/12713528?v=4?s=100\" width=\"100px;\" alt=\"\"/\u003e\u003cbr /\u003e\u003csub\u003e\u003cb\u003eDreamtale\u003c/b\u003e\u003c/sub\u003e\u003c/a\u003e\u003cbr /\u003e\u003ca href=\"https://github.com/aidudezzz/deepbots/issues?q=author%3ADreamtaleCore\" title=\"Bug reports\"\u003e🐛\u003c/a\u003e\u003c/td\u003e\n    \u003ctd align=\"center\"\u003e\u003ca href=\"https://nickkok.github.io/my-website/\"\u003e\u003cimg src=\"https://avatars.githubusercontent.com/u/8222731?v=4?s=100\" width=\"100px;\" alt=\"\"/\u003e\u003cbr /\u003e\u003csub\u003e\u003cb\u003eNikolaos Kokkinis-Ntrenis\u003c/b\u003e\u003c/sub\u003e\u003c/a\u003e\u003cbr /\u003e\u003ca href=\"https://github.com/aidudezzz/deepbots/commits?author=NickKok\" title=\"Code\"\u003e💻\u003c/a\u003e 
\u003ca href=\"https://github.com/aidudezzz/deepbots/commits?author=NickKok\" title=\"Documentation\"\u003e📖\u003c/a\u003e \u003ca href=\"#ideas-NickKok\" title=\"Ideas, Planning, \u0026 Feedback\"\u003e🤔\u003c/a\u003e\u003c/td\u003e\n  \u003c/tr\u003e\n\u003c/table\u003e\n\n\u003c!-- markdownlint-restore --\u003e\n\u003c!-- prettier-ignore-end --\u003e\n\n\u003c!-- ALL-CONTRIBUTORS-LIST:END --\u003e\n\nThis project follows the\n[all-contributors](https://github.com/all-contributors/all-contributors)\nspecification. Contributions of any kind welcome!\n\n\u003cb\u003e Special thanks to \u003ca href='https://www.papanikolaouev.com/'\u003ePapanikolaou Evangelia\u003c/a\u003e for designing the project's logo! \u003c/b\u003e \n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faidudezzz%2Fdeepbots","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Faidudezzz%2Fdeepbots","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Faidudezzz%2Fdeepbots/lists"}