{"id":21625624,"url":"https://github.com/eljandoubi/ddpg-for-continuous-control","last_synced_at":"2025-12-31T00:14:06.988Z","repository":{"id":60123739,"uuid":"541141776","full_name":"eljandoubi/DDPG-for-continuous-control","owner":"eljandoubi","description":"An implementation of DDPG agent to solve a Unity environment like Reacher and Crawler.","archived":false,"fork":false,"pushed_at":"2022-09-29T03:58:22.000Z","size":2802,"stargazers_count":0,"open_issues_count":0,"forks_count":1,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-01-24T22:35:34.807Z","etag":null,"topics":["crawler-environment","ddpg-algorithm","multi-agent-reinforcement-learning","pytorch","reacher-environment","reinforcement-learning","unity-environment"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/eljandoubi.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2022-09-25T11:01:08.000Z","updated_at":"2022-09-29T01:36:10.000Z","dependencies_parsed_at":"2023-01-19T00:31:04.613Z","dependency_job_id":null,"html_url":"https://github.com/eljandoubi/DDPG-for-continuous-control","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/eljandoubi%2FDDPG-for-continuous-control","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/eljandoubi%2FDDPG-for-continuous-control/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/eljandoubi%2FDDPG-for-continuous-control/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitH
ub/repositories/eljandoubi%2FDDPG-for-continuous-control/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/eljandoubi","download_url":"https://codeload.github.com/eljandoubi/DDPG-for-continuous-control/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":244297908,"owners_count":20430347,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["crawler-environment","ddpg-algorithm","multi-agent-reinforcement-learning","pytorch","reacher-environment","reinforcement-learning","unity-environment"],"created_at":"2024-11-25T01:09:51.551Z","updated_at":"2025-12-31T00:14:06.950Z","avatar_url":"https://github.com/eljandoubi.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"[//]: # (Image References)\n\n[image1]: https://user-images.githubusercontent.com/10624937/43851024-320ba930-9aff-11e8-8493-ee547c6af349.gif \"Trained Agent\"\n[image2]: https://user-images.githubusercontent.com/10624937/43851646-d899bf20-9b00-11e8-858c-29b5c2c94ccc.png \"Crawler\"\n[image3]: https://user-images.githubusercontent.com/10624937/42386929-76f671f0-8106-11e8-9376-f17da2ae852e.png\n\n\n# DDPG for continuous control\nThis repository contains material from the [second Udacity DRL project](https://github.com/udacity/deep-reinforcement-learning/tree/master/p2_continuous-control) and the coding exercise [DDPG-pendulum](https://github.com/udacity/deep-reinforcement-learning/tree/master/ddpg-pendulum).\n\n\n## Introduction\n\nIn this project, I trained a DDPG agent to solve two types of 
environment.  \n\n![Trained Agent][image1]\n\nFirst, the **Reacher** environment, in which a double-jointed arm can move to target locations. A reward of +0.1 is provided for each step that the agent's hand is in the goal location. Thus, the goal of the agent is to maintain its position at the target location for as many time steps as possible.\n\nThe observation space consists of 33 variables corresponding to position, rotation, velocity, and angular velocities of the arm. Each action is a vector with four numbers, corresponding to torque applicable to two joints. Every entry in the action vector should be a number between -1 and 1.\n\n---\nSecond, the **Crawler** environment.\n\n![Crawler][image2]\n\nIn this continuous control environment, the goal is to teach a creature with four legs to walk forward without falling. \n\n___\nAn environment is considered solved when the average score, taken over 100 consecutive episodes and over all agents, reaches +30. \n\n## Dependencies\n\nTo set up your Python environment to run the code in this repository, follow the instructions below.\n\n1. Create (and activate) a new environment with Python 3.9.\n\n\t- __Linux__ or __Mac__: \n\t```bash\n    conda create --name drlnd python=3.9\n    source activate drlnd\n\t```\n\t- __Windows__: \n\t```bash\n\tconda create --name drlnd python=3.9\n\tactivate drlnd\n\t```\n2. Follow the instructions on the [PyTorch](https://pytorch.org/) web page to install PyTorch and its dependencies (PIL, numpy, ...). For example, on Windows with CUDA 11.6:\n\n    ```bash\n    conda install pytorch torchvision torchaudio cudatoolkit=11.6 -c pytorch -c conda-forge\n    ```\n\t\n\n3. Follow the instructions in [this repository](https://github.com/openai/gym) to perform a minimal install of OpenAI Gym.  \n\t- Install the **box2d** environment group by following the instructions [here](https://github.com/openai/gym#box2d).\n\n    ```bash\n    pip install gym[box2d]\n    ```\n    \n4. 
Follow the instructions in the [second Udacity DRL project](https://github.com/udacity/deep-reinforcement-learning/tree/master/p2_continuous-control) to get the environment.\n\t\n5. Clone the repository, and navigate to the `python/` folder.  Then, install several dependencies.\n```bash\ngit clone https://github.com/eljandoubi/DDPG-for-continuous-control.git\ncd DDPG-for-continuous-control/python\npip install .\n```\n\n6. Create an [IPython kernel](http://ipython.readthedocs.io/en/stable/install/kernel_install.html) for the `drlnd` environment.  \n```bash\npython -m ipykernel install --user --name drlnd --display-name \"drlnd\"\n```\n\n7. Before running code in a notebook, change the kernel to match the `drlnd` environment by using the drop-down `Kernel` menu. \n\n![Kernel][image3]\n\n## Training and inference\nYou can train an agent and/or run inference in an environment by following the instructions in its notebook.\n\n## Implementation and Results\n\nThe implementation and results are discussed in the report.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Feljandoubi%2Fddpg-for-continuous-control","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Feljandoubi%2Fddpg-for-continuous-control","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Feljandoubi%2Fddpg-for-continuous-control/lists"}