{"id":20389533,"url":"https://github.com/giabb/reinforcement-learning","last_synced_at":"2026-05-10T10:41:36.849Z","repository":{"id":236502309,"uuid":"342929560","full_name":"giabb/reinforcement-learning","owner":"giabb","description":"Reinforcement Learning exam project - \"Sapienza\" University of Rome, Fall Semester 2019 ","archived":false,"fork":false,"pushed_at":"2021-03-02T12:53:41.000Z","size":19924,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-01-15T10:04:14.351Z","etag":null,"topics":["ant","gym","mujoco","reinforcement-learning","rome","sac","sapienza","university"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/giabb.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-02-27T18:33:13.000Z","updated_at":"2023-10-01T06:53:27.000Z","dependencies_parsed_at":null,"dependency_job_id":"c5106fe2-83a5-4305-8072-490cb8f520a0","html_url":"https://github.com/giabb/reinforcement-learning","commit_stats":null,"previous_names":["giabb/reinforcement-learning"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/giabb%2Freinforcement-learning","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/giabb%2Freinforcement-learning/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/giabb%2Freinforcement-learning/releases","manifests_url":"https://repos.ecosyste.
ms/api/v1/hosts/GitHub/repositories/giabb%2Freinforcement-learning/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/giabb","download_url":"https://codeload.github.com/giabb/reinforcement-learning/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":241940539,"owners_count":20045878,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ant","gym","mujoco","reinforcement-learning","rome","sac","sapienza","university"],"created_at":"2024-11-15T03:18:27.765Z","updated_at":"2026-05-10T10:41:36.776Z","avatar_url":"https://github.com/giabb.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Reinforcement Learning using SAC algorithm and Ant-v2 gym environment\n\nThis project was developed during the 2019 Reinforcement Learning Course held by [Prof. Capobianco](http://robertocapobianco.com/) at [Sapienza University of Rome](https://www.uniroma1.it/).\n\nThe algorithm used in this project is the [Soft Actor-Critic algorithm](https://arxiv.org/abs/1812.05905). More details on the implementation are given in the following sections.\n\n## Summary\n\n  - [Getting Started](#getting-started)\n  - [Some Specifications](#some-specifications)\n  - [Authors](#authors)\n  - [License](#license)\n  - [Acknowledgments](#acknowledgments)\n\n## Getting Started\n\nThe project contains only a Jupyter Notebook file. 
Install the prerequisites listed below, then open it in Jupyter.\n\n### Prerequisites\n\n- Python 3.5+\n- Jupyter ``` pip install jupyterlab ```\n- [MuJoCo](http://www.mujoco.org) \n\t- I suggest following [this article](https://medium.com/@ganeshprasanna/setting-up-mujoco-7a5ee62cf6dc) to install it. It worked on Ubuntu 18.04 with Python 3.7.5 and mujoco200.\n\t- You will need a MuJoCo license.\n- Gym ``` pip install gym ```\n- Stable Baselines [installation](https://stable-baselines.readthedocs.io/en/master/guide/install.html)\n- NumPy ``` pip install numpy ```\n- SciPy ``` pip install scipy ```\n- TQDM ``` pip install tqdm ```\n\n## Some Specifications\n\nThe tests are run in the MuJoCo environment [Ant-v2](https://gym.openai.com/envs/Ant-v2/). The goal of this environment is to make the ant walk as fast as possible for as long as possible. The ant is a hierarchical structure with the \"torso\" as the main object and the 4 legs as its children:\n\n\n\u003cimg src=\"https://raw.githubusercontent.com/giabb/reinforcement-learning/main/md_media/ant.jpg\" alt=\"img_ant\" width=\"250\" height=\"250\"\u003e\n\n\nThe observation space is a 111-dimensional space:\n\n|\t  Total dimension \t| 111 |\n|:-----------------------------:|:---:|\n|          Torso Height         |  1  |\n|       Torso Orientation       |  4  |\n|          Joint Angles         |  8  |\n| Velocities (angular + linear) |  6  |\n|        Joint Velocities       |  8  |\n|        External Forces        |  84 |\n\nThe reward function is [defined here](https://github.com/openai/gym/blob/master/gym/envs/mujoco/ant.py#L10).\n\nYou can find a video of the final execution [here](https://github.com/giabb/reinforcement-learning/blob/main/md_media/The%20Walking%20Ant.mp4).\n\n\n## Authors\n\n  - **Giovanbattista Abbate** - [giabb](https://github.com/giabb)\n\n## License\n\nThis project is licensed under the MIT License - see the [LICENSE.md](LICENSE.md) file for details.\n\n## Acknowledgments\n\n- **Billie Thompson** - *Provided README 
Template* - [PurpleBooth](https://github.com/PurpleBooth)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgiabb%2Freinforcement-learning","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgiabb%2Freinforcement-learning","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgiabb%2Freinforcement-learning/lists"}