{"id":27633353,"url":"https://github.com/negarhonarvar/deepreinforcementlearning","last_synced_at":"2025-07-23T14:35:56.079Z","repository":{"id":259484482,"uuid":"798429152","full_name":"negarhonarvar/DeepReinforcementLearning","owner":"negarhonarvar","description":"A Complete Collection of Deep RL Famous Algorithms implemented in Gymnasium most Popular environments","archived":false,"fork":false,"pushed_at":"2025-04-13T16:12:53.000Z","size":6730,"stargazers_count":9,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-04-23T18:15:10.652Z","etag":null,"topics":["boltzmann-exploration","cartpole-v1","d3qn","dqn","drl-algorithms","gymnasium-environment","lunar-lander","ppo-algorithm","sarsa","softmax-exploration","swimmer"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/negarhonarvar.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-05-09T18:57:17.000Z","updated_at":"2025-04-13T16:12:57.000Z","dependencies_parsed_at":"2024-10-25T22:52:54.148Z","dependency_job_id":"8a61374f-7fca-411b-970e-a1ffad5829c1","html_url":"https://github.com/negarhonarvar/DeepReinforcementLearning","commit_stats":null,"previous_names":["negarhonarvar/deepreinforcementlearning"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/negarhonarvar%2FDeepReinforcementLearning","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/negarhonarvar%2FDeepRei
nforcementLearning/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/negarhonarvar%2FDeepReinforcementLearning/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/negarhonarvar%2FDeepReinforcementLearning/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/negarhonarvar","download_url":"https://codeload.github.com/negarhonarvar/DeepReinforcementLearning/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":250487531,"owners_count":21438612,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["boltzmann-exploration","cartpole-v1","d3qn","dqn","drl-algorithms","gymnasium-environment","lunar-lander","ppo-algorithm","sarsa","softmax-exploration","swimmer"],"created_at":"2025-04-23T18:15:17.376Z","updated_at":"2025-04-23T18:15:18.117Z","avatar_url":"https://github.com/negarhonarvar.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Deep Reinforcement Learning Algorithms\nDeep Reinforcement Learning course assignments by Dr. Armin Salimi Badr.\nThe code in this repository uses environments from the Gymnasium library.\n\n## Prerequisites 📋\nTo run the code in this repository, you need to install:\n\n    Gymnasium v1.0.0\n   \n\n# CartPole V1\n### Objective\nThe goal of this environment is to balance a pole by applying forces in the left and right directions on the cart. 
It has a discrete action space:\n\n    0: Push cart to the left\n    1: Push cart to the right\n### Observation Space\nAfter each action, either left or right, the agent observes a 4-dimensional state consisting of:\n\n    Cart Position\n    Cart Velocity\n    Pole Angle\n    Pole Angular Velocity\n\nThe agent receives a reward of +1 at each step while the pole is kept upright. The maximum reward an agent can earn in a single episode is 500.\n\n### Termination\nThe episode ends under the following conditions:\n\n    Termination: Pole Angle leaves the ±12° range\n    Termination: Cart Position leaves the ±2.4 range (center of the cart reaches the edge of the display)\n    Truncation: Episode length exceeds 500 steps\n\n## Agents\n\n- \u003cth\u003e\u003ca href=\"https://github.com/negarhonarvar/DeepReinforcementLearning/tree/main/DQN%20in%20CartPole\"\u003eDQN\u003c/a\u003e\u003c/th\u003e\n- \u003cth\u003e\u003ca href=\"https://github.com/negarhonarvar/DeepReinforcementLearning/tree/main/SARSA%20in%20CartPole\"\u003eSARSA\u003c/a\u003e\u003c/th\u003e\n\n\n# Lunar Lander\nThis environment is part of the Box2D environments and is a classic rocket trajectory optimization problem.\n\n### Objective\nAccording to Pontryagin’s maximum principle, it is optimal to fire the engine at full throttle or turn it off. 
This is the reason why this environment has discrete actions: engine on or off.\n\n### Action Space \nThere are four discrete actions available:\n\n    0: do nothing\n    1: fire left orientation engine\n    2: fire main engine\n    3: fire right orientation engine\n\n### Observation Space\nThe state is an 8-dimensional vector: the coordinates of the lander in x \u0026 y, its linear velocities in x \u0026 y, its angle, its angular velocity, and two booleans that represent whether each leg is in contact with the ground or not.\n\n## Agents\n\n- \u003cth\u003e\u003ca href=\"https://github.com/negarhonarvar/DeepReinforcementLearning/tree/main/DQN%20in%20Lunar%20Lander\"\u003eDQN\u003c/a\u003e\u003c/th\u003e\n- \u003cth\u003e\u003ca href=\"https://github.com/negarhonarvar/DeepReinforcementLearning/tree/main/D3QN\"\u003eD3QN\u003c/a\u003e\u003c/th\u003e\n- \u003cth\u003e\u003ca href=\"https://github.com/negarhonarvar/DeepReinforcementLearning/tree/main/EnhancedD3QN\"\u003eEnhanced D3QN\u003c/a\u003e\u003c/th\u003e\n\n# Swimmer Environment\nThe Swimmer environment in MuJoCo is a reinforcement learning environment where the goal is to control a multi-jointed swimmer to move forward quickly in a two-dimensional fluid environment. The swimmer is essentially a small robotic entity with a simple body structure, consisting of a head and multiple tail-like segments. As described on the reference site, the swimmer consists of three or more segments ('links') and one fewer articulation joints ('rotors'); each rotor joint connects exactly two links to form a linear chain.\n\n### Objective:\nThe primary goal of the agent in the Swimmer environment is to move to the right along the horizontal axis of a two-dimensional plane by activating its rotors. When a trajectory ends, the agent always starts its new trajectory from a fixed starting point. 
The reward function typically rewards the agent based on the horizontal distance covered by the swimmer in each time step, and it includes penalties that discourage excessive use of activation forces. Therefore, the agent must learn how to propel itself forward effectively and efficiently. For further information on this environment, see the Gymnasium documentation.\n\n### Problem parameters:\n\n    n: number of body parts\n    mi: mass of part i (i ∈ {1…n})\n    li: length of part i (i ∈ {1…n})\n    k: viscous-friction coefficient\n\n### Action Space\n\n    Num = 0 : Torque applied on the first rotor (-1,1)\n    Num = 1 : Torque applied on the second rotor (-1,1)\n    \n### Observation Space\n\n    qpos (3 elements by default): Position values of the robot’s body parts.\n    qvel (5 elements): The velocities of these individual body parts (their derivatives).\n\n## Agents\n\n- \u003cth\u003e\u003ca href=\"https://github.com/negarhonarvar/DeepReinforcementLearning/tree/main/PPO%20with%20adaptive%20kl\"\u003ePPO with Adaptive KL\u003c/a\u003e\u003c/th\u003e\n- \u003cth\u003e\u003ca href=\"https://github.com/negarhonarvar/DeepReinforcementLearning/tree/main/PPo%20with%20clipped%20objective\"\u003ePPO with Clipped Objective\u003c/a\u003e\u003c/th\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnegarhonarvar%2Fdeepreinforcementlearning","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnegarhonarvar%2Fdeepreinforcementlearning","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnegarhonarvar%2Fdeepreinforcementlearning/lists"}