# Deep Reinforcement Learning in TensorFlow

TensorFlow implementation of Deep Reinforcement Learning papers. This implementation contains:

[1] [Playing Atari with Deep Reinforcement Learning](http://arxiv.org/abs/1312.5602)  
[2] [Human-Level Control through Deep Reinforcement Learning](http://home.uchicago.edu/~arij/journalclub/papers/2015_Mnih_et_al.pdf)  
[3] [Deep Reinforcement Learning with Double Q-learning](http://arxiv.org/abs/1509.06461)  
[4] [Dueling Network Architectures for Deep Reinforcement Learning](http://arxiv.org/abs/1511.06581)  
[5] [Prioritized Experience Replay](http://arxiv.org/pdf/1511.05952v3.pdf) (in progress)  
[6] [Deep Exploration via Bootstrapped DQN](http://arxiv.org/abs/1602.04621) (in progress)  
[7] [Asynchronous Methods for Deep Reinforcement Learning](http://arxiv.org/abs/1602.01783) (in progress)  
[8] [Continuous Deep Q-Learning with Model-based Acceleration](http://arxiv.org/abs/1603.00748) (in progress)  


## Requirements

- Python 2.7
- [gym](https://github.com/openai/gym)
- [tqdm](https://github.com/tqdm/tqdm)
- [OpenCV2](http://opencv.org/) or [SciPy](https://www.scipy.org/)
- [TensorFlow 0.12.0](https://www.tensorflow.org/)


## Usage

First, install the prerequisites:

    $ pip install -U 'gym[all]' tqdm scipy

Don't forget to also install the latest [TensorFlow](https://www.tensorflow.org/). Also note that you need to install the dependencies of [`doom-py`](https://github.com/openai/doom-py), which is required by `gym[all]`.

Train with the DQN model described in [[1]](#deep-reinforcement-learning-in-tensorflow) without a GPU:

    $ python main.py --network_header_type=nips --env_name=Breakout-v0 --use_gpu=False

Train with the DQN model described in [[2]](#deep-reinforcement-learning-in-tensorflow):

    $ python main.py --network_header_type=nature --env_name=Breakout-v0

Train with the Double DQN model described in [[3]](#deep-reinforcement-learning-in-tensorflow):

    $ python main.py --double_q=True --env_name=Breakout-v0

Train with the Dueling network with Double Q-learning described in [[4]](#deep-reinforcement-learning-in-tensorflow):

    $ python main.py --double_q=True --network_output_type=dueling --env_name=Breakout-v0

Train with the MLP model described in [[4]](#deep-reinforcement-learning-in-tensorflow) on the corridor environment (useful for debugging):

    $ python main.py --network_header_type=mlp --network_output_type=normal --observation_dims='[16]' --env_name=CorridorSmall-v5 --t_learn_start=0.1 --learning_rate_decay_step=0.1 --history_length=1 --n_action_repeat=1 --t_ep_end=10 --display=True --learning_rate=0.025 --learning_rate_minimum=0.0025
    $ python main.py --network_header_type=mlp --network_output_type=normal --double_q=True --observation_dims='[16]' --env_name=CorridorSmall-v5 --t_learn_start=0.1 --learning_rate_decay_step=0.1 --history_length=1 --n_action_repeat=1 --t_ep_end=10 --display=True --learning_rate=0.025 --learning_rate_minimum=0.0025
    $ python main.py --network_header_type=mlp --network_output_type=dueling --observation_dims='[16]' --env_name=CorridorSmall-v5 --t_learn_start=0.1 --learning_rate_decay_step=0.1 --history_length=1 --n_action_repeat=1 --t_ep_end=10 --display=True --learning_rate=0.025 --learning_rate_minimum=0.0025
    $ python main.py --network_header_type=mlp --network_output_type=dueling --double_q=True --observation_dims='[16]' --env_name=CorridorSmall-v5 --t_learn_start=0.1 --learning_rate_decay_step=0.1 --history_length=1 --n_action_repeat=1 --t_ep_end=10 --display=True --learning_rate=0.025 --learning_rate_minimum=0.0025


## Results

Result of `Corridor-v5` in [[4]](#deep-reinforcement-learning-in-tensorflow) for DQN (purple), DDQN (red), Dueling DQN (green), and Dueling DDQN (blue).

![model](assets/corridor_result.png)

Result of `Breakout-v0` for DQN without frame-skip (white-blue), DQN with frame-skip (light purple), and Dueling DDQN (dark blue).

![model](assets/A1_A4_double_dueling.png)

The hyperparameters and gradient clipping are not implemented exactly as in [[4]](#deep-reinforcement-learning-in-tensorflow).


## References

- [DQN-tensorflow](https://github.com/devsisters/DQN-tensorflow)
- [DeepMind's code](https://sites.google.com/a/deepmind.com/dqn/)


## Author

Taehoon Kim / [@carpedm20](http://carpedm20.github.io/)
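
## Appendix: what `--double_q` and `dueling` compute

The `--double_q=True` and `--network_output_type=dueling` flags used above correspond to the two ideas from [[3]](#deep-reinforcement-learning-in-tensorflow) and [[4]](#deep-reinforcement-learning-in-tensorflow). The sketch below is a minimal plain-Python illustration of those ideas, not code from this repository; the function names and signatures are invented for illustration:

```python
def double_dqn_target(reward, terminal, q_online_next, q_target_next, discount=0.99):
    """Double DQN [3]: the online network selects the best next action,
    but the target network evaluates it, reducing the overestimation
    caused by plain Q-learning's max operator."""
    if terminal:
        return reward  # no bootstrap past the end of an episode
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    return reward + discount * q_target_next[best_action]


def dueling_q_values(state_value, advantages):
    """Dueling head [4]: Q(s, a) = V(s) + A(s, a) - mean_a A(s, a).
    Subtracting the mean advantage keeps the V and A streams identifiable."""
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + adv - mean_adv for adv in advantages]
```

For example, with `q_online_next = [1.0, 2.0]` and `q_target_next = [0.5, 1.5]`, the online network picks action 1, but the bootstrap uses the target network's estimate 1.5 rather than the online network's 2.0.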