{"id":20304355,"url":"https://github.com/dev-michael-schmidt/ai-atari-gamer","last_synced_at":"2026-05-09T01:40:10.517Z","repository":{"id":229901208,"uuid":"759641534","full_name":"dev-michael-schmidt/ai-atari-gamer","owner":"dev-michael-schmidt","description":"I'm an AI, I play Atari's breakout in my spare time.","archived":false,"fork":false,"pushed_at":"2024-03-26T20:25:28.000Z","size":14054,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-01-14T11:16:23.457Z","etag":null,"topics":["atari-games","convolutional-neural-networks","reinforcement-learning"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/dev-michael-schmidt.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null}},"created_at":"2024-02-19T03:43:48.000Z","updated_at":"2024-06-04T20:56:23.000Z","dependencies_parsed_at":"2024-03-26T21:44:21.876Z","dependency_job_id":null,"html_url":"https://github.com/dev-michael-schmidt/ai-atari-gamer","commit_stats":null,"previous_names":["dev-michael-schmidt/atari-breakout-cnn","dev-michael-schmidt/ai-atari-gamer"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dev-michael-schmidt%2Fai-atari-gamer","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dev-michael-schmidt%2Fai-atari-gamer/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dev-michael-schmidt%2Fai-atari-gamer/releases","manifests_url":"https://repos.ecosyst
e.ms/api/v1/hosts/GitHub/repositories/dev-michael-schmidt%2Fai-atari-gamer/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/dev-michael-schmidt","download_url":"https://codeload.github.com/dev-michael-schmidt/ai-atari-gamer/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":241801257,"owners_count":20022390,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["atari-games","convolutional-neural-networks","reinforcement-learning"],"created_at":"2024-11-14T16:43:55.655Z","updated_at":"2026-05-09T01:40:10.467Z","avatar_url":"https://github.com/dev-michael-schmidt.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"## Breakout Atari Agent\r\n\u003cimg src=\"https://github.com/CSCI4850/S18-team1-project/blob/master/breakout.gif\" width=\"200px\" height=\"auto\"\u003e\r\n\r\n**What:** A CNN that learned to play a video game, [Atari's Breakout](https://gymnasium.farama.org/environments/atari/breakout/).\r\n\r\nResults of training on a dusty GTX 1080 for 10 hours / overnight.\r\n\r\n### Model:\r\nThis model consists of a Convolutional Neural Network with a preprocessed frame from Breakout of a (210, 160, 3) tuple =\u003e (84, 84) grayscale down-sized frame and a linear output size of 4:\r\n- no-op\r\n- fire\r\n- move left\r\n- move right\r\n\r\nwhich gets reduced down to 3 (no-op, move left, move right) because (fire) in breakout is basically a no-op. 
\\\r\nThe model uses the Adam optimizer with a logcosh, mean squared error, or huber loss function.\r\n\r\n### Requirements:\r\nPython3.8: create using conda or [asdf](https://asdf-vm.com/)\r\n\r\n```pip install -r requirements.txt```\r\n~~You will also need ```pip install tensorflow-gpu==1.7.0``` if you are using a GPU to train.~~ I will also need to determine which tensorflow package if training on a GPU. `requirements.txt` / `numpy` was recently updated and I don't know.  I just don't.\r\n\r\n### Python Components (located in src):\r\n1. `breakout.py`:\r\n  The main breakout game loop. Integrates with all of the components.\r\n\r\n2. `DQNAgent.py`:\r\n  The Deep Q Network Agent for learning the breakout game.\r\n\r\n3. `ReplayMemory.py`:\r\n  The Remembering and Replaying for the DQNAgent to learn.\r\n  \r\n4. `hyperparameters.py`:\r\n  All of the hyperparameters\r\n  \r\n5. `discrete_frames.py`:\r\n    Discrete frames into the model and memory. More memory footprint, more backpropagation steps.\r\n\r\n6. `sliding_frames.py`:\r\n    Sliding frames into the model and memory. Less memory footprint, less backpropagation steps.\r\n    \r\n7. 
`utils.py`:\r\n    List of utility functions used by numerous components.\r\n\r\n### Breakout Main Loop: \r\n    'GAME' : 'BreakoutDeterministic-v4', # Name of which game to use\r\n                                         # v1-4 Deterministic or Not\r\n\r\n    'DISCRETE_FRAMING' : True,     # 2 discrete sets of frames stored in memory\r\n    \r\n    'LOAD_WEIGHTS' : '',           # Loads weights into the model if so desired\r\n                                   # leave '' if starting from a new model\r\n\r\n    'RENDER_ENV' : False,          # shows the screen of the game as it learns\r\n                                   # massively slows the training down when True\r\n                                   # default: False\r\n\r\n    'HEIGHT' : 84,                 # height in pixels\r\n    'WIDTH'  : 84,                 # and width in pixels that the game window will get downscaled to\r\n                                   # defaults: 84, 84\r\n\r\n    'FRAME_SKIP_SIZE' : 4,         # how many frames we skip and how many times we \r\n                                   # choose an action consecutively for that many frames.\r\n                                   # default: 4\r\n    \r\n    'MAX_EPISODES' : 12000,        # defined as how many cycles of full life to end life or\r\n                                   # winning a round\r\n                                   # default: 12,000\r\n\r\n    'MAX_FRAMES' : 50000000,       # max number of frames allowed to pass before stopping\r\n                                   # default: 50,000,000 (how many google used)\r\n\r\n    'SAVE_MODEL' : 500,            # how many episodes should we go through until we save the model?\r\n                                   # default: whenever you want to save the model\r\n\r\n    'TARGET_UPDATE' : 10000,       # on what mod epochs should we update the target network?\r\n                                   # default: 10000\r\n    \r\n#### DQNAgent:\r\n    'WATCH_Q' : False,             # 
watch the Q function and see what decision it picks\r\n                                   # cool to watch\r\n                                   # default: False\r\n\r\n    'LEARNING_RATE' : 0.00025,     # learning rate of the Adam optimizer\r\n                                   # default: 0.00025\r\n        \r\n    'INIT_EXPLORATION' : 1.0,      # exploration rate, start at 100%\r\n    'EXPLORATION' : 1000000,       # how many frames we decay till\r\n    'MIN_EXPLORATION' : 0.1,       # ending exploration rate\r\n                                   # defaults: 1.0, 1,000,000, 0.1\r\n    \r\n    'OPTIMIZER' : 'Adam',          # optimizer used\r\n                                   # default: RMSprop or Adam\r\n    \r\n    'MIN_SQUARED_GRADIENT' : 0.01, # epsilon rate\r\n                                   # default: 0.01\r\n    \r\n    'GRADIENT_MOMENTUM' : 0.95,    # momentum into the gradient used\r\n                                   # default: 0.95\r\n\r\n    'LOSS' : 'huber',              # can be 'logcosh' for logarithm of hyperbolic cosine\r\n                                   # or 'mse' for mean squared error\r\n                                   # or 'huber' for huber loss\r\n                                   # default: logcosh, mse, or huber\r\n        \r\n    'NO-OP_MAX' : 30,              # how many times no-op can be called at the beginning\r\n                                   # of a single episode, reduces using the same state\r\n                                   # at the beginning and increases variance of similar states\r\n                                   # default: 30 (don't set this too high or we may lose before acting!)\r\n#### Replay and Remember Memory:\r\n    'SHOW_FIT' : 0,                # shows the fit of the model and its work, turn to 0 for off\r\n                                   # default: 0 for off\r\n    \r\n    'REPLAY_START' : 50000,        # when to start using replay to update the model\r\n                                   # 
default: 50000 frames\r\n\r\n    'MEMORY_SIZE' : 1000000,       # size of the memory bank\r\n                                   # default: 1,000,000\r\n\r\n    'GAMMA' : 0.99,                # integration of rewards, discount factor, \r\n                                   # preference for present rewards as opposed to future rewards\r\n                                   # default: 0.99\r\n    # 4 * 8 = 32 batch\r\n    'REPLAY_ITERATIONS' : 4,       # how many iterations of replay\r\n                                   # default: 4\r\n\r\n    'BATCH_SIZE' : 8               # batch size used to learn\r\n                                   # default: 8\r\n                                   \r\n### Instructions:\r\nTo start the breakout game with the DQN Agent, run ```python3 breakout.py``` \\\r\nTo change how the DQN Agent learns, modify hyperparameters.py\r\n\r\n### Demo:\r\nTo start the demo, run ```python3 DQN_Testing.py```\r\nAlternatively, there is a python notebook under DQN_Testing.ipynb which renders every 6 frames.\r\n\r\n\r\n### Useful References (From start to keras):\r\n1. http://docs.python-guide.org/en/latest/starting/installation/\r\n2. https://www.makeuseof.com/tag/install-pip-for-python/\r\n3. https://storage.googleapis.com/deepmind-media/dqn/DQNNaturePaper.pdf\r\n4. https://github.com/dennybritz/reinforcement-learning/issues/30\r\n5. https://github.com/tokb23/dqn/blob/master/dqn.py\r\n6. https://github.com/jcwleo/Reinforcement_Learning/blob/master/Breakout/Breakout_DQN_class.py\r\n7. https://medium.com/mlreview/speeding-up-dqn-on-pytorch-solving-pong-in-30-minutes-81a1bd2dff55\r\n8. https://becominghuman.ai/beat-atari-with-deep-reinforcement-learning-part-2-dqn-improvements-d3563f665a2c\r\n9. 
https://github.com/keras-rl/keras-rl\r\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdev-michael-schmidt%2Fai-atari-gamer","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdev-michael-schmidt%2Fai-atari-gamer","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdev-michael-schmidt%2Fai-atari-gamer/lists"}