{"id":17723433,"url":"https://github.com/vivek3141/pacman-ai","last_synced_at":"2025-04-01T13:33:43.937Z","repository":{"id":60347796,"uuid":"171736036","full_name":"vivek3141/pacman-ai","owner":"vivek3141","description":"A.I. plays the original 1980 Pacman using Neuroevolution of Augmenting Topologies and Deep Q Learning","archived":false,"fork":false,"pushed_at":"2019-04-01T18:06:11.000Z","size":1644,"stargazers_count":26,"open_issues_count":2,"forks_count":4,"subscribers_count":3,"default_branch":"master","last_synced_at":"2023-03-02T04:25:50.185Z","etag":null,"topics":["artificial-intelligence","deep-q-learning","dopamine","dqn","neat","neat-python","neural-network","neuroevolution","pacman","python","q-learning","reinforcement-learning","tensorflow","tensorflow-rl"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/vivek3141.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-02-20T19:29:21.000Z","updated_at":"2023-02-01T21:53:34.000Z","dependencies_parsed_at":"2022-09-28T10:50:25.425Z","dependency_job_id":null,"html_url":"https://github.com/vivek3141/pacman-ai","commit_stats":null,"previous_names":[],"tags_count":null,"template":null,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vivek3141%2Fpacman-ai","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vivek3141%2Fpacman-ai/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vivek3141%2Fpacman-ai/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/vivek3141%2Fpacman-ai/
manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/vivek3141","download_url":"https://codeload.github.com/vivek3141/pacman-ai/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246604609,"owners_count":20804099,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["artificial-intelligence","deep-q-learning","dopamine","dqn","neat","neat-python","neural-network","neuroevolution","pacman","python","q-learning","reinforcement-learning","tensorflow","tensorflow-rl"],"created_at":"2024-10-25T15:42:57.239Z","updated_at":"2025-04-01T13:33:42.803Z","avatar_url":"https://github.com/vivek3141.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Pacman AI\nThis project builds a program that can play the original 1980 Atari Pacman. 
\nThe approaches used are Deep-Q-Learning and Neuroevolution of Augmenting Topologies.\n\n## Running The Program\nUse the Makefile to run various parts of this project.\n* NEAT\n    * Train - `make neat-train`\n    * Test - `make neat-test`\n* DQN\n    * Train - `make rl-train`\n    * Test - `make rl-test`\n* Q Learning Demo - `make q-learning`\n\n#### Alternative\n`main.py` usage\n```bash\npython3 main.py [algorithm] [train/test]\n```\n\n## Requirements\nInstall the requirements with\n```bash\npip install -r requirements.txt\n```\n#### Alternative\n```bash\nsudo make install\n```\n\n## NEAT - Neuroevolution of Augmenting Topologies\nAll files using NEAT are stored under `NEAT/`\n\nTrain the NEAT model:\n```bash\nmake neat-train\n```\n\nTest the NEAT model:\n```bash\nmake neat-test\n```\n\n### Explanation\nFor an explanation of a project using the same algorithm, watch [this video](https://www.youtube.com/watch?v=UdJ4titVY7I).\n#### Short Explanation\nThis program uses a mathematical model, called a neural network, which is loosely inspired by the human brain. \nA neural network takes inputs and produces a value for each of its outputs. Passing each output through\na sigmoid function squashes it into the range (0, 1), so it can be read as a probability-like score. \u003cbr\u003e\u003cbr\u003e\n![Sigmoid](https://qph.fs.quoracdn.net/main-qimg-07066668c05a556f1ff25040414a32b7)\n\u003cbr\u003e\u003cbr\u003e\n`Neuroevolution of Augmenting Topologies`, or `NEAT`, is what this project uses. Standard\n`neuroevolution` works by randomly initializing a population of neural networks and\nusing survival of the fittest to find the best model. The best networks in each generation\nare bred and some mutations are introduced. `NEAT` adds features like speciation to\nmake a much more effective neuroevolution model. 
On some tasks, neuroevolution can be competitive with standard\nreinforcement learning models.\u003cbr\u003e\n\n## DQN - Deep Q-Learning\nAll files using DQN are stored under `DQN/`\n\nTrain the DQN model:\n```bash\nmake rl-train\n```\n\nTest the DQN model:\n```bash\nmake rl-test\n```\n\n### Explanation\n\n#### Q-Learning\nQ-learning learns the action-value function Q(s, a): how good it is to take a given action in a particular state.\nHere's what each term means:\n\n* state - The observation you take from the environment. In this case, it would be an image\nof the Pacman game or the RAM of the Atari console.\u003cbr\u003e\n![Pacman State](https://i.imgur.com/2yT83gV.jpg)\n\n* action - The program's output for this particular state. For example, an action \nwould be to move left, up, right, or down in Pacman.\n\n* reward - A reward is a number that tells how good or bad the result of an action was. In this\ncase, the reward can be based on the game score.\n\n* Q(s, a) - Q is called the action-value function. In Q-learning, we build a table,\ncalled the Q-table, with an entry for every state-action pair. The Q-table helps determine \nwhat action to choose. \n\nThis kind of state-action-reward system, where the next state depends only on the\ncurrent state and the chosen action, is called a Markov Decision Process.\u003cbr\u003e\n\n![MDP](https://qph.fs.quoracdn.net/main-qimg-f92c275af47e561651857f9af6bb85e9)\n\n##### How Q-Learning Works\n\n![Q-Learning](https://cdn-images-1.medium.com/max/1600/1*QeoQEqWYYPs1P8yUwyaJVQ.png)\n\n#### Deep-Q-Learning\n\nDeep Q-learning is a variant of Q-learning\nin which the Q-function is approximated by a deep neural network. 
\nThe input to the neural network is the state of the environment\nand the outputs are the Q-values, one for each possible action.\nThe action with the maximum predicted Q-value is chosen as the action \nto take in the environment.\n![DQN](https://cdn-images-1.medium.com/max/1200/1*0_TNa54fr_LsLOllgIsrcw.png)","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvivek3141%2Fpacman-ai","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fvivek3141%2Fpacman-ai","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvivek3141%2Fpacman-ai/lists"}