{"id":17594919,"url":"https://github.com/andreaconti/supermario-dqn","last_synced_at":"2026-05-04T21:37:02.737Z","repository":{"id":73012640,"uuid":"207824039","full_name":"andreaconti/supermario-dqn","owner":"andreaconti","description":"Deep Reinforcement Learning Agent for Super Mario Bros","archived":false,"fork":false,"pushed_at":"2019-09-28T21:18:10.000Z","size":78780,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"master","last_synced_at":"2025-06-05T19:03:21.362Z","etag":null,"topics":["deep-learning","python3","reinforcement-learning","super-mario-bros-ai"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/andreaconti.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.rst","contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":"AUTHORS.rst","dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-09-11T13:49:56.000Z","updated_at":"2021-04-22T17:52:21.000Z","dependencies_parsed_at":null,"dependency_job_id":"e7d7a1dd-1c4e-4826-b64a-f873a26b28a2","html_url":"https://github.com/andreaconti/supermario-dqn","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/andreaconti/supermario-dqn","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/andreaconti%2Fsupermario-dqn","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/andreaconti%2Fsupermario-dqn/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/andreaconti%2Fsupermario-dqn/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/andreaconti%2Fsupermario-dqn/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/andreaconti","download_url":"https://codeload.github.com/andreaconti/supermario-dqn/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/andreaconti%2Fsupermario-dqn/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32626490,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-04T10:08:07.713Z","status":"ssl_error","status_checked_at":"2026-05-04T10:08:02.005Z","response_time":58,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["deep-learning","python3","reinforcement-learning","super-mario-bros-ai"],"created_at":"2024-10-22T07:24:22.683Z","updated_at":"2026-05-04T21:37:02.720Z","avatar_url":"https://github.com/andreaconti.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"supermario-dqn\n==============\n\nDeep Reinforcement Learning Agent for Super Mario Bros using OpenAI gym\n[gym-super-mario-bros](https://github.com/Kautenja/gym-super-mario-bros) environment\n\n\n## Usage\n\n~~~shell\n\n# create virtual env in the project folder\n$ python3 -m venv .venv\n$ source .venv/bin/activate\n\n# install dependencies\n$ pip3 install -r requirements\n\n# use\n$ supermario_train -h\nusage: supermario_train [-h] [--batch_size BATCH_SIZE]\n                        [--fit_interval FIT_INTERVAL] [--gamma GAMMA]\n                        [--eps_start EPS_START] [--eps_end EPS_END]\n                        [--eps_decay EPS_DECAY]\n                        [--target_update TARGET_UPDATE]\n                        [--save_path SAVE_PATH] [--memory_size MEMORY_SIZE]\n                        [--num_episodes NUM_EPISODES] [--resume RESUME]\n                        [--checkpoint CHECKPOINT] [--random] [--render]\n                        [--world_stage WORLD_STAGE WORLD_STAGE]\n                        [--actions ACTIONS] [--test TEST] [--log]\n\nHandle training\n\noptional arguments:\n  -h, --help            show this help message and exit\n  --batch_size BATCH_SIZE\n                        size of each batch used for training\n  --fit_interval FIT_INTERVAL\n                        fit every `fit_interval` examples available\n  --gamma GAMMA         discount rate used for Q-values learning\n  --eps_start EPS_START\n                        start probability to choose a random action\n  --eps_end EPS_END     end probability to choose a random action\n  --eps_decay EPS_DECAY\n                        decay of eps probabilities\n  --target_update TARGET_UPDATE\n                        number of episodes between each target dqn update\n  --save_path SAVE_PATH\n                        where save trained model\n  --memory_size MEMORY_SIZE\n                        size of replay memory\n  --num_episodes NUM_EPISODES\n                        number of games to be played before end\n  --resume RESUME       load from a checkpoint\n  --checkpoint CHECKPOINT\n                        number of episodes between each network checkpoint\n  --random              choose randomly different worlds and stages\n  --render              rendering of frames, only for debug\n  --world_stage WORLD_STAGE WORLD_STAGE\n                        select specific world and stage\n  --actions ACTIONS     select actions used between [\"simple\"]\n  --test TEST           each `test` episodes network is used and tested over\n                        an episode\n  --log                 logs episodes results\n\n# play\n$ supermario_play -h\nusage: play a game [-h] [--world_stage WORLD_STAGE WORLD_STAGE] [--skip SKIP]\n                   [--processed]\n                   model\n\npositional arguments:\n  model                 neural network model\n\noptional arguments:\n  -h, --help            show this help message and exit\n  --world_stage WORLD_STAGE WORLD_STAGE\n                        select a specific world and stage, world in [1..8],\n                        stage in [1..4]\n  --skip SKIP           number of frames to skip\n  --processed           shows frames processed for neural network\n~~~\n\n### Results\n\n~~~bash\n$ supermario_play --skip 5 --world_stage 1 1 trained/train_1_1/model.pt\n~~~\n\n| rewards | play gif |\n|---------|----------|\n|![](trained/train_1_1/rewards_over_steps.png)| ![](trained/train_1_1/play_gif.png)|\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fandreaconti%2Fsupermario-dqn","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fandreaconti%2Fsupermario-dqn","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fandreaconti%2Fsupermario-dqn/lists"}