{"id":18717293,"url":"https://github.com/thisiscetin/ttt_qlearning","last_synced_at":"2026-05-02T03:03:17.114Z","repository":{"id":143687877,"uuid":"246593713","full_name":"thisiscetin/ttt_qlearning","owner":"thisiscetin","description":"TicTacToe game with Double Q-learning.","archived":false,"fork":false,"pushed_at":"2020-03-21T18:32:17.000Z","size":191,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2024-12-28T10:37:55.576Z","etag":null,"topics":["qlearning","reinforcement-learning","reinforcement-learning-excercises","tictactoe-game"],"latest_commit_sha":null,"homepage":"","language":"C++","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/thisiscetin.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null}},"created_at":"2020-03-11T14:29:33.000Z","updated_at":"2024-05-09T01:49:58.000Z","dependencies_parsed_at":"2024-03-11T03:03:16.611Z","dependency_job_id":"328468d8-0951-462d-81dc-ca4eba05e70b","html_url":"https://github.com/thisiscetin/ttt_qlearning","commit_stats":null,"previous_names":["c7n0/ttt_qlearning"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/thisiscetin%2Fttt_qlearning","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/thisiscetin%2Fttt_qlearning/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/thisiscetin%2Fttt_qlearning/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/thisiscetin%2Fttt_qlearning/manifests","owner_url":"h
ttps://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/thisiscetin","download_url":"https://codeload.github.com/thisiscetin/ttt_qlearning/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":239581796,"owners_count":19662958,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["qlearning","reinforcement-learning","reinforcement-learning-excercises","tictactoe-game"],"created_at":"2024-11-07T13:15:40.645Z","updated_at":"2025-11-10T12:30:19.160Z","avatar_url":"https://github.com/thisiscetin.png","language":"C++","funding_links":[],"categories":[],"sub_categories":[],"readme":"# TicTacToe game with Double Q-Learning\n\nThe aim of this project is to build a model-free reinforcement learning algorithm (Q-learning) that can play tic-tac-toe better than a human does. When the application runs, the agent starts playing against two other agents. One opponent picks mostly random moves, while the other makes somewhat smarter moves.\n\nAt the same time, you can play against the agent by entering a board position in the range [0, 8]. 0 refers to the (1, 1) cell of the tic-tac-toe board, while 8 refers to (3, 3).\n\n## Building\n\n```\n\u003e cmake . \u0026\u0026 make\n```\nThis command should create a binary named `game` in the `bin/` folder.\n\n## Running the game\nRun the following command from the base (ttt_qlearning) folder:\n\n```\n\u003e bin/game\n```\n\nThis starts 3 agents playing TicTacToe:\n- Agent A plays against Agent B.\n- Agent A plays against Agent C.\n\nYou will also be prompted to enter a number to mark a cell on the board. 
While training runs, you can play against Agent A yourself and watch it improve.\n\n```\n[agent a vs. b]\tagent 0 won %: 58.547, agent 1 won %: 41.0874\t\t| agent 0 double table (action) size: 29936\n[agent a vs. c]\tagent 0 won %: 51.6226, agent 1 won %: 48.2668\t\t| agent 0 double table (action) size: 30034\n\n-o-\nxo-\nx--\n\n\nEnter pos [0-8]: \n```\n\nYou can keep playing for as long as training continues.\n\n## References\n- [Reinforcement Learning by Deepsense](https://deepsense.ai/what-is-reinforcement-learning-the-complete-guide/)\n- [Double Q-Learning](https://towardsdatascience.com/double-q-learning-the-easy-way-a924c4085ec3)\n- [Wikipedia Q-learning](https://en.wikipedia.org/wiki/Q-learning)\n- [Wikipedia Tic Tac Toe](https://en.wikipedia.org/wiki/Tic-tac-toe)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fthisiscetin%2Fttt_qlearning","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fthisiscetin%2Fttt_qlearning","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fthisiscetin%2Fttt_qlearning/lists"}