{"id":17997588,"url":"https://github.com/airlegend/chessrl","last_synced_at":"2025-03-26T04:31:25.130Z","repository":{"id":40976722,"uuid":"216511839","full_name":"AIRLegend/ChessRL","owner":"AIRLegend","description":"Deep Reinforcement Learning for Chess","archived":false,"fork":false,"pushed_at":"2022-11-22T04:36:26.000Z","size":512,"stargazers_count":18,"open_issues_count":4,"forks_count":2,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-03-21T06:01:58.136Z","etag":null,"topics":["alpha-zero","chess","chess-engine","deep-learning","neural-network","reinforcement-learning"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/AIRLegend.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-10-21T08:07:35.000Z","updated_at":"2024-12-26T08:28:15.000Z","dependencies_parsed_at":"2022-08-19T04:22:04.984Z","dependency_job_id":null,"html_url":"https://github.com/AIRLegend/ChessRL","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AIRLegend%2FChessRL","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AIRLegend%2FChessRL/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AIRLegend%2FChessRL/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AIRLegend%2FChessRL/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/AIRLegend","download_url":"https://codeload.github.com/AIRLegend/ChessRL/tar.gz/refs/heads/master","host":{"name":"GitHub
","url":"https://github.com","kind":"github","repositories_count":245589264,"owners_count":20640254,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["alpha-zero","chess","chess-engine","deep-learning","neural-network","reinforcement-learning"],"created_at":"2024-10-29T21:19:44.308Z","updated_at":"2025-03-26T04:31:24.691Z","avatar_url":"https://github.com/AIRLegend.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Neural chess \u003cbr\u003e Reinforcement Learning based chess engine.\n\nPersonal project to build a chess engine based using reinforcement learning.\n\nThe idea is to some sort replicate the system built by DeepMind with AlphaZero. I'm\naware that the computational resources to achieve their results is huge, but my aim\nit's simply to reach an amateur chess level performance (about 1200-1400 Elo), not\nstate of the art.\n\nAt this moment, the approach I'm using is based on pre-training a model using self-play data of a Stockfish \nalgorithm. Later, the idea is to put two models to play agaisnt each other and make selection/merges of weights (RL part).\n\nIf you want to reuse this code on your project, and have any doubt [here](https://github.com/AIRLegend/ChessRL/blob/master/DOCS.md) you will find some explanation about the most important classes. 
Also, feel free to open an issue on this repo to ask.\n\n*Work in progress*\n\n## Requirements\nThe necessary Python packages (at the moment) are listed in the requirements file.\nYou can install them with:\n\n```bash\npip3 install -r requirements.txt\n```\n\nTensorFlow is also needed; you must install either `tensorflow` or `tensorflow-gpu` yourself (I used TF \u003e= 2.0 for development).\n\nAlso, you need to download the\n[Stockfish binary](https://stockfishchess.org/download/) for your platform.\nTo automate this, I made a script that downloads it:\n\n```bash\ncd res\nchmod +x get_stockfish.sh\n./get_stockfish.sh linux   # or \"mac\", depending on your platform\n```\nIf you want to download it manually, you have to put the Stockfish executable under the `res/stockfish-10-64` path so that the training script can detect it.\n\n\n## Training\n\u003e **DISCLAIMER:** This is under development: it may still contain bugs or inefficiencies, and modifications are being made.\n\n\u003e **DISCLAIMER 2:** As soon as I get acceptable results, I will also share weights/datasets with this code.\n\nTo train an agent in a supervised way you will need a saved dataset of games. The script `gen_data_stockfish.py` generates a JSON file with this dataset by playing (and recording) several games between two Stockfish instances. Execute it first to create the dataset (take a look at its possible arguments).\n\nThe main purpose of this part is to pre-train the model so that the policy head of the network reliably predicts the game outcome. 
This will be useful during the self-play phase, as the MCTS will produce better move policies (reducing the training time).\n\n```bash\ncd src/chessrl\npython gen_data_stockfish.py ../../data/dataset_stockfish.json --games 100\n```\n\nOnce we have a training dataset (generated, or your own adapted one), start the supervised training with:\n\n```bash\ncd src/chessrl\npython supervised.py ../../data/models/model1 ../../data/dataset_stockfish.json --epochs 2 --bs 4\n```\n\nOnce we have a pretrained model, we can move to the self-play phase. This process is handled by the `selfplay.py` script, which starts an instance of the model that plays against itself and, after each game, runs a training round (saving the model and the results). Take a look at its possible arguments; here is an example. (Keep in mind that this is an expensive process which takes a considerable amount of time per move.)\n\n```bash\ncd src/chessrl\npython selfplay.py ../../data/models/model1-superv --games 100\n```\n\n\n## How do I view the progress?\n\nThe training progress of the neural network can be monitored with TensorBoard:\n\n```bash\ntensorboard --logdir data/models/model1/train\n```\n(Set the \"Horizontal axis\" to \"WALL\" to view all the different runs.)\n\nAlso, in the same model directory you will find a `gameplays.json` file which\ncontains the recorded training games of the model. With this, we can study its\nbehaviour over time.\n\n## Can I play against the agent?\n\nYes. Under `src/webplayer` you will find a Flask app which serves a web interface for playing against the trained agent. There is another README with more information.\n\n\n## Literature\n\n1. Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning\n   Algorithm. Silver, D. et al. https://arxiv.org/pdf/1712.01815.pdf\n2. Mastering the game of Go without human knowledge. Silver, D. et al. 
https://www.nature.com/articles/nature24270.epdf?author_access_token=VJXbVjaSHxFoctQQ4p2k4tRgN0jAjWel9jnR3ZoTv0PVW4gB86EEpGqTRDtpIz-2rmo8-KG06gqVobU5NSCFeHILHcVFUeMsbvwS-lxjqQGg98faovwjxeTUgZAUMnRQ\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fairlegend%2Fchessrl","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fairlegend%2Fchessrl","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fairlegend%2Fchessrl/lists"}