{"id":18587626,"url":"https://github.com/sgrvinod/chess-transformers","last_synced_at":"2025-04-06T01:07:04.532Z","repository":{"id":207461821,"uuid":"639562692","full_name":"sgrvinod/chess-transformers","owner":"sgrvinod","description":"Teaching transformers to play chess","archived":false,"fork":false,"pushed_at":"2025-01-25T23:00:35.000Z","size":11930,"stargazers_count":119,"open_issues_count":2,"forks_count":6,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-03-30T00:06:08.214Z","etag":null,"topics":["chess","chess-engine","pytorch","transformers"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sgrvinod.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-05-11T18:07:10.000Z","updated_at":"2025-03-19T20:52:43.000Z","dependencies_parsed_at":null,"dependency_job_id":"a8fe290a-b0e0-4e01-844b-a74d7a9582a7","html_url":"https://github.com/sgrvinod/chess-transformers","commit_stats":null,"previous_names":["sgrvinod/chess-transformers"],"tags_count":7,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sgrvinod%2Fchess-transformers","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sgrvinod%2Fchess-transformers/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sgrvinod%2Fchess-transformers/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sgrvinod%2Fchess-transformers/manifests","owner_url":"ht
tps://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sgrvinod","download_url":"https://codeload.github.com/sgrvinod/chess-transformers/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247419860,"owners_count":20936012,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["chess","chess-engine","pytorch","transformers"],"created_at":"2024-11-07T00:39:52.512Z","updated_at":"2025-04-06T01:07:04.497Z","avatar_url":"https://github.com/sgrvinod.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cp align=\"center\"\u003e\n  \u003cimg width=\"300\" src=\"img/logo.png\"/\u003e\n\u003c/p\u003e\n\n\u003ch1 align=\"center\"\u003e\u003ci\u003eChess Transformers\u003c/i\u003e\u003c/h1\u003e\n\u003cp align=\"center\"\u003e\u003ci\u003eTeaching transformers to play chess\u003c/i\u003e\u003c/p\u003e\n\u003cp align=\"center\"\u003e \u003ca href=\"https://github.com/sgrvinod/chess-transformers/releases/tag/v0.3.1\"\u003e\u003cimg alt=\"Version\" src=\"https://img.shields.io/github/v/tag/sgrvinod/chess-transformers?label=version\"\u003e\u003c/a\u003e \u003ca href=\"https://github.com/sgrvinod/chess-transformers/blob/main/LICENSE\"\u003e\u003cimg alt=\"License\" src=\"https://img.shields.io/github/license/sgrvinod/chess-transformers?label=license\"\u003e\u003c/a\u003e\u003c/p\u003e\n\u003cbr\u003e\n\n*Chess Transformers* is a library for training transformer models to play chess by learning from human games. 
\n\n## Contents\n\n[**Install**](#install)\n\n[**Models**](#models)\n\n[**Datasets**](#datasets)\n\n[**Play**](#play)\n\n[**Train Models**](#train-models)\n\n[**Contribute**](#contribute)\n\n\n## Install\n\nTo install *Chess Transformers*, clone this repository and install as a Python package locally.\n\n```\ngh repo clone sgrvinod/chess-transformers\ncd chess-transformers\npip install .\n```\n\nIf you are planning to develop or contribute or make changes to the codebase, install the package in \u003cins\u003eeditable mode\u003c/ins\u003e, using the `-e` flag.\n\n```\npip install -e .\n```\n\n**OPTIONAL** — If you want to train or evaluate a model, you may need to set some of the following environment variables on your computer:\n\n  - Set **`CT_DATA_FOLDER`** to the folder on your computer where you have the training data. You \u003cins\u003edo not\u003c/ins\u003e need to set this if you do not plan to train any models. \n\n  - Set **`CT_STOCKFISH_PATH`** to the executable of the Stockfish 16 chess engine. You \u003cins\u003edo not\u003c/ins\u003e need to set this if you do not plan to have a model play against this chess engine.\n\n  - Set **`CT_FAIRY_STOCKFISH_PATH`** to the executable of the Fairy Stockfish chess engine. 
You \u003cins\u003edo not\u003c/ins\u003e need to set this if you do not plan to have a model play against this chess engine.\n\n## Models\n\nThere are currently four models available for use in *Chess Transformers*.\n\n|          Model Name           | # Params |      Training Data      |            Architecture             |                                                                 Predictions                                                                  |\n| :---------------------------: | :------: | :---------------------: | :---------------------------------: | :------------------------------------------------------------------------------------------------------------------------------------------: |\n|   [***CT-E-20***](#ct-e-20)   |   20M    | [***LE22ct***](#le22ct) |      Transformer encoder only       |                                                 Best next half-move (or ply) \u003cbr\u003e eg. *f2e3*                                                 |\n| [***CT-EFT-20***](#ct-eft-20) |   20M    | [***LE22ct***](#le22ct) |      Transformer encoder only       |                            Best *From* and *To* squares corresponding to the next half-move eg. from *f2* to *e3*                            |\n|  [***CT-ED-45***](#ct-ed-45)  |   45M    | [***LE22ct***](#le22ct) | Transformer encoder \u003cbr\u003eand decoder | Sequence of half-moves (or plies) \u003cbr\u003e eg. *f2e3* -\u003e *b4b3* -\u003e *e3h6* -\u003e *b3b2* -\u003e *g4e6* -\u003e *g8f8* -\u003e *g3g7* -\u003e *f8e8* -\u003e *g7f7* -\u003e *loses* |\n| [***CT-EFT-85***](#ct-eft-85) |   85M    |  [***LE22c***](#le22c)  |      Transformer encoder only       |                            Best *From* and *To* squares corresponding to the next half-move eg. 
from *f2* to *e3*                            |\n\nAll models are evaluated against the [Fairy Stockfish](https://github.com/fairy-stockfish/Fairy-Stockfish) chess engine at increasing strength levels 1 to 6, [as predefined](https://github.com/lichess-org/fishnet/blob/dc4be23256e3e5591578f0901f98f5835a138d73/src/api.rs#L224) for use in the popular Stockfish chess bots on Lichess. The engine is run on an AMD Ryzen 7 3800X 8-Core Processor, with 8 CPU threads, and a hash table size of 8 GB. All other engine parameters are at their default values.\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"img/win_ratio.png\" style=\"width: 60%;\"/\u003e\n\u003c/p\u003e\n\nAt each strength level of the chess engine, $n=1000$ games are played by the model, i.e. $500$ games each with black and white pieces. \n\nWin ratios and the difference between the Elo rating of the model and the chess engine are calculated from these games' outcomes.\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"img/elo_difference.png\" style=\"width: 60%;\"/\u003e\n\u003c/p\u003e\n\nDetailed evaluation results for each model are provided below.\n\n### *CT-E-20*\n\n[**Configuration File**](chess_transformers/configs/models/CT-E-20.py) | [**Checkpoint**](https://chesstransformers.blob.core.windows.net/checkpoints/CT-E-20/averaged_CT-E-20.pt) | \n[**TensorBoard Logs**](https://chesstransformers.blob.core.windows.net/logs/CT-E-20.zip) \n\nThis is the encoder from the original transformer model in [*Vaswani et al. (2017)*](https://arxiv.org/abs/1706.03762) trained on the [*LE22ct*](#le22ct) dataset. A classification head at the **`turn`** token predicts the best half-move to be made (in UCI notation).\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"img/ct_e_20.png\"/\u003e\n\u003c/p\u003e\n\nThis is essentially a sequence (or image) classification task, where the sequence is the current state of the board, and the classes are the various moves that can be made on a chessboard in UCI notation. 
\n\n*CT-E-20* contains about 20 million parameters.\n\n```python\nfrom chess_transformers.play import load_model\nfrom chess_transformers.configs import import_config\n\nCONFIG = import_config(\"CT-E-20\")\nmodel = load_model(CONFIG)\n```\n\nYou \u003cins\u003edo not\u003c/ins\u003e need to download the model checkpoint manually. It will be downloaded automatically if required.\n\n#### Model Strength\n\n*CT-E-20* was evaluated against the [Fairy Stockfish](https://github.com/fairy-stockfish/Fairy-Stockfish) chess engine at various strength levels [as predefined](https://github.com/lichess-org/fishnet/blob/dc4be23256e3e5591578f0901f98f5835a138d73/src/api.rs#L224) for use in the popular Stockfish chess bots on Lichess. The engine is run on an AMD Ryzen 7 3800X 8-Core Processor, with 8 CPU threads, and a hash table size of 8 GB. All other engine parameters are at their default values.\n\nThese evaluation games can be viewed [here](chess_transformers/evaluate/games/CT-E-20/).\n\n| Strength Level | Games | Wins  | Losses | Draws |          Win Ratio          |      Elo Difference      | Likelihood of Superiority |\n| :------------: | :---: | :---: | :----: | :---: | :-------------------------: | :----------------------: | :-----------------------: |\n|      $LL$      |  $n$  |  $w$  |  $l$   |  $d$  | $\\frac{w + \\frac{d}{2}}{n}$ |      $\\Delta_{Elo}$      |           $LOS$           |\n|     **1**      | 1000  |  989  |   0    |  11   |         **99.45%**          | 902.90 \u003cbr\u003e *(± 117.67)* |          100.00%          |\n|     **2**      | 1000  |  980  |   0    |  20   |         **99.00%**          | 798.25 \u003cbr\u003e *(± 81.48)*  |          100.00%          |\n|     **3**      | 1000  |  872  |   61   |  67   |         **90.55%**          | 392.58 \u003cbr\u003e *(± 33.31)*  |          100.00%          |\n|     **4**      | 1000  |  431  |  455   |  114  |         **48.80%**          |  -8.34 \u003cbr\u003e *(± 20.30)*  |          21.00%           
|\n|     **5**      | 1000  |  205  |  685   |  110  |         **26.00%**          | -181.70 \u003cbr\u003e *(± 22.78)* |           0.00%           |\n|     **6**      | 1000  |  24   |  952   |  24   |          **3.60%**          | -571.11 \u003cbr\u003e *(± 54.08)* |           0.00%           |\n\n\n### *CT-EFT-20*\n\n[**Configuration File**](chess_transformers/configs/models/CT-EFT-20.py) | [**Checkpoint**](https://chesstransformers.blob.core.windows.net/checkpoints/CT-EFT-20/averaged_CT-EFT-20.pt) | \n[**TensorBoard Logs**](https://chesstransformers.blob.core.windows.net/logs/CT-EFT-20.zip) \n\nThis is the encoder from the original transformer model in [*Vaswani et al. (2017)*](https://arxiv.org/abs/1706.03762) trained on the [*LE22ct*](#le22ct) dataset. Two classification heads operate upon the encoder outputs at all chessboard squares to predict the best candidates for the source (*From*) and destination (*To*) squares that correspond to the best half-move to be made.\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"img/ct_eft_20.png\"/\u003e\n\u003c/p\u003e\n\nThis is essentially a sequence (or image) labeling task, where the sequence is the current state of the chessboard, and each square competes to be labeled as the *From* or *To* square.\n\n*CT-EFT-20* contains about 20 million parameters.\n\n```python\nfrom chess_transformers.play import load_model\nfrom chess_transformers.configs import import_config\n\nCONFIG = import_config(\"CT-EFT-20\")\nmodel = load_model(CONFIG)\n```\n\nYou \u003cins\u003edo not\u003c/ins\u003e need to download the model checkpoint manually. 
It will be downloaded automatically if required.\n\n#### Model Strength\n\n*CT-EFT-20* was evaluated against the [Fairy Stockfish](https://github.com/fairy-stockfish/Fairy-Stockfish) chess engine at various strength levels [as predefined](https://github.com/lichess-org/fishnet/blob/dc4be23256e3e5591578f0901f98f5835a138d73/src/api.rs#L224) for use in the popular Stockfish chess bots on Lichess. The engine is run on an AMD Ryzen 7 3800X 8-Core Processor, with 8 CPU threads, and a hash table size of 8 GB. All other engine parameters are at their default values.\n\nThese evaluation games can be viewed [here](chess_transformers/evaluate/games/CT-EFT-20/).\n\n| Strength Level | Games | Wins  | Losses | Draws |          Win Ratio          |      Elo Difference       | Likelihood of Superiority |\n| :------------: | :---: | :---: | :----: | :---: | :-------------------------: | :-----------------------: | :-----------------------: |\n|      $LL$      |  $n$  |  $w$  |  $l$   |  $d$  | $\\frac{w + \\frac{d}{2}}{n}$ |      $\\Delta_{Elo}$       |           $LOS$           |\n|     **1**      | 1000  |  994  |   0    |   6   |         **99.70%**          | 1008.63 \u003cbr\u003e *(± 190.18)* |          100.00%          |\n|     **2**      | 1000  |  988  |   0    |  12   |         **99.40%**          | 887.69 \u003cbr\u003e *(± 111.13)*  |          100.00%          |\n|     **3**      | 1000  |  942  |   11   |  47   |         **96.55%**          |  578.77 \u003cbr\u003e *(± 48.57)*  |          100.00%          |\n|     **4**      | 1000  |  697  |  192   |  111  |         **75.25%**          |  193.17 \u003cbr\u003e *(± 23.08)*  |          100.00%          |\n|     **5**      | 1000  |  482  |  379   |  139  |         **55.15%**          |  35.91 \u003cbr\u003e *(± 20.09)*   |          99.98%           |\n|     **6**      | 1000  |  61   |  872   |  67   |          **9.45%**          | -392.58 \u003cbr\u003e *(± 33.31)*  |           0.00%           |\n\n\n### 
*CT-ED-45*\n\n[**Configuration File**](chess_transformers/configs/models/CT-ED-45.py) | [**Checkpoint**](https://chesstransformers.blob.core.windows.net/checkpoints/CT-ED-45/averaged_CT-ED-45.pt) | \n[**TensorBoard Logs**](https://chesstransformers.blob.core.windows.net/logs/CT-ED-45.zip) \n\nThis is the original transformer model (encoder *and* decoder) in [*Vaswani et al. (2017)*](https://arxiv.org/abs/1706.03762) trained on the [*LE22ct*](#le22ct) dataset. A classification head after the last decoder layer predicts a sequence of half-moves, starting with the best half-move to be made next, followed by the likely course of the game an arbitrary number of half-moves into the future. \n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"img/ct_ed_45.png\"/\u003e\n\u003c/p\u003e\n\nThis is essentially a sequence-to-sequence (or image-to-sequence) task, where the input sequence is the current state of the board, and the output sequence is a string of half-moves that will likely occur on the board from that point onwards. Potentially, strategies applied to such tasks, such as beam search for decoding the best possible sequence of half-moves, can also be applied to this model. Training the model to predict not only the best half-move to make on the board right now, but also the sequence of half-moves that follow, can be viewed as a type of multitask training. \n\nWe are ultimately only interested in the very first half-move. Nevertheless, the full sequence of half-moves might help explain the model's decision for this important first half-move.\n\n*CT-ED-45* contains about 45 million parameters.\n\n```python\nfrom chess_transformers.play import load_model\nfrom chess_transformers.configs import import_config\n\nCONFIG = import_config(\"CT-ED-45\")\nmodel = load_model(CONFIG)\n```\nYou \u003cins\u003edo not\u003c/ins\u003e need to download the model checkpoint manually. 
It will be downloaded automatically if required.\n\n#### Model Strength\n\n*CT-ED-45* was evaluated against the [Fairy Stockfish](https://github.com/fairy-stockfish/Fairy-Stockfish) chess engine at various strength levels [as predefined](https://github.com/lichess-org/fishnet/blob/dc4be23256e3e5591578f0901f98f5835a138d73/src/api.rs#L224) for use in the popular Stockfish chess bots on Lichess. The engine is run on an AMD Ryzen 7 3800X 8-Core Processor, with 8 CPU threads, and a hash table size of 8 GB. All other engine parameters are at their default values.\n\n| Strength Level | Games | Wins  | Losses | Draws |          Win Ratio          |      Elo Difference       | Likelihood of Superiority |\n| :------------: | :---: | :---: | :----: | :---: | :-------------------------: | :-----------------------: | :-----------------------: |\n|      $LL$      |  $n$  |  $w$  |  $l$   |  $d$  | $\\frac{w + \\frac{d}{2}}{n}$ |      $\\Delta_{Elo}$       |           $LOS$           |\n|     **1**      | 1000  |  976  |   0    |  24   |         **98.80%**          |  766.23 \u003cbr\u003e *(± 73.45)*  |          100.00%          |\n|     **2**      | 1000  |  977  |   2    |  21   |         **98.75%**          |  759.05 \u003cbr\u003e *(± 78.19)*  |          100.00%          |\n|     **3**      | 1000  |  676  |  244   |  80   |         **71.60%**          |  160.64 \u003cbr\u003e *(± 22.72)*  |          100.00%          |\n|     **4**      | 1000  |  195  |  726   |  79   |         **23.45%**          | -205.52 \u003cbr\u003e *(± 24.04)*  |           0.00%           |\n|     **5**      | 1000  |  67   |  895   |  38   |          **8.60%**          | -410.58 \u003cbr\u003e *(± 36.41)*  |           0.00%           |\n|     **6**      | 1000  |   6   |  987   |   7   |          **0.95%**          | -807.25 \u003cbr\u003e *(± 113.69)* |           0.00%           |\n\n\n### *CT-EFT-85*\n\n[**Configuration File**](chess_transformers/configs/models/CT-EFT-85.py) | 
[**Checkpoint**](https://chesstransformers.blob.core.windows.net/checkpoints/CT-EFT-85/averaged_CT-EFT-85.pt) | \n[**TensorBoard Logs**](https://chesstransformers.blob.core.windows.net/logs/CT-EFT-85.zip) \n\nThis is a larger version of the encoder from the original transformer model in [*Vaswani et al. (2017)*](https://arxiv.org/abs/1706.03762) trained on the [*LE22c*](#le22c) dataset. Its size is analogous to BERT\u003csub\u003eBASE\u003c/sub\u003e in [*Devlin et al. (2018)*](https://arxiv.org/abs/1810.04805). Two classification heads operate upon the encoder outputs at all chessboard squares to predict the best candidates for the source (*From*) and destination (*To*) squares that correspond to the best half-move to be made.\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"img/ct_eft_85.png\"/\u003e\n\u003c/p\u003e\n\nThis is essentially a sequence (or image) labeling task, where the sequence is the current state of the chessboard, and each square competes to be labeled as the *From* or *To* square.\n\n*CT-EFT-85* contains about 85 million parameters.\n\n```python\nfrom chess_transformers.play import load_model\nfrom chess_transformers.configs import import_config\n\nCONFIG = import_config(\"CT-EFT-85\")\nmodel = load_model(CONFIG)\n```\n\nYou \u003cins\u003edo not\u003c/ins\u003e need to download the model checkpoint manually. It will be downloaded automatically if required.\n\n#### Model Strength\n\n*CT-EFT-85* was evaluated against the [Fairy Stockfish](https://github.com/fairy-stockfish/Fairy-Stockfish) chess engine at various strength levels [as predefined](https://github.com/lichess-org/fishnet/blob/dc4be23256e3e5591578f0901f98f5835a138d73/src/api.rs#L224) for use in the popular Stockfish chess bots on Lichess. The engine is run on an AMD Ryzen 7 3800X 8-Core Processor, with 8 CPU threads, and a hash table size of 8 GB. 
All other engine parameters are at their default values.\n\nThese evaluation games can be viewed [here](chess_transformers/evaluate/games/CT-EFT-85/).\n\n| Strength Level | Games | Wins  | Losses | Draws |          Win Ratio          |      Elo Difference       | Likelihood of Superiority |\n| :------------: | :---: | :---: | :----: | :---: | :-------------------------: | :-----------------------: | :-----------------------: |\n|      $LL$      |  $n$  |  $w$  |  $l$   |  $d$  | $\\frac{w + \\frac{d}{2}}{n}$ |      $\\Delta_{Elo}$       |           $LOS$           |\n|     **1**      | 1000  |  999  |   0    |   1   |         **99.95%**          | 1320.33 \u003cbr\u003e *(± 34.06)*  |          100.00%          |\n|     **2**      | 1000  |  997  |   0    |   3   |         **99.85%**          | 1129.30 \u003cbr\u003e *(± 101.08)* |          100.00%          |\n|     **3**      | 1000  |  979  |   0    |  21   |         **98.95%**          |  789.69 \u003cbr\u003e *(± 79.22)*  |          100.00%          |\n|     **4**      | 1000  |  883  |   65   |  52   |         **90.90%**          |  399.81 \u003cbr\u003e *(± 34.71)*  |          100.00%          |\n|     **5**      | 1000  |  712  |  183   |  105  |         **76.45%**          |  204.55 \u003cbr\u003e *(± 23.52)*  |          100.00%          |\n|     **6**      | 1000  |  184  |  713   |  103  |         **23.55%**          | -204.55 \u003cbr\u003e *(± 23.56)*  |           0.00%           |\n\n\n## Datasets\n\nThere are currently four training datasets available in *Chess Transformers*.\n\n|      Dataset Name       |                                    Components                                    | # Datapoints |\n| :---------------------: | :------------------------------------------------------------------------------: | :----------: |\n|  [***ML23c***](#ml23c)  | Board positions, turn, castling rights, next-move sequence (up to 10 half-moves) |  10,797,366  |\n| [***LE22ct***](#le22ct) | Board positions, turn, 
castling rights, next-move sequence (up to 10 half-moves) |  13,287,522  |\n|  [***LE22c***](#le22c)  | Board positions, turn, castling rights, next-move sequence (up to 10 half-moves) | 127,684,720  |\n|  [***ML23d***](#ml23d)  | Board positions, turn, castling rights, next-move sequence (up to 10 half-moves) | 144,625,397  |\n\nThese datasets are sourced from groups of PGN files containing real games played by humans. There are currently two PGN filesets:\n\n- ***LE22*** consists of games from the [Lichess Elite Database](https://database.nikonoel.fr/) put together by [nikonoel](https://lichess.org/@/nikonoel), a collection of all standard chess games played on [Lichess.org](https://lichess.org/) by players with a Lichess Elo rating of 2400+ against players with a Lichess Elo rating of 2200+ up to December 2021, and players rated 2500+ against players rated 2300+ from December 2021 up to December 2022\n  \n- ***ML23*** consists of Master-level games downloaded from [PGN mentor](https://www.pgnmentor.com/files.html), [TWIC](https://theweekinchess.com/twic), and [Caissabase](http://caissabase.co.uk/) in December 2023\n\nThe lowercase letters at the end of every dataset denote specific filters that were applied to games from the corresponding PGN filesets:\n\n- \"***c***\" for games that ended in a checkmate\n- \"***t***\" for games that used a specific time control\n- \"***d***\" for games that ended decisively\n\n### *ML23c*\n\nThis consists of Master-level games downloaded from [PGN mentor](https://www.pgnmentor.com/files.html), [TWIC](https://theweekinchess.com/twic), and [Caissabase](http://caissabase.co.uk/) in December 2023.\n\nOn this data (11,081,724 games), we apply the following filters to keep only those games that:\n\n- are unique (5,213,634 games) \n- and ended in a checkmate (**250,694 games**)\n\nThese 250,694 games consist of a total **10,797,366 half-moves** made by the \u003cins\u003ewinners\u003c/ins\u003e of the games, which alone constitute the 
dataset. For each such half-move, the chessboard, turn (white or black), and castling rights of both players before the move are calculated, as well as the sequence of half-moves beginning with this half-move up to 10 half-moves into the future. Draw potential is not calculated.\n\n[**Download here.**](https://chesstransformers.blob.core.windows.net/data/ML23c.zip) The data is zipped and will need to be extracted.\n\nIt consists of the following files:\n\n- **`ML23c.h5`**, an HDF5 file containing two tables, one with the raw data and the other encoded with indices (that will be used in the transformer model), containing the following fields:\n  - **`board_position`**, the chessboard layout, or positions of pieces on the board\n  - **`turn`**, the color of the pieces of the player to play\n  - **`white_kingside_castling_rights`**, whether white can castle kingside\n  - **`white_queenside_castling_rights`**, whether white can castle queenside\n  - **`black_kingside_castling_rights`**, whether black can castle kingside\n  - **`black_queenside_castling_rights`**, whether black can castle queenside\n  - **`moves`**, 10 half-moves into the future made by both players\n  - **`length`**, the number of half-moves in the sequence, as this will be less than 10 at the end of the game\n\n### *LE22ct*\n\nThis consists of games from the [Lichess Elite Database](https://database.nikonoel.fr/) put together by [nikonoel](https://lichess.org/@/nikonoel), a collection of all standard chess games played on [Lichess.org](https://lichess.org/) by players with a Lichess Elo rating of 2400+ against players with a Lichess Elo rating of 2200+ up to December 2021, and players rated 2500+ against players rated 2300+ from December 2021 up to December 2022.\n\nOn this data (20,241,368 games), we apply the following filters to keep only those games that:\n\n- used a time control of at least 5 minutes  (2,073,780 games)\n- and ended in a checkmate (**274,794 games**)\n\nThese 274,794 games consist 
of a total **13,287,522 half-moves** made by the \u003cins\u003ewinners\u003c/ins\u003e of the games, which alone constitute the dataset. For each such half-move, the chessboard, turn (white or black), and castling rights of both players before the move are calculated, as well as the sequence of half-moves beginning with this half-move up to 10 half-moves into the future. Draw potential is not calculated.\n\n[**Download here.**](https://chesstransformers.blob.core.windows.net/data/LE22ct.zip) The data is zipped and will need to be extracted.\n\nIt consists of the following files:\n\n- **`LE22ct.h5`**, an HDF5 file containing two tables, one with the raw data and the other encoded with indices (that will be used in the transformer model), containing the following fields:\n  - **`board_position`**, the chessboard layout, or positions of pieces on the board\n  - **`turn`**, the color of the pieces of the player to play\n  - **`white_kingside_castling_rights`**, whether white can castle kingside\n  - **`white_queenside_castling_rights`**, whether white can castle queenside\n  - **`black_kingside_castling_rights`**, whether black can castle kingside\n  - **`black_queenside_castling_rights`**, whether black can castle queenside\n  - **`moves`**, 10 half-moves into the future made by both players\n  - **`length`**, the number of half-moves in the sequence, as this will be less than 10 at the end of the game\n\n### *LE22c*\n\nThis is an extended version of [*LE22ct*](#le22ct), and consists of games from the [Lichess Elite Database](https://database.nikonoel.fr/) put together by [nikonoel](https://lichess.org/@/nikonoel), a collection of all standard chess games played on [Lichess.org](https://lichess.org/) by players with a Lichess Elo rating of 2400+ against players with a Lichess Elo rating of 2200+ up to December 2021, and players rated 2500+ against players rated 2300+ from December 2021 up to December 2022.\n\nOn this data (20,241,368 games), we apply the following 
filters to keep only those games that:\n\n- ended in a checkmate (**2,751,394 games**)\n\nThese 2,751,394 games consist of a total **127,684,720 half-moves** made by the \u003cins\u003ewinners\u003c/ins\u003e of the games, which alone constitute the dataset. For each such half-move, the chessboard, turn (white or black), and castling rights of both players before the move are calculated, as well as the sequence of half-moves beginning with this half-move up to 10 half-moves into the future. Draw potential is not calculated.\n\n[**Download here.**](https://chesstransformers.blob.core.windows.net/data/LE22c.zip) The data is zipped and will need to be extracted.\n\nIt consists of the following files:\n\n- **`LE22c.h5`**, an HDF5 file containing two tables, one with the raw data and the other encoded with indices (that will be used in the transformer model), containing the following fields:\n  - **`board_position`**, the board layout or positions of pieces on the board\n  - **`turn`**, the color of the pieces of the player to play\n  - **`white_kingside_castling_rights`**, whether white can castle kingside\n  - **`white_queenside_castling_rights`**, whether white can castle queenside\n  - **`black_kingside_castling_rights`**, whether black can castle kingside\n  - **`black_queenside_castling_rights`**, whether black can castle queenside\n  - **`moves`**, 10 half-moves into the future made by both players\n  - **`length`**, the number of half-moves in the sequence, as this will be less than 10 at the end of the game\n\n### *ML23d*\n\nThis consists of Master-level games downloaded from [PGN mentor](https://www.pgnmentor.com/files.html), [TWIC](https://theweekinchess.com/twic), and [Caissabase](http://caissabase.co.uk/) in December 2023.\n\nOn this data (11,081,724 games), we apply the following filters to keep only those games that:\n\n- are unique (5,213,634 games) \n- and are decisive, i.e. 
a player won (**3,739,604 games**)\n\nThese 3,739,604 games consist of a total **144,625,397 half-moves** made by the \u003cins\u003ewinners\u003c/ins\u003e of the games, which alone constitute the dataset. For each such half-move, the chessboard, turn (white or black), and castling rights of both players before the move are calculated, as well as the sequence of half-moves beginning with this half-move up to 10 half-moves into the future. Draw potential is not calculated.\n\n[**Download here.**](https://chesstransformers.blob.core.windows.net/data/ML23d.zip) The data is zipped and will need to be extracted.\n\nIt consists of the following files:\n\n- **`ML23d.h5`**, an HDF5 file containing two tables, one with the raw data and the other encoded with indices (that will be used in the transformer model), containing the following fields:\n  - **`board_position`**, the chessboard layout, or positions of pieces on the board\n  - **`turn`**, the color of the pieces of the player to play\n  - **`white_kingside_castling_rights`**, whether white can castle kingside\n  - **`white_queenside_castling_rights`**, whether white can castle queenside\n  - **`black_kingside_castling_rights`**, whether black can castle kingside\n  - **`black_queenside_castling_rights`**, whether black can castle queenside\n  - **`moves`**, 10 half-moves into the future made by both players\n  - **`length`**, the number of half-moves in the sequence, as this will be less than 10 at the end of the game\n\n## Play\n\nAfter [installing](#install) *Chess Transformers*, you can play games \u003cins\u003eagainst an available model\u003c/ins\u003e or have a model play \u003cins\u003eagainst a chess engine\u003c/ins\u003e.\n\n### You v. Model\n\nYou could either play in a Jupyter notebook (recommended for better UI) or in a Python shell. 
\n\n```python\nimport os\nfrom chess_transformers.configs import import_config\nfrom chess_transformers.play.utils import write_pgns\nfrom chess_transformers.play import load_model, warm_up, human_v_model \n\n# Load configuration\nconfig_name = \"CT-EFT-85\"\nCONFIG = import_config(config_name)\n\n# Load assets\nmodel = load_model(CONFIG)\n\n# Warmup model (triggers compilation)\nwarm_up(\n    model=model\n)\n\n# Play\nwins, losses, draws, pgns = human_v_model(\n    human_color=\"b\",  # color you want to play\n    model=model,\n    k=1,  # \"k\" in \"top_k sampling\", k=1 is best\n    use_amp=True,\n    rounds=1,  # number of rounds you want to play\n    clock=None, \n    white_player_name=config_name,\n    black_player_name=\"Me\",\n)\n\n# Print games in Portable Game Notation (PGN) format\nprint(pgns)\n\n# Save PGNs if you wish\nwrite_pgns(\n    pgns,\n    pgn_file=\"somewhere/something.pgn\",\n)\n```\n\nYou could also just make a copy of [**`human_play.ipynb`**](chess_transformers/play/human_play.ipynb) and play in that notebook.\n\n### Model v. 
Engine\n\nThe process is the same as above, except you must use a different set of functions:\n\n```python\nfrom chess_transformers.play import model_v_engine\nfrom chess_transformers.play.utils import load_engine\n\n# Load engine\nengine = load_engine(CONFIG.FAIRY_STOCKFISH_PATH)\n\n# Play\nLL = 1  # Try strength levels 1 to 8 (note: 7 and 8 may be slow)\nmodel_color = \"w\"  # Try \"w\" and \"b\"\nwins, losses, draws, pgns = model_v_engine(\n    model=model,\n    k=CONFIG.SAMPLING_K,\n    use_amp=CONFIG.USE_AMP,\n    model_color=model_color,\n    engine=engine,\n    time_limit=CONFIG.LICHESS_LEVELS[LL][\"TIME_CONSTRAINT\"],\n    depth_limit=CONFIG.LICHESS_LEVELS[LL][\"DEPTH\"],\n    uci_options={\"Skill Level\": CONFIG.LICHESS_LEVELS[LL][\"SKILL\"]},\n    rounds=500,\n    clock=None,\n    white_player_name=\"Fairy Stockfish @ LL {}\".format(LL)\n    if model_color == \"b\"\n    else config_name,\n    black_player_name=\"Fairy Stockfish @ LL {}\".format(LL)\n    if model_color == \"w\"\n    else config_name,\n    event=config_name + \" v. Fairy Stockfish @ LL {}\".format(LL)\n    if model_color == \"w\"\n    else \"Fairy Stockfish @ LL {} v. \".format(LL) + config_name,\n)\n```\nSee [**`evaluate.py`**](chess_transformers/evaluate/evaluate.py) for an example.\n\n### Time Control\n\nIf you're using a *Unix*-type operating system — basically, not Windows — you can also set a time control for your games. \nCurrently, only Fischer time control is available. 
\n\n```python\nfrom chess_transformers.play.clocks import ChessClock\n\nclock = ChessClock(base_time=60, \n                   increment=1)\n```\n\nPass this **`clock`** to the functions above instead of **`clock=None`**.\n\n## Train Models\n\nYou're welcome to try to train your own models, but if you wish to contribute trained models, please [discuss first](#contribute).\n\n### Dataset\n\nYou can skip this step if you wish to use one of the [existing datasets](#datasets).\n\n- Collect PGN files containing games you wish to use for training the model.\n\n- Create a bash script for parsing these PGN files into a collection of FENs and moves using [*pgn-extract*](https://www.cs.kent.ac.uk/people/staff/djb/pgn-extract/), like in [**`LE22ct.sh`**](#le22ct), and execute it in the folder with the PGN files.\n\n- Create a configuration file for the dataset, like in [**`LE22ct.py`**](chess_transformers/configs/data/LE22ct.py).\n\n- Run [**`prep.py`**](chess_transformers/data/prep.py) like `python prep.py [config_name]`, or do it in your own Python script.\n\n```python\nfrom chess_transformers.data import prepare_data\nfrom chess_transformers.configs import import_config\n\n# Load configuration\nCONFIG = import_config(\"[config_name]\")\n\n# Prepare data\nprepare_data(\n    data_folder=CONFIG.DATA_FOLDER,\n    h5_file=CONFIG.H5_FILE,\n    max_move_sequence_length=CONFIG.MAX_MOVE_SEQUENCE_LENGTH,\n    expected_rows=CONFIG.EXPECTED_ROWS,\n    val_split_fraction=CONFIG.VAL_SPLIT_FRACTION,\n)\n```\nData files will be created in **`CONFIG.DATA_FOLDER`**.\n\n### Training\n\n- Create a configuration file for the model, like in [**`CT-E-20.py`**](chess_transformers/configs/models/CT-E-20.py).\n\n- Run [**`train.py`**](chess_transformers/train/train.py) like `python train.py [config_name]`, or do it in your own Python script.\n\n```python\n\nfrom chess_transformers.train import train_model\nfrom chess_transformers.configs import import_config\n\n# Load configuration\nCONFIG = 
import_config(\"[config_name]\")\n\n# Train model\ntrain_model(CONFIG)\n```\n- Monitor training in [*TensorBoard*](https://www.tensorflow.org/tensorboard) by running `tensorboard --logdir $CT_LOGS_DIR`.\n\n- Average the checkpoints that were saved for averaging to produce the final checkpoint. Run [**`average_checkpoints.py`**](chess_transformers/train/average_checkpoints.py) like `python average_checkpoints.py [config_name]`, or do it in your own Python script.\n  \n```python\n\nfrom chess_transformers.train import average_checkpoints\nfrom chess_transformers.configs import import_config\n\n# Load configuration\nCONFIG = import_config(\"[config_name]\")\n\n# Average checkpoints\naverage_checkpoints(\n    checkpoint_folder=CONFIG.CHECKPOINT_FOLDER,\n    checkpoint_avg_prefix=CONFIG.CHECKPOINT_AVG_PREFIX,\n    checkpoint_avg_suffix=CONFIG.CHECKPOINT_AVG_SUFFIX,\n    final_checkpoint=CONFIG.FINAL_CHECKPOINT,\n)\n```\n\n### Evaluation\n\nRun [**`evaluate.py`**](chess_transformers/evaluate/evaluate.py) like `python evaluate.py [config_name]`, or do it in your own Python notebook/script.\n\n```python\n\nfrom chess_transformers.configs import import_config\nfrom chess_transformers.evaluate import evaluate_model\n\n# Load configuration\nCONFIG = import_config(\"[config_name]\")\n\n# Evaluate model\nevaluate_model(CONFIG)\n```\n\n## Contribute\n\nContributions, and any discussion thereof, are welcome. As you may have noticed, *Chess Transformers* is in initial development and the public API is \u003cins\u003enot\u003c/ins\u003e to be considered stable. \n\nIf you are planning to contribute bug-fixes, please go ahead and do so. If you are planning to contribute in a way that extends *Chess Transformers*, or adds any new features, data, or models, please [create a discussion thread](https://github.com/sgrvinod/chess-transformers/discussions/new/choose) to discuss it \u003cins\u003ebefore\u003c/ins\u003e you spend any time on it. 
Otherwise, your PR may be rejected due to lack of consensus or alignment with current goals.\n\nPresently, the following types of contributions may be useful:\n\n- Better, more robust evaluation methods of models.\n- Evaluation of existing models against chess engines on different CPUs to study the effect of CPU specifications on engine strength and evaluation.\n- New models with:\n  - the same transformer architectures but of a larger size, and trained on larger datasets.\n  - or different transformer architectures or internal mechanisms.\n  - or in general, improved evaluation scores.\n- Chess clocks for Windows OS, or for *Unix*-type OS but for time controls \u003cins\u003eother than\u003c/ins\u003e Fischer time control.\n- Refactoring of code that improves its ease of use.\n- Model visualization for explainable AI, such as visualizing positional and move embeddings, or attention patterns.\n\nThis list is not exhaustive. Please do not hesitate to discuss your ideas. Thank you!\n\n## License\n\n*Chess Transformers* is licensed under the [MIT license](LICENSE). \n\n\n\n\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsgrvinod%2Fchess-transformers","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsgrvinod%2Fchess-transformers","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsgrvinod%2Fchess-transformers/lists"}