Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/51616/cu_makhos
Thai Checkers deep reinforcement learning AI
- Host: GitHub
- URL: https://github.com/51616/cu_makhos
- Owner: 51616
- Created: 2019-02-26T14:16:36.000Z (almost 6 years ago)
- Default Branch: master
- Last Pushed: 2019-06-13T16:48:25.000Z (over 5 years ago)
- Last Synced: 2024-11-09T19:04:23.695Z (3 months ago)
- Topics: board-game, checkers, deep-learning, deep-reinforcement-learning, deeplearning, reinforcement-learning
- Language: Python
- Homepage:
- Size: 27.9 MB
- Stars: 3
- Watchers: 1
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# CU Makhos
An implementation of reinforcement learning for Thai Checkers based on the AlphaGo approach (supervised pretraining followed by reinforcement learning). I went this route after failing to train a zero-knowledge model in my first run (probably a hyperparameter issue, since RL is prone to such problems).

The implementation uses Python and PyTorch. It is based on the [alpha-zero-general](https://github.com/suragnair/alpha-zero-general) framework, so you can easily implement other variants of checkers.

If you plan to train your own model or improve the bundled one, look at `main_th_checkers.py` (this is an ad-hoc implementation tailored to my environment and hardware setup).
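To give a feel for what implementing another variant involves, here is a hypothetical sketch of the `Game` interface that alpha-zero-general expects. This is not the repository's actual code: the 8x8 board encoding with +1/-1 pieces and the action-space size are simplified placeholders.

```python
import numpy as np

# Hypothetical sketch of the alpha-zero-general Game interface for a
# checkers variant. The board encoding (8x8, +1 for the current player's
# men, -1 for the opponent's) is a simplified placeholder, not the
# repository's actual representation.
class ThaiCheckersGame:
    def getInitBoard(self):
        # Each side starts with two rows of men (placeholder layout).
        board = np.zeros((8, 8), dtype=np.int8)
        board[:2, :] = -1   # opponent's pieces
        board[-2:, :] = 1   # current player's pieces
        return board

    def getBoardSize(self):
        return (8, 8)

    def getActionSize(self):
        # Placeholder: one action id per (square, diagonal direction) pair.
        return 8 * 8 * 4

    def getCanonicalForm(self, board, player):
        # Flip signs so the player to move always sees its pieces as +1.
        return board * player
```

A full implementation would also provide `getNextState`, `getValidMoves`, and `getGameEnded` so the framework's MCTS and self-play loop can drive the game.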
### How to play
To play against the model, you need Python and CUDA installed:
```
python human_ui.py
```

### Pretrained model
The latest iteration of my run is 268. The first 35 iterations learned from minimax-algorithm gameplay, and the rest were trained on self-play games of the latest iteration at the time.
![Win rate vs depth-7 minimax](results/winrate_vs_minimax_7.png)
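The two-phase schedule above (supervised bootstrap, then self-play) can be sketched as follows. Function and variable names here are illustrative, not the repository's API.

```python
# Hypothetical sketch of the training schedule: the first 35 iterations
# learn from minimax gameplay, the remaining ones from self-play of the
# latest model. Names are illustrative, not the repository's API.
def data_source_per_iteration(total_iters=268, supervised_iters=35):
    """Return which kind of game data feeds each training iteration."""
    return [
        "minimax" if i <= supervised_iters else "self-play"
        for i in range(1, total_iters + 1)
    ]

schedule = data_source_per_iteration()
```

This mirrors the AlphaGo-style recipe mentioned at the top: supervised data gets the network past the random-play stage, after which self-play takes over.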
The performance of the model was estimated from a 200-game match against a depth-7 minimax algorithm, with the model using 200 Monte Carlo tree search simulations per move. The model's search is about 10 times more efficient than depth-7 minimax in terms of move simulations.
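To make the efficiency claim concrete: at 200 MCTS simulations per move, a roughly 10x gain implies depth-7 minimax examined on the order of 2,000 positions per move. That 2,000 figure is my own assumption, back-derived from the stated ratio, not a measurement from the repository.

```python
# Back-of-the-envelope check of the ~10x efficiency claim. The minimax
# node count is assumed for illustration, not measured.
mcts_sims_per_move = 200        # from the match setup described above
minimax_nodes_per_move = 2000   # assumption implied by the ~10x figure
ratio = minimax_nodes_per_move / mcts_sims_per_move
print(ratio)  # → 10.0
```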
### Thanks to
- [alpha-zero-general](https://github.com/suragnair/alpha-zero-general)
- [chess-alpha-zero](https://github.com/Zeta36/chess-alpha-zero)
- [pytorch-classification](https://github.com/bearpaw/pytorch-classification) and [progress](https://github.com/verigak/progress).
- [alpha-nagibator](https://github.com/evg-tyurin/alpha-nagibator)