Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/51616/cu_makhos
Thai Checkers deep reinforcement learning AI
- Host: GitHub
- URL: https://github.com/51616/cu_makhos
- Owner: 51616
- Created: 2019-02-26T14:16:36.000Z (almost 6 years ago)
- Default Branch: master
- Last Pushed: 2019-06-13T16:48:25.000Z (over 5 years ago)
- Last Synced: 2024-11-09T19:04:23.695Z (3 months ago)
- Topics: board-game, checkers, deep-learning, deep-reinforcement-learning, deeplearning, reinforcement-learning
- Language: Python
- Homepage:
- Size: 27.9 MB
- Stars: 3
- Watchers: 1
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# CU Makhos
An implementation of reinforcement learning for Thai Checkers based on the AlphaGo approach (supervised pretraining followed by reinforcement learning). I went this route after failing to train a zero-knowledge model in my first run (probably a hyperparameter issue, since RL is prone to such problems).

The implementation uses Python and PyTorch. It is based on the [alpha-zero-general](https://github.com/suragnair/alpha-zero-general) framework, so you can easily implement other variants of checkers.

If you plan to train your own model or improve the bundled one, look at `main_th_checkers.py` (this is an ad-hoc implementation tailored to my environment and hardware setup).
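To give a feel for what implementing another variant involves, here is a hypothetical sketch of the `Game` interface that alpha-zero-general expects. This is not the repository's actual code: the 8x8 board encoding with +1/-1 pieces and the action-space size are simplified placeholders.

```python
import numpy as np

# Hypothetical sketch of the alpha-zero-general Game interface for a
# checkers variant. The board encoding (8x8, +1 for the current player's
# men, -1 for the opponent's) is a simplified placeholder, not the
# repository's actual representation.
class ThaiCheckersGame:
    def getInitBoard(self):
        # Each side starts with two rows of men (placeholder layout).
        board = np.zeros((8, 8), dtype=np.int8)
        board[:2, :] = -1   # opponent's pieces
        board[-2:, :] = 1   # current player's pieces
        return board

    def getBoardSize(self):
        return (8, 8)

    def getActionSize(self):
        # Placeholder: one action id per (square, diagonal direction) pair.
        return 8 * 8 * 4

    def getCanonicalForm(self, board, player):
        # Flip signs so the player to move always sees its pieces as +1.
        return board * player
```

A full implementation would also provide `getNextState`, `getValidMoves`, and `getGameEnded` so the framework's MCTS and self-play loop can drive the game.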
### How to play
To play against the model, you need Python and CUDA installed:
```
python human_ui.py
```

### Pretrained model
The latest iteration of my run is 268. The first 35 iterations learned from minimax-algorithm gameplay, and the rest were trained on self-play games of the latest iteration at the time.
![Win rate vs depth-7 minimax](results/winrate_vs_minimax_7.png)
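The two-phase schedule above (supervised bootstrap, then self-play) can be sketched as follows. Function and variable names here are illustrative, not the repository's API.

```python
# Hypothetical sketch of the training schedule: the first 35 iterations
# learn from minimax gameplay, the remaining ones from self-play of the
# latest model. Names are illustrative, not the repository's API.
def data_source_per_iteration(total_iters=268, supervised_iters=35):
    """Return which kind of game data feeds each training iteration."""
    return [
        "minimax" if i <= supervised_iters else "self-play"
        for i in range(1, total_iters + 1)
    ]

schedule = data_source_per_iteration()
```

This mirrors the AlphaGo-style recipe mentioned at the top: supervised data gets the network past the random-play stage, after which self-play takes over.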
The performance of the model was estimated from a 200-game match against a depth-7 minimax algorithm, with the model using 200 Monte Carlo tree search simulations per move. The model's search is about 10 times more efficient than depth-7 minimax in terms of move simulations.
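To make the efficiency claim concrete: at 200 MCTS simulations per move, a roughly 10x gain implies depth-7 minimax examined on the order of 2,000 positions per move. That 2,000 figure is my own assumption, back-derived from the stated ratio, not a measurement from the repository.

```python
# Back-of-the-envelope check of the ~10x efficiency claim. The minimax
# node count is assumed for illustration, not measured.
mcts_sims_per_move = 200        # from the match setup described above
minimax_nodes_per_move = 2000   # assumption implied by the ~10x figure
ratio = minimax_nodes_per_move / mcts_sims_per_move
print(ratio)  # → 10.0
```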
### Thanks to
- [alpha-zero-general](https://github.com/suragnair/alpha-zero-general)
- [chess-alpha-zero](https://github.com/Zeta36/chess-alpha-zero)
- [pytorch-classification](https://github.com/bearpaw/pytorch-classification) and [progress](https://github.com/verigak/progress).
- [alpha-nagibator](https://github.com/evg-tyurin/alpha-nagibator)