https://github.com/ts1/uctoth
Self-learning Othello (Reversi) game engine written in CoffeeScript 2
https://github.com/ts1/uctoth
c coffeescript coffeescript2 game game-engine vue webassembly
Last synced: 7 months ago
JSON representation
Self-learning Othello (Reversi) game engine written in CoffeeScript 2
- Host: GitHub
- URL: https://github.com/ts1/uctoth
- Owner: ts1
- Created: 2018-12-28T12:20:20.000Z (about 7 years ago)
- Default Branch: master
- Last Pushed: 2023-02-28T09:04:45.000Z (about 3 years ago)
- Last Synced: 2025-05-25T00:35:45.721Z (10 months ago)
- Topics: c, coffeescript, coffeescript2, game, game-engine, vue, webassembly
- Language: CoffeeScript
- Homepage: https://ts1.github.io/uctoth/
- Size: 28.2 MB
- Stars: 2
- Watchers: 2
- Forks: 0
- Open Issues: 6
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# Uctoth
Othello (Reversi) game engine written in CoffeeScript 2.
Demo page is [here](https://ts1.github.io/uctoth/).
## Experimental
This code is mostly experimental, not intended for reuse.
Reusing may need some work.
Read the source if you are interested.
## Features
### CoffeeScript 2
Works both on Node.js (with ES6+ features) and on [browsers](https://ts1.github.io/uctoth/) (with help from babel and webpack).
Now the most of CPU-intensive parts, endgame solving, game tree search and
learning, are also written in C.
They work as add-on of Node.js and also as WebAssembly on browsers
(except learning).
CoffeeScript implementations still exist and used when add-on/wasm is not
available.
### Self learning
Starting from random plays, strengthens itself by loops of self playing and
machine learning.
Uses no existing records of games.
Included `weights.json` file is a result of learning ~100,000 self-played games.
### Game tree search
Employs modified *UCT search* with static evaluation.
Searches 15-25 plies deep maximum on modern JavaScript engines.
C version searches 30+ plies deep.
Beats my implementation of minimax with NegaScout by 70-80% winning rate.
### Static evaluation
Classic pattern-based evaluation method as described in publications by Michael Buro.
He is the god of computer Othello for me.
### Machine learning
Simple linear regression of ~57,000 sparse features per game position.
Optionally supports logistic regression, which predicts probability to win instead
of final score.
Now using mini-batch with AdaGrad for faster convergence.
Also written from scratch in CoffeeScript/C.
## Usage
First off run `yarn` to install dependencies and build Node add-on.
```
yarn
```
Most scripts expect `weights.json` file existing in this directory.
Copy from `ref` directory at first (then build your own).
```
cp ref/weights.json .
```
To run scripts written in CoffeeScript 2, you should either install
`coffeescript` globally, or use `npx coffee` to run.
### Bootstrapping
This section describes how to train your own `weights.json` from scratch.
First off, generate 1,000 randomly played games.
```
npx coffee selfplay-rnd -R -n 1000 -w 12 -f 10 -b 1000000 --min_col=0
```
They are random but the last 10 moves are perfectly played.
Generated games are stored in `book.db` (SQLite3 database).
Next, learn the generated games and make your first `weights.json`.
```
npx coffee learn
```
> If you can't use Node add-on for some reason, use `reg` instead.
Now you can remove `book.db` of random games.
```
rm book.db*
```
Copy `auto` script from `samples` directory.
```
cp samples/auto .
```
You may edit `auto` as you like.
Now you are ready to run automatic self-learning loop.
```
./auto
```
It runs regression and 30-game matches against `ref/weights.json`
every 1,000 games generated.
Match results are appended to `match.log`.
### Running self-play and learning in parallel
Running self-play and regression simultaneously can utilize multi-core CPUs,
thus can speed up the entire self learning process.
Sample scripts are in `samples` directory, copy them.
```
cp samples/selfplay-loop samples/reg-loop samples/match-loop .
```
Edit the scripts as you need.
Then run `reg-loop` in one terminal, `selfplay-loop` in another, and `match-loop` in the third.
This setup uses 3 CPU threads, but if it isn't enough for your machine,
you may run `selfplay-loop` as many as you want.
The trick is simple.
All selfplay scripts watch `weights.json` to change.
When `reg-loop` finished creating a new `weights.json`, selfplay scripts exit
and invoked again by shell script.
`watch` script is useful for watching to see if everything is working well.
### Automatic tuning of L2 regularization parameter
`learn` (and its CoffeeScript counter part `reg`/`minibatch`) uses
*L2 regularization* to avoid over-fitting.
It's important to give optimal parameters of this to build strong weights.
`l2tune` script is written for this purpose.
It uses K-fold cross-validation and optionally actually plays 30-game match to
find strongest parameters.
The result is written to `l2.json` and `learn` reads values from this file if
available.
These are very time consuming processes.
It's recommended to run `l2tune` without `--match` periodically while learning,
and use `l2tune --match` for final finish of your `weights.json`.
> If Node add-on is not available, use `l2seach` and `l2opt` instead.
## Acknowledgement
Sound created by Nobuyuki Honda.
## License
MIT
Copyright © 2018-2019 by Takeshi Sone