https://github.com/omerkel/ucthello

UCThello - a board game demonstrator (Othello variant) with computer AI using Monte Carlo Tree Search (MCTS) with UCB (Upper Confidence Bounds) applied to trees (UCT in short)

Host: GitHub
URL: https://github.com/omerkel/ucthello
Owner: OMerkel
License: other
Created: 2016-02-11T18:24:57.000Z (over 9 years ago)
Default Branch: master
Last Pushed: 2023-07-10T16:35:08.000Z (almost 2 years ago)
Last Synced: 2025-03-24T14:46:08.471Z (3 months ago)
Topics: 2-player-strategy-game, abstract-game, ai, ai-players, artificial-intelligence, board-game, entertainment, game, mcts, mobile, mobile-app, mobile-game, monte-carlo-tree-search, othello, perfect-information, simulation, ucb, uct, upper-confidence-bounds
Language: JavaScript
Homepage: http://omerkel.github.io/UCThello/html5/src
Size: 5.61 MB
Stars: 25
Watchers: 4
Forks: 4
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
- Authors: AUTHORS

UCThello
====================

* Start an online UCThello session on http://omerkel.github.io/UCThello/html5/src
* Android APK available for install https://github.com/OMerkel/UCThello/releases
* Runs in various browsers on
  * desktop systems like BSDs, Linux, Win, MacOS and
  * mobile platforms like Android, FirefoxOS, iOS.

_UCThello - a board game demonstrator with computer AI using
Monte-Carlo Tree Search (MCTS) with UCB (Upper Confidence Bounds)
applied to trees (UCT in short)_

__Keywords, Categories__ _Monte-Carlo Tree Search (MCTS),
Upper Confidence Bounds (UCB), UCB applied to trees (UCT), AI,
2-player board game, deterministic game with perfect information,
JavaScript, ECMAScript, W3C WebWorker_

# Abstract

UCThello is a board game whose computer player AI uses Monte-Carlo
Tree Search (MCTS) with UCB (Upper Confidence Bounds) applied to trees
(UCT in short). The board game used to demonstrate the UCT algorithm
is close to _Othello_, depending on the selected options. If configured
accordingly, it can be played following the official tournament rules
of the WOF - World Othello Federation [WOF14]. Other rule settings
for playing variants are available, too.

By design the playing strength is limited to keep the game enjoyable.
It is deliberately kept at a moderate to quite strong level to suit
the target environment, device platform, and audience expectations.
This is done, for example, by limiting the maximum AI response time
and by running the AI in a single execution thread, with just a
second independent execution thread for a responsive user interface.
Using all available CPU and GPU cores instead would drain the battery
and lead to a bad user experience.
Further improvements, such as using a well-known and available game
opening book, are postponed for now. Although such simple
modifications could improve the playing strength, they are not
implemented in the current version.

_Othello_ is a derivative of the board game _Reversi_, which can be
played by UCThello as well. The invention of _Reversi_ is claimed by
both Lewis Waterman and John W. Mollett. A predecessor of _Reversi_
created by Mollett is _The game of Annexation_, also called _Annex_,
dating back to the 19th century.

# Monte-Carlo Tree Search

__Monte-Carlo Tree Search__ (MCTS in short) is an algorithm that builds a
_Search Tree_ iteratively by successively adding nodes based on traversals of
existing nodes and simulations in the problem domain. If the problem domain is a
game, the nodes can represent moves according to the game rules.
Traversing nodes follows a _Selection Strategy_. _Simulations_ are often called
_playouts_, too. The nodes on the simulated paths collect statistics
reflecting the ratio of wins and losses to the total number of simulations.
The assumption is that with a higher total number of simulations the confidence
in the statistics becomes high enough to select quality nodes or moves.
The idea is then to pick the next node or move with the best ratio.

The iterative MCTS algorithm performs four main steps, typically called
* _Selection_,
* _Expansion_,
* _Simulation_, and
* _Backpropagation_.

See [Cha10] and [CBSS08].

In UCThello the related code fragment for this loop is close to

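```
Uct.prototype.getActionInfo = function ( board, maxIterations, verbose ) {
  var root = new UctNode(null, board, null);
  for(var iterations=0; iterations<maxIterations; iterations++) {
    ... /* Selection, Expansion, Simulation, Backpropagation */
  }
  ...
};
```

## Selection
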
The _objective of the Selection Strategy_ is to choose the search path so that
_exploration_ and _exploitation_ of information are balanced. Following a branch
of the search path that has already been examined is seen as an exploit.
An exploit shall confirm the quality of an already examined node by gaining
higher statistical confidence, that is, more reliable estimates.
Exploration is performed by creating new nodes in later MCTS steps or,
alternatively, by selecting branches with rarely traversed nodes.
A node traversed only a few times simply has low reliability or
statistical confidence. The border between exploit and explore is often
seen as soft and fluid.

Thus the selection of the child node to traverse next at each level of the
already built search tree path is usually based on a quality value the
nodes received in earlier iterations.
An optimal _Selection Strategy_ to best support the objective is unknown. One
statistical approach, the _Upper Confidence Bounds_ (UCB) algorithm, applies
a logarithm-based formula to the quality values collected for the
nodes on the search path when used with MCTS. The combination of MCTS and UCB,
called UCT (short for _UCB applied to trees_), is credited to [KS06]. Other
approaches or additional supporting ideas for a _Selection Strategy_ are
presented and discussed e.g. in [CSUB06].
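
For orientation, a common way to write the UCB1 rule used by UCT is shown
below. The symbols are notation introduced here and are not taken from the
UCThello source: $w_i$ is the accumulated reward of child $i$, $n_i$ its
number of visits, $N$ the number of visits of its parent, and $c$ an
exploration constant (a frequent choice is $\sqrt{2}$ for rewards in
$[0, 1]$). The child with the highest value is selected.

$$
\mathrm{UCT}_i = \frac{w_i}{n_i} + c \, \sqrt{\frac{\ln N}{n_i}}
$$

The first term rewards exploitation of children with a good win ratio. The
second term grows for rarely visited children and thus drives exploration.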

Besides the Selection Strategy there is an additional aspect of search path
branch Selection: avoiding the risk that a high quality node located near the
root node remains unvisited. One possible solution is to favor traversing
any unexplored child node over following explored siblings. Widening the
search tree is then favored over deepening. A possible criticism is that
this reduces the randomness of the Monte-Carlo method.

In UCThello the select child step implements the UCT algorithm. The UCB
related code is part of the _UctNode.prototype.selectChild_ function.
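
The following is a minimal sketch of how a UCB1 based child selection might
look. It is not the UCThello implementation: the node fields `wins`,
`visits`, and `children`, the free functions `ucb1` and `selectChild`, and
the exploration constant `c` are assumptions made for illustration. It also
assumes every child has been visited at least once, which holds when
unexplored children are always expanded first.

```javascript
// Hypothetical node shape: { wins, visits, children }.
// These field names are assumptions, not UCThello's actual ones.
function ucb1(child, parentVisits, c) {
  // Exploitation: average reward so far.
  // Exploration: bonus that grows for rarely visited children.
  return child.wins / child.visits +
    c * Math.sqrt(Math.log(parentVisits) / child.visits);
}

// Return the child with the highest UCB1 score.
function selectChild(node, c) {
  var best = null;
  var bestScore = -Infinity;
  for (var i = 0; i < node.children.length; i++) {
    var score = ucb1(node.children[i], node.visits, c);
    if (score > bestScore) {
      bestScore = score;
      best = node.children[i];
    }
  }
  return best;
}

// Usage: child 'a' has the better win ratio, child 'b' fewer visits.
var parent = { wins: 0, visits: 30, children: [
  { name: 'a', wins: 12, visits: 20, children: [] },
  { name: 'b', wins: 4, visits: 10, children: [] }
]};
var greedyPick = selectChild(parent, 0);    // pure exploitation
var curiousPick = selectChild(parent, 10);  // strong exploration bonus
```

With `c = 0` the sketch picks purely by win ratio, while a large `c` shifts
the choice towards the less visited child.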

Additionally, UCThello favors early _Selection_ of any unexplored (or
unexamined) child of a traversed node. Such an unexplored
(or unexamined) child is preferred over continuing to traverse any
explored node.

```
var node = root;
var variantBoard = board.copy();
/* Selection */
while (node.unexamined.length == 0 && node.children.length > 0) {
  node = node.selectChild();
  variantBoard.doAction(node.action);
}
```

## Expansion

The objective of the __Expansion__ step is to add a new unexplored child of
the node determined by the previous _Selection_.

If the node determined by the _Selection_ is an inner node instead of a
leaf node, then this node has a combination of explored and unexplored
children. Either way an unexplored child shall be added for the
coming _Simulation_ step. The only exception is a leaf node that
represents a terminal node. In such a case no Expansion and
Simulation is needed, since a terminal node means the end of the game
is reached at that node on the search path.

Sometimes you will find implementations where multiple Expansions take
place on the Selection node. This simply means a set of child nodes is
added at once then.

In UCThello exactly one node is added unless a terminal node is
reached. The list of remaining unexplored child nodes is
determined beforehand. The node to add is picked randomly from the
set of remaining nodes to avoid any preferred order or any
dependency on a parameter or state.

```
/* Selection */
...
/* Expansion */
if (node.unexamined.length > 0) {
  var j = Math.floor(Math.random() * node.unexamined.length);
  variantBoard.doAction(node.unexamined[j]);
  node = node.addChild(variantBoard, j);
}
```

Terminal nodes do not have any child nodes, explored or unexplored.
So checking unexamined.length is sufficient to skip the Expansion
in case a terminal node has been selected.
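
A possible shape of such an `addChild` is sketched below. This is an
assumption for illustration, not the UCThello code: the constructor
`SketchNode`, its argument order (parent, board, action), and the fields
`wins` and `visits` are hypothetical. Only the names `unexamined`,
`children`, `parentNode`, `action`, and `getActions` appear in the fragments
above.

```javascript
// Hypothetical node constructor; the argument order is an assumption.
function SketchNode(parentNode, board, action) {
  this.parentNode = parentNode;
  this.action = action;      // move that led to this node
  this.children = [];        // explored children
  this.wins = 0;
  this.visits = 0;
  // Valid actions that have not been turned into child nodes yet.
  this.unexamined = board.getActions();
}

// Move the action at index j from `unexamined` into a new child node.
// `board` is expected to already reflect that action.
SketchNode.prototype.addChild = function (board, j) {
  var action = this.unexamined.splice(j, 1)[0];
  var child = new SketchNode(this, board, action);
  this.children.push(child);
  return child;
};

// Usage with a stub board that always offers three actions.
var stubBoard = { getActions: function () { return ['x', 'y', 'z']; } };
var sketchRoot = new SketchNode(null, stubBoard, null);
var sketchChild = sketchRoot.addChild(stubBoard, 1);
```

Removing the action from `unexamined` while adding the child guarantees
that each action is expanded at most once per node.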

## Simulation

Now the objective of a __Simulation__ is to play out a possible scenario
starting from the newly expanded search tree leaf node. The Simulation is
performed until the end of the game is reached.

Mind that the playout does not modify the expanded search tree leaf node. The
fixed leaf node, respectively the correlated game state, is only used as the
base for the simulation.

On each simulation step a player's action valid by the rules is performed on the
created variant board. The variant board is used as a complete copy of the
current board and game state. This is to avoid changes to the board and game
state while following the full search path and simulation steps.

Instead of a single playout, several playouts could be started from the
selected and expanded search tree leaf node. The idea behind this is to
save the run time needed if the same selection path is chosen again in
later iterations.

In UCThello a single playout is performed per iteration. The number of MCTS
algorithm iterations equals the number of simulations then.

```
var variantBoard = board.copy();
/* Selection */
...
/* Expansion */
...
/* Simulation */
var actions = variantBoard.getActions();
while(actions.length > 0) {
  variantBoard.doAction(actions[Math.floor(Math.random() * actions.length)]);
  ...
  actions = variantBoard.getActions();
}
```
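
To try the playout loop without the Othello rules, the sketch below runs it
on a toy game. Everything in it is an assumption for illustration: `NimBoard`
is a hypothetical stand-in (players alternately take 1 to 3 stones, taking
the last stone wins) that only mimics the `copy`, `getActions`, `doAction`,
and `getResult` calls seen in the fragments, and `playout` takes the random
source as a parameter so that runs can be made repeatable.

```javascript
// Toy game, not Othello: take 1 to 3 stones, taking the last stone wins.
function NimBoard(stones, active) {
  this.stones = stones;
  this.active = active;    // index of the player to move: 0 or 1
  this.lastMover = null;   // index of the player who moved last
}
NimBoard.prototype.copy = function () {
  var b = new NimBoard(this.stones, this.active);
  b.lastMover = this.lastMover;
  return b;
};
NimBoard.prototype.getActions = function () {
  var actions = [];
  for (var take = 1; take <= 3 && take <= this.stones; take++) {
    actions.push(take);
  }
  return actions;          // empty once no stones are left
};
NimBoard.prototype.doAction = function (take) {
  this.stones -= take;
  this.lastMover = this.active;
  this.active = 1 - this.active;
};
NimBoard.prototype.getResult = function () {
  // The player who took the last stone gets the full point.
  return this.lastMover === 0 ? [1, 0] : [0, 1];
};

// Random playout on a copy, so the given board stays untouched.
function playout(board, random) {
  var variantBoard = board.copy();
  var actions = variantBoard.getActions();
  while (actions.length > 0) {
    variantBoard.doAction(actions[Math.floor(random() * actions.length)]);
    actions = variantBoard.getActions();
  }
  return variantBoard.getResult();
}

// Usage
var startBoard = new NimBoard(10, 0);
var randomResult = playout(startBoard, Math.random);
```

Passing a fixed function instead of `Math.random` makes a playout
deterministic, which is handy for testing.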

## Backpropagation

The objective of the __Backpropagation__ is to update the statistics of all
nodes along the search tree path in reverse order until the root node is
reached. The Simulation did not change the search tree path. Thus the
result played or predicted by the playout can be used to update the
statistics, starting at the leaf node of the search tree path and moving
via the parent nodes up to the root node.

```
var node = root;
var variantBoard = board.copy();
/* Selection */
...
/* Expansion */
...
/* Simulation */
...
/* Backpropagation */
var result = variantBoard.getResult();
while(node) {
  node.update(result);
  node = node.parentNode;
}
```

In UCThello the UCT AI player does not maximize the number of
discs of its own color on the board.
Instead it evaluates the end of game situation only in terms of
win, loss, or draw. The call variantBoard.getResult()
returns an array of length two.
The two values represent the game result of the two players.
The winning player gets a full point while the opponent scores
zero points, so the result is either [ 1, 0 ] or [ 0, 1 ].
A draw or stalemate situation is represented as the array
[ 0.5, 0.5 ]. This means a draw is better than a loss and
is interpreted as half a win for both players.
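
As a small illustration of this scoring, the sketch below maps final disc
counts to such a result array. The helper `resultFromDiscs` and its
parameters are hypothetical and not part of UCThello. It only shows that
the margin of victory does not influence the result.

```javascript
// Hypothetical helper: map the final disc counts of player 0 and
// player 1 to the [player0, player1] result array described above.
function resultFromDiscs(discs0, discs1) {
  if (discs0 > discs1) { return [1, 0]; }
  if (discs1 > discs0) { return [0, 1]; }
  return [0.5, 0.5];
}

// Usage: a narrow win and a clear win score the same.
var narrowWin = resultFromDiscs(33, 31);
var clearWin = resultFromDiscs(58, 6);
var draw = resultFromDiscs(32, 32);
```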

The statistics of a node are updated by node.update(result).
Mind that a search tree node represents a move of the active
player according to the rules. The update picks the active
player's end of game result from the result array and adds it
to a statistics value representing the total number of wins
found when traversing the search tree node over all MCTS iterations.
Additionally, the number of visits of the node is increased.
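
A minimal sketch of such an update is given below. It is an assumption for
illustration and not the UCThello code: `StatNode`, its `player` field
(index of the player whose move the node represents), and the fields `wins`
and `visits` are hypothetical names.

```javascript
// Hypothetical node; `player` is the index (0 or 1) of the player
// whose move this node represents.
function StatNode(parentNode, player) {
  this.parentNode = parentNode;
  this.player = player;
  this.wins = 0;
  this.visits = 0;
}

// Add this node's player's share of the result and count the visit.
StatNode.prototype.update = function (result) {
  this.wins += result[this.player];
  this.visits++;
};

// Walk from the given node up to the root, updating each node.
function backpropagate(node, result) {
  while (node) {
    node.update(result);
    node = node.parentNode;
  }
}

// Usage: a path of three nodes with alternating players.
var top = new StatNode(null, 1);
var middle = new StatNode(top, 0);
var bottom = new StatNode(middle, 1);
backpropagate(bottom, [1, 0]);      // player 0 won this playout
backpropagate(bottom, [0.5, 0.5]);  // this playout ended in a draw
```

After these two playouts every node on the path has two visits, while
`middle` has collected 1.5 wins and `bottom` 0.5 wins.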

# References

* __[Cha10]__ Guillaume Maurice Jean-Bernard Chaslot, "[Monte-Carlo Tree Search](https://project.dke.maastrichtuniversity.nl/games/files/phd/Chaslot_thesis.pdf)", PhD thesis, Universiteit Maastricht, NL, 2010.
* __[CBSS08]__ Guillaume Chaslot, Sander Bakkes, Istvan Szita and Pieter Spronck, "[Monte-Carlo Tree Search: A New Framework for Game AI](http://sander.landofsand.com/publications/AIIDE08_Chaslot.pdf)", in Proceedings of the Fourth Artificial Intelligence and Interactive Digital Entertainment Conference, Stanford, California, 2008. Published by The AAAI Press, Menlo Park, California.
* __[KS06]__ Levente Kocsis, Csaba Szepesvári, "[Bandit based Monte-Carlo Planning](http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.102.1296)", In European Conference on Machine Learning (ECML) 2006, Lecture Notes in Artificial Intelligence 4212, pp. 282–293, 2006.
* __[CSUB06]__ Guillaume Chaslot, Jahn-Takeshi Saito, Jos W.H.M. Uiterwijk, Bruno Bouzy, H. Jaap van den Herik, "[Monte-Carlo Strategies for Computer Go](http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.97.8924)", In Proceedings of the 18th Belgian-Dutch Conference on Artificial Intelligence, pp. 83–90, 2006.
* Brian Rose, "[Othello. A Minute to Learn... A Lifetime to Master](http://www.ffothello.org/livres/othello-book-Brian-Rose.pdf)", 2005.
* __[WOF14]__ World Othello Federation, "[World Othello Championship Rules](http://www.worldothello.nu/sites/default/files/field/image/wocrules2014.pdf)", as valid for the 39th World Othello Championship 2015, Cambridge, UK, October 2015.

# 3rd Party Libraries

* jQuery: MIT licensed, https://github.com/jquery/jquery
* jQuery Mobile: MIT licensed, https://github.com/jquery/jquery-mobile

# Links

* Association for the Advancement of Artificial Intelligence, http://www.aaai.org
* HTML Living Standard, Web Workers, https://html.spec.whatwg.org
* The Othello Museum, http://www.beppi.it/public/OthelloMuseum/pages/history.php

## Othello Organizations
Mind that UCThello follows (most of) the official tournament rules of the
listed organizations depending on your selected options. Still, UCThello is
developed independently of any work of these organizations.

* World Othello Federation, http://www.worldothello.org
* Australian Othello Federation, http://www.othello.asn.au
* British Othello Federation, http://www.britishothello.org.uk
* Dansk Othello Forbund, http://www.othello.dk
* Fédération Française d’Othello, http://www.ffothello.org
* Federazione Nazionale Gioco Othello, Italia, http://www.fngo.it
* Malaysia Othello Association (MOA), http://z12.invisionfree.com/MOA/index.php
* Othello Club Deutschland, http://www.othello-club.de.vu
* United States Othello Association (USOA), http://www.usothello.org

# Contributors / Authors

Oliver Merkel

This image is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

_All logos, brands and trademarks mentioned belong to their respective owners._
