https://github.com/leekwoon/KR-DL-UCT

[ICML 2018] Deep Reinforcement Learning in Continuous Action Spaces: a Case Study in the Game of Simulated Curling

Host: GitHub
URL: https://github.com/leekwoon/KR-DL-UCT
Owner: leekwoon
License: gpl-3.0
Created: 2018-05-30T09:33:03.000Z (over 7 years ago)
Default Branch: master
Last Pushed: 2018-07-20T10:35:08.000Z (over 7 years ago)
Last Synced: 2024-11-17T13:38:21.396Z (11 months ago)
Language: Python
Homepage:
Size: 35.4 MB
Stars: 35
Watchers: 4
Forks: 3
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: COPYING


# KR-DL-UCT

This repository provides the source code for the KR-DL-UCT algorithm presented in the following paper:

[Deep Reinforcement Learning in Continuous Action Spaces: a Case Study in the Game of Simulated Curling](http://proceedings.mlr.press/v80/lee18b/lee18b.pdf) by Kyowoon Lee, Sol-A Kim, Jaesik Choi and Seong-Whan Lee in [ICML-2018](https://icml.cc/Conferences/2018)

## Abstract
Many real-world applications of reinforcement learning require an agent to select optimal actions from continuous action spaces. Recently, deep neural networks have successfully been applied to games with discrete action spaces. However, deep neural networks for discrete actions are not suitable for devising strategies for games in which a very small change in an action can dramatically affect the outcome. In this paper, we present a new framework which incorporates a deep neural network that can be used to learn game strategies based on a kernel-based Monte Carlo tree search that finds actions within a continuous space. To avoid hand-crafted features, we train our network using supervised learning followed by reinforcement learning with a high-fidelity simulator for the Olympic sport of curling. The program trained under our framework outperforms existing programs equipped with several hand-crafted features and won an international digital curling competition.
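To make the idea of a kernel-based search over continuous actions more concrete, the sketch below shows one common way to share statistics between nearby actions: a kernel-regression value estimate combined with a UCB-style exploration bonus. It is an illustration only and is not taken from this repository. The function names, the Gaussian kernel, the `bandwidth` and `exploration` parameters, and the `stats` dictionary layout are all assumptions made for this example, and it assumes every stored action has at least one visit.

```python
import math


def gaussian_kernel(a, b, bandwidth=0.5):
    """Similarity between two actions given as equal-length tuples of floats."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def kernel_estimates(action, stats, bandwidth=0.5):
    """Return (value estimate, effective visit count) for `action`.

    `stats` is a hypothetical layout: it maps each action already tried
    to a pair (visit_count, mean_value).
    """
    weighted_value = 0.0
    density = 0.0
    for other, (visits, mean_value) in stats.items():
        k = gaussian_kernel(action, other, bandwidth)
        weighted_value += k * visits * mean_value
        density += k * visits
    value = weighted_value / density if density > 0 else 0.0
    return value, density


def select_action(stats, exploration=1.0, bandwidth=0.5):
    """Pick the tried action with the best kernel-smoothed UCB score."""
    total_density = sum(kernel_estimates(a, stats, bandwidth)[1] for a in stats)

    def score(a):
        value, density = kernel_estimates(a, stats, bandwidth)
        # Exploration bonus shrinks as the neighbourhood of `a` is visited more.
        return value + exploration * math.sqrt(math.log(total_density) / density)

    return max(stats, key=score)
```

With `exploration=0.0` the selection reduces to choosing the action with the highest smoothed value. A larger `exploration` favours actions whose neighbourhood has been sampled less often.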

## Prerequisites
- Python 2.7 or Python 3.3+
- [TensorFlow](https://www.tensorflow.org/?hl=en)
- [Cython](https://cython.readthedocs.io/en/latest/)
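Before installing, it can help to confirm that the prerequisites above are visible to the interpreter you plan to use. The following is a minimal sketch, not part of this repository: the helper names are hypothetical, and it relies on `importlib.util.find_spec`, so the check itself needs Python 3.4 or newer even though the project also lists Python 2.7.

```python
import importlib.util
import sys


def missing_prerequisites(modules=("tensorflow", "Cython")):
    """Return the top-level module names that cannot be found."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


def python_supported(version=sys.version_info):
    """True for Python 2.7 or Python 3.3+, the versions listed above."""
    major_minor = tuple(version[:2])
    return major_minor == (2, 7) or major_minor >= (3, 3)


if __name__ == "__main__":
    print("Python version supported:", python_supported())
    print("Missing modules:", missing_prerequisites() or "none")
```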

## Install

To get our code:

```bash
git clone --recursive https://github.com/leekwoon/KR-DL-UCT.git
```

To install:

```bash
python setup.py install build_ext --inplace
```

## Running examples

You can run a game played by our algorithm with the command below. The game log is written to `./data`.

```bash
python -m src.tests.game_test
```
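After a run, you may want to find the log that was just written. The sketch below simply returns the most recently modified file under `./data`. It is a convenience helper that is not part of this repository, and it makes no assumption about the log file names or format, which are not documented here.

```python
import os


def newest_file(directory="./data"):
    """Return the most recently modified file under `directory`, or None."""
    candidates = []
    for root, _dirs, files in os.walk(directory):
        for name in files:
            candidates.append(os.path.join(root, name))
    if not candidates:
        # Covers both an empty directory and a directory that does not exist.
        return None
    return max(candidates, key=os.path.getmtime)


if __name__ == "__main__":
    print(newest_file())
```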

You can download the [latest simulator](http://minerva.cs.uec.ac.jp/curling/wiki.cgi?page=%A5%C0%A5%A6%A5%F3%A5%ED%A1%BC%A5%C9) and watch the game from the log file.

![Simulated curling game replay 1](assets/1.gif)
![Simulated curling game replay 2](assets/2.gif)

A description of the simulator is available at http://minerva.cs.uec.ac.jp/curling_en/wiki.cgi?page=Description+of+each+part

## Authors

[Kyowoon Lee](http://sail.unist.ac.kr/members/)\*¹(leekwoon@unist.ac.kr), [Sol-A Kim](http://sail.unist.ac.kr/members/)\*¹(sol-a@unist.ac.kr), [Jaesik Choi](http://sail.unist.ac.kr/members/jaesik/)¹(jaesik@unist.ac.kr), [Seong-Whan Lee](http://ibi.korea.ac.kr/sub2_1.php?code=LSW)²(sw.lee@korea.ac.kr)

¹[UNIST](http://www.unist.ac.kr/) @ Department of Computer Engineering, UNIST, Ulsan, Republic of Korea

²[Korea University](http://www.korea.ac.kr/mbshome/mbs/en/index.do) @ Department of Brain and Cognitive Engineering, Korea University, Seoul, Republic of Korea

\* Equal contribution
