https://github.com/instance01/bootlegalphazero
AlphaZero written in C++, for research. Includes some tangential goodies such as imitation learning, Python implementations and a plethora of experiments.
https://github.com/instance01/bootlegalphazero
Last synced: 5 months ago
JSON representation
AlphaZero written in C++, for research. Includes some tangential goodies such as imitation learning, Python implementations and a plethora of experiments.
- Host: GitHub
- URL: https://github.com/instance01/bootlegalphazero
- Owner: instance01
- Created: 2020-04-24T17:34:34.000Z (almost 6 years ago)
- Default Branch: master
- Last Pushed: 2020-08-12T16:47:59.000Z (over 5 years ago)
- Last Synced: 2025-01-20T00:35:50.676Z (about 1 year ago)
- Language: Python
- Homepage:
- Size: 81.6 MB
- Stars: 4
- Watchers: 2
- Forks: 0
- Open Issues: 1
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
This is a bootleg version of AlphaZero written in C++17, purely using only the PyTorch C++ Frontend. It also includes some work leading up to it such as MCTS, imitation learning and a Python implementation (which I stopped working with due to performance). I tried a kind of bridge whereas the C++ code is the main program which only calls gym routines in Python using the Python C API, but unfortunately that turned out to be too slow. Thus, I went for a pure C++ version and included a few environments from [openai/gym](https://github.com/openai/gym/) rewritten in C++ (see the envs folder).
Below is the result of training 10 times on MountainCar using bootleg AlphaZero with parameter configuration 127. More current configurations can be seen [here](https://github.com/instance01/BootlegAlphaZero/blob/master/alphazero/cpp_impl/results.md).
Quite the variance, and takes ages to learn (roughly 6 hours to be more precise). Needs more work.
## Setup with Docker
These are instructions for the C++ version.
1. Go to `alphazero/contrib` and build the Docker image: `sudo docker build -t grab0 -f Dockerfile .`.
2. Go to `alphazero/cpp_impl` and run the Docker image: `sudo docker run -v $(pwd):/app --privileged -it grab0 bash`.
3. In the Docker image execute `setup` to compile.
4. BootlegAlphaZero can be run as `./GRAB0 `, e.g. `./GRAB0 mtcar 133`. All parameters are listed in `simulations.json`.
## Setup without Docker
This is for Debian Buster. This could be automated at some point.
1. Go to `alphazero/contrib`.
2. Run `printf "deb http://httpredir.debian.org/debian buster-backports main non-free\ndeb-src http://httpredir.debian.org/debian buster-backports main non-free" > /etc/apt/sources.list.d/backports.list`.
3. Run `apt-get update --allow-releaseinfo-change && apt-get install -t buster-backports -y g++ vim gdb cmake python3-dev wget unzip git libprotobuf-dev libprotobuf17 protobuf-compiler nlohmann-json3-dev`.
4. Run `pip3 install pytest numpy cython torch gym gym-minigrid git+https://github.com/instance01/gym-mini-envs.git`.
5. Go to `alphazero/cpp_impl`.
6. Run `cmake . && cmake --build .`.