https://github.com/patwie/tfgo
Independent, efficient re-implementation of the AlphaGo SL policy network
- Host: GitHub
- URL: https://github.com/patwie/tfgo
- Owner: PatWie
- Created: 2017-06-18T18:50:50.000Z (over 8 years ago)
- Default Branch: master
- Last Pushed: 2017-07-03T21:08:26.000Z (over 8 years ago)
- Last Synced: 2025-03-22T11:43:35.261Z (7 months ago)
- Topics: alphago, deep-learning, swig, tensorflow
- Language: C
- Homepage:
- Size: 464 KB
- Stars: 3
- Watchers: 3
- Forks: 3
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Policy-Network (SL) from AlphaGO in TensorFlow
Yet another re-implementation of the policy network (supervised learning) from DeepMind's AlphaGo. This implementation uses a C++ backend to compute the feature planes presented in the [Nature paper](https://gogameguru.com/i/2016/03/deepmind-mastering-go.pdf) and a custom file format for efficient storage. To train the network it uses the dataflow and multi-GPU setup of [TensorPack](https://github.com/ppwwyyxx/tensorpack).
# Data + Features
See [here](https://u-go.net/gamerecords/) or [here](https://www.u-go.net/gamerecords-4d/) to get a database of Go games in the SGF file format. It is also possible to buy the `GoGoD` database, which consists of `89942` games (excluding games before 1800). Some statistics about this database:
- 2052 out of 89942 games are corrupt (~2.28%)
- 2908 out of 89942 games are amateur games (~3.23%)
- 17,676,038 moves in total (professional games only)
- 207 moves per game on average
- u-go.net provides 1,681,414 files, including some amateur games.

To handle these games efficiently, we convert them to a binary format by

```bash
python reader.py --action convert --pattern "/tmp/godb/Database/*/*.sgf"
```
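The actual binary layout is defined in `reader.py`; as a rough illustration of why this pays off, each move can be packed into two bytes instead of the verbose SGF text. The encoding below is a toy sketch, not the real format:

```python
# Toy sketch of SGF-to-binary conversion; the real format in reader.py differs.
import re
import struct

def sgf_moves_to_bytes(sgf_text):
    """Pack each move ';B[pd]' / ';W[dp]' into two bytes (x, y)."""
    out = bytearray()
    # SGF coordinates 'a'..'s' map to 0..18 on a 19x19 board.
    for _color, x, y in re.findall(r';([BW])\[([a-s])([a-s])\]', sgf_text):
        out += struct.pack('BB', ord(x) - ord('a'), ord(y) - ord('a'))
    return bytes(out)

# A 200-move game shrinks to 400 bytes; colors alternate in this toy
# encoding, so they need no extra storage.
```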
Now, to merge all games into a single file, we dump them to an LMDB file (with a train/val/test split of the games):

```bash
python go_db.py --lmdb "/tmp/godb/" --pattern "/tmp/godb/Database/*/*.sgfbin" --action create
```
I do not split the positions into train/val/test; I split the games, to make sure the splits are totally independent (a position from one game never appears in two splits). All training data compresses to just 1.1 GB / 55 MB and validation data to just 120 MB / 6.2 MB for the u-go.net / GoGoD databases, respectively. A sketch of this game-level split is shown below.
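A minimal sketch of what such a game-level split into LMDB can look like, using the `lmdb` Python package; the keys, serialization, and split ratios that `go_db.py` actually uses are assumptions here:

```python
# Sketch only: go_db.py's actual keys/serialization/split may differ.
import glob
import random
import lmdb

files = glob.glob('/tmp/godb/Database/*/*.sgfbin')
random.Random(42).shuffle(files)      # shuffle whole games, not positions
n_hold = len(files) // 20             # assumed 5% val, 5% test
val = files[:n_hold]
test = files[n_hold:2 * n_hold]
train = files[2 * n_hold:]

env = lmdb.open('/tmp/godb/go_train.lmdb', map_size=2 ** 32)
with env.begin(write=True) as txn:
    for i, fname in enumerate(train):
        with open(fname, 'rb') as f:
            txn.put('{:08d}'.format(i).encode('ascii'), f.read())
env.close()
```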
To simulate the board position from the encoded moves, we set up the SWIG-Python binding `goplanes` for the C++ implementation by:

```bash
cd go-engine && python setup.py install --user
```
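After installation, the binding can be imported from Python. The function name and signature below are hypothetical (check the SWIG interface in `go-engine/` for the real API); the shape follows the 48 feature planes of the Nature paper:

```python
# Hypothetical usage sketch; function name and signature are assumptions.
import numpy as np
import goplanes  # SWIG module built by go-engine/setup.py

# Moves of one game, decoded from the binary format.
moves = np.array([[15, 3], [3, 15], [16, 16]], dtype=np.int32)

# Fill a (48, 19, 19) array with feature planes for the current position.
planes = np.zeros((48, 19, 19), dtype=np.float32)
goplanes.planes_from_position(moves, planes)  # hypothetical name
```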
This generates all feature planes from positions randomly extracted from the database, including all rotations (12x8 inputs/sec). On a 6-core machine this gives around 100x8 positions per second. I verified this implementation against all final positions from GoGoD, simulated in both GnuGo and `goplanes`.
```bash
python go_db.py --lmdb "/home/patwie/godb/go_train.lmdb" --action benchmark
```
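TensorPack's own `TestDataSpeed` can reproduce such a throughput measurement on any dataflow; how the LMDB file is loaded below is an assumption, since TensorPack's LMDB helpers changed across versions:

```python
# Sketch: measure how many datapoints/sec the LMDB-backed dataflow yields.
from tensorpack.dataflow import LMDBDataPoint, TestDataSpeed

df = LMDBDataPoint('/home/patwie/godb/go_train.lmdb', shuffle=True)
TestDataSpeed(df, size=5000).start()  # prints datapoints/sec
```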
# Training
To train the version with `128` filters, just fire up:

```bash
python tfgo.py --gpu 0,1 --k 128 --path /tmp   # or --gpu 0 for a single GPU
```
I saw no big difference on a small number of GPUs; this uses synchronous rather than asynchronous training. It will also create checkpoints for the best-performing models from the validation phase.
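For reference, a minimal sketch of how a synchronous multi-GPU run is launched in TensorPack; the model/dataflow names and the monitored statistic are assumptions here, and `tfgo.py` wires this up internally:

```python
# Sketch of a TensorPack sync multi-GPU launch (assumes TensorPack >= 0.9;
# tfgo.py from 2017 may use the older trainer interface).
from tensorpack import TrainConfig, launch_train_with_config
from tensorpack.callbacks import ModelSaver, MaxSaver
from tensorpack.train import SyncMultiGPUTrainerParameterServer

config = TrainConfig(
    model=model,          # the policy-network Model defined in tfgo.py
    dataflow=train_df,    # LMDB-backed training dataflow
    callbacks=[
        ModelSaver(),                     # periodic checkpoints
        MaxSaver('validation_accuracy'),  # keep the best validation model
    ],
    max_epoch=100,
)
# Gradients are averaged synchronously across both GPUs each step.
launch_train_with_config(config, SyncMultiGPUTrainerParameterServer([0, 1]))
```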
TensorBoard should show something like the following:

*[TensorBoard screenshot]*