{"id":15633402,"url":"https://github.com/maxpumperla/scalphagozero","last_synced_at":"2025-04-14T12:52:16.690Z","repository":{"id":78427753,"uuid":"142998606","full_name":"maxpumperla/ScalphaGoZero","owner":"maxpumperla","description":"An independent implementation of DeepMind's AlphaGoZero in Scala, using Deeplearning4J (DL4J)","archived":false,"fork":false,"pushed_at":"2020-04-05T14:37:47.000Z","size":83841,"stargazers_count":156,"open_issues_count":2,"forks_count":24,"subscribers_count":14,"default_branch":"master","last_synced_at":"2025-03-28T02:01:47.065Z","etag":null,"topics":["artificial-intelligence","deep-learning","deep-reinforcement-learning","dl4j","keras","python","scala"],"latest_commit_sha":null,"homepage":"","language":"Scala","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/maxpumperla.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":".github/CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-07-31T10:22:09.000Z","updated_at":"2024-03-17T22:31:05.000Z","dependencies_parsed_at":"2023-07-16T17:30:12.343Z","dependency_job_id":null,"html_url":"https://github.com/maxpumperla/ScalphaGoZero","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/maxpumperla%2FScalphaGoZero","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/maxpumperla%2FScalphaGoZero/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/maxpumperla%2FScalphaGoZero/releases","manifests_url":"https://repos.ecosyste.ms/ap
i/v1/hosts/GitHub/repositories/maxpumperla%2FScalphaGoZero/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/maxpumperla","download_url":"https://codeload.github.com/maxpumperla/ScalphaGoZero/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248885306,"owners_count":21177618,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["artificial-intelligence","deep-learning","deep-reinforcement-learning","dl4j","keras","python","scala"],"created_at":"2024-10-03T10:49:03.878Z","updated_at":"2025-04-14T12:52:16.663Z","avatar_url":"https://github.com/maxpumperla.png","language":"Scala","funding_links":[],"categories":[],"sub_categories":[],"readme":"# ScalphaGoZero [![Build Status](https://travis-ci.org/maxpumperla/ScalphaGoZero.svg?branch=master)](https://travis-ci.org/maxpumperla/ScalphaGoZero)\n\nScalphaGoZero is an independent implementation of DeepMind's AlphaGo Zero in Scala, \nusing [Deeplearning4J (DL4J)](https://deeplearning4j.org/) to run neural networks. \nYou can either run experiments with models built in [DL4J directly](https://github.com/maxpumperla/ScalphaGoZero/blob/master/src/main/scala/org/deeplearning4j/scalphagozero/models/DualResnetModel.scala) \nor import prebuilt [Keras models](https://github.com/maxpumperla/ScalphaGoZero/blob/master/src/main/scala/org/deeplearning4j/scalphagozero/models/KerasModel.scala).\n\nScalphaGoZero is mainly an engineering effort to demonstrate how complex and successful systems\nin machine learning are not bound to Python anymore. 
With access to powerful tools like ND4J for\nadvanced maths, DL4J for neural networks, and the mature infrastructure of the JVM, languages\nlike Scala can offer a viable alternative for data scientists. \n\nThis project is a Scala port of the AlphaGo Zero module found in \n[Deep Learning and the Game of Go](https://github.com/maxpumperla/deep_learning_and_the_game_of_go).\n\n## Getting started\n\nTo run after cloning, do\n\n```bash\ncd ScalphaGoZero\nsbt run\n```\n\n[This application](https://github.com/maxpumperla/ScalphaGoZero/blob/master/src/main/scala/org/deeplearning4j/scalphagozero/demo/ScalphaGoZero.scala) \nwill set up two opponents, simulate some specified number of games between them using the\nAlphaGo Zero methodology and train one of the opponents with the experience data\ngained from the games. You can also accumulate training experience in a saved model and use that model later.\n\nTo use Keras model import you need to generate the resources first:\n\n```bash\ncd src/test/python\npip install tensorflow keras\npython generate_h5_resources.py\n```\n\nThe generated, serialized Keras models are put into `src/main/resources` and are picked up\nby the `KerasModel` class, as [demonstrated in our tests](https://github.com/maxpumperla/ScalphaGoZero/blob/master/src/test/scala/org/deeplearning4j/scalphagozero/models/TestKerasImport.scala).\n \n### GTP Client\n\nIt is possible to build a fat executable jar that can be used as a GTP client for Go front ends like [Sabaki](https://sabaki.yichuanshen.de/). First run `sbt assembly` to build the fat jar. The engine can be run as\n```\njava -jar target/scala-2.12/ScalphaGoZero-assembly-1.0.1.jar gtp\n```\nHowever, you should use the gtpClient.bat (on Windows) when configuring Sabaki. \nGo to *Engines | Manage Engines* in the top menu, then add a ScalphaGoZero engine as shown here.\n\n![add scalphaGoZero engine to Sabaki](images/sabaki_manage_engines.PNG)\n\nThen, from the menu, do *File | New*. 
\nThis allows you to configure a new game. For black, white, or both, select the ScalphaGoZero engine\nthat you just added and play the game. \nThe engine will attempt to load the model file from the `models` directory based on the board size specified. Currently, there is only a model file for 5x5 games: `models/model_size_5_layers_2.model`.\n\n![Sabaki play with ScalphaGoZero engine](images/sabaki_gtp_game.PNG)\n\nWhen running, information and errors are logged to `output.log`.\n\n## Core Concepts\n\nQuite a few concepts are needed to build an AlphaGo Zero system. ScalphaGoZero is intended\nas a developer-friendly approach with clean abstractions to help users get \nstarted. Many of the concepts used here can be reused for other games; only the basics are\nspecific to the game of Go.\n\n- Basics: To let a computer play a game you need to code the basics of the game, including \nwhat a [Go board](https://github.com/maxpumperla/ScalphaGoZero/blob/master/src/main/scala/org/deeplearning4j/scalphagozero/board/GoBoard.scala),\na [player](https://github.com/maxpumperla/ScalphaGoZero/blob/master/src/main/scala/org/deeplearning4j/scalphagozero/board/Player.scala),\na [move](https://github.com/maxpumperla/ScalphaGoZero/blob/master/src/main/scala/org/deeplearning4j/scalphagozero/board/Move.scala) or \nthe current [game state](https://github.com/maxpumperla/ScalphaGoZero/blob/master/src/main/scala/org/deeplearning4j/scalphagozero/board/GameState.scala) is.\nNotably, the [Zobrist hashing technique](https://github.com/maxpumperla/ScalphaGoZero/blob/master/src/main/scala/org/deeplearning4j/scalphagozero/board/ZobristHashing.scala)\nis implemented in the Go board class to speed up computation. 
\n- [Encoders](https://github.com/maxpumperla/ScalphaGoZero/blob/master/src/main/scala/org/deeplearning4j/scalphagozero/encoders/Encoder.scala): \nGame states and moves need to be translated into something a neural network can\nuse for training and predictions, namely tensors. We use [ND4J](https://deeplearning4j.org/docs/latest/nd4j-overview)\nfor this encoding step. AlphaGo Zero needs a specific [ZeroEncoder](https://github.com/maxpumperla/ScalphaGoZero/blob/master/src/main/scala/org/deeplearning4j/scalphagozero/encoders/ZeroEncoder.scala),\nbut many other encoders are feasible and can be implemented by the user.\n- [Agents](https://github.com/maxpumperla/ScalphaGoZero/blob/master/src/main/scala/org/deeplearning4j/scalphagozero/agents/Agent.scala):\nA Go-playing agent knows how to play a game by selecting the next move and handles game state information\ninternally. For AlphaGo Zero you need a [ZeroAgent](https://github.com/maxpumperla/ScalphaGoZero/blob/master/src/main/scala/org/deeplearning4j/scalphagozero/agents/ZeroAgent.scala),\nbut other agents with simpler methodology can also lead to decent results.\n- [Models](https://github.com/maxpumperla/ScalphaGoZero/tree/master/src/main/scala/org/deeplearning4j/scalphagozero/models):\nTo select a move, agents need machine learning models to predict the value of the current position (value function)\nand how promising each candidate next move is (policy function). In AlphaGo Zero both of these\ncomponents are integrated into one deep neural network with a so-called policy head and value head.\nWe implemented this [model in DL4J here](https://github.com/maxpumperla/ScalphaGoZero/blob/master/src/main/scala/org/deeplearning4j/scalphagozero/models/DualResnetModel.scala).\nTo start with, you might want to work with simpler models. 
Any model that accepts encoded states and produces outputs of\nthe right shape can be used within this framework.\n- [Scoring](https://github.com/maxpumperla/ScalphaGoZero/tree/master/src/main/scala/org/deeplearning4j/scalphagozero/scoring):\nTo play actual games, agents need the ability to estimate scores at the end of a game to decide \nwho won and reinforce the signals leading to victory (and weaken those leading to defeat). This\nincludes territory estimation and reporting game results.\n- [Experience](https://github.com/maxpumperla/ScalphaGoZero/tree/master/src/main/scala/org/deeplearning4j/scalphagozero/experience):\nWhen opponents play many games against each other, they generate game play data, or experience,\nthat can be used for training the agents. We use [experience collectors](https://github.com/maxpumperla/ScalphaGoZero/blob/master/src/main/scala/org/deeplearning4j/scalphagozero/experience/ExperienceCollector.scala) \nto store this information.\n- [Simulation](https://github.com/maxpumperla/ScalphaGoZero/blob/master/src/main/scala/org/deeplearning4j/scalphagozero/simulation/ZeroSimulator.scala):\nThe last piece needed to run your own AlphaGo Zero is to create a simulation between two `ZeroAgent`\ninstances. The simulation stores experience data and lets your agents learn from it, so they\nbecome better players over time.\n\n## Contribute\n\nScalphaGoZero can be improved in many ways. Here are a few examples:\n\n- Experience collectors build one large ND4J array, which does not scale to large experiments.\nThis should be refactored into an iterator that only provides you with the next batch\nneeded for training.\n- Test coverage can be vastly improved. The basics are covered, but many\nedge cases are probably still untested.\n- Running a larger experiment and storing the weights somewhere freely accessible to users\nwould help users get started and see reasonable results right away.\n- Building a demo with a user interface would be nice. 
Agents could be wrapped in an HTTP server,\nfor instance, and connected to a web interface so humans can play against their bots.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmaxpumperla%2Fscalphagozero","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmaxpumperla%2Fscalphagozero","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmaxpumperla%2Fscalphagozero/lists"}