{"id":28753554,"url":"https://github.com/google-deepmind/alphadev","last_synced_at":"2025-06-17T00:40:59.149Z","repository":{"id":173412728,"uuid":"647850552","full_name":"google-deepmind/alphadev","owner":"google-deepmind","description":null,"archived":false,"fork":false,"pushed_at":"2023-06-20T08:36:12.000Z","size":34,"stargazers_count":662,"open_issues_count":7,"forks_count":67,"subscribers_count":11,"default_branch":"main","last_synced_at":"2024-04-16T04:53:36.750Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/google-deepmind.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null}},"created_at":"2023-05-31T16:49:19.000Z","updated_at":"2024-04-03T09:54:31.000Z","dependencies_parsed_at":"2023-09-07T20:34:43.292Z","dependency_job_id":null,"html_url":"https://github.com/google-deepmind/alphadev","commit_stats":null,"previous_names":["deepmind/alphadev","google-deepmind/alphadev"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/google-deepmind/alphadev","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/google-deepmind%2Falphadev","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/google-deepmind%2Falphadev/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/google-deepmind%2Falphadev/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/google-deepmind%2Falphadev/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/google-deepmind","download_url":"https://codeload.
github.com/google-deepmind/alphadev/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/google-deepmind%2Falphadev/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":260268635,"owners_count":22983601,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-06-17T00:40:42.749Z","updated_at":"2025-06-17T00:40:59.139Z","avatar_url":"https://github.com/google-deepmind.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# AlphaDev\n\nThis repository contains relevant pseudocode and algorithms for the publication\n\"Faster sorting algorithms discovered using deep reinforcement learning\".\n\nThe repository contains two modules:\n\n- `alphadev.py` with the pseudocode for the AlphaDev agent and the Assembly Game RL environment\n- `sort_functions_test.cc` with the discovered assembly programs and tests of their correctness. 
This includes the following:\n    - `Sort3AlphaDev` sorts 3 elements with 17 instructions\n    - `Sort4AlphaDev` sorts 4 elements with 28 instructions\n    - `Sort5AlphaDev` sorts 5 elements with 43 instructions\n    - `Sort6AlphaDev` sorts 6 elements with 57 instructions\n    - `Sort7AlphaDev` sorts 7 elements with 76 instructions\n    - `Sort8AlphaDev` sorts 8 elements with 91 instructions\n    - `VarSort3AlphaDev` sorts up to 3 elements with 25 instructions\n    - `VarSort4AlphaDev` sorts up to 4 elements with 57 instructions\n    - `VarSort5AlphaDev` sorts up to 5 elements with 80 instructions\n\n## Installation\n\nThe code present in `alphadev.py` is pseudocode intended to simplify reproduction.\nAs such, no installation is required for the pseudocode.\n\nTo test the discovered assembly programs, you need to [install bazel](https://docs.bazel.build/versions/main/install.html) and verify that it builds correctly (we only support Linux with clang, but other\nplatforms might work).\n\n## Usage\n\n`alphadev.py` contains logic for the RL environment, AlphaDev agent and the\nAssembly Game. The main components are:\n\n- `AssemblyGame` represents the Assembly Game RL environment. The state of\nthe RL environment contains the current program and the state of memory and\nregisters. Taking a step in this environment is equivalent to adding a new\nassembly instruction to the program (see the `step` method). The reward is a\ncombination of correctness and latency rewards after executing the assembly\nprogram over an input distribution. To keep the overall algorithm simple, we\ndo not include the assembly runner, but assembly execution can be delegated\nto an external library (e.g. [AsmJit](https://github.com/asmjit/asmjit)).\n- `AlphaDevConfig` contains the main hyperparameters used for the AlphaDev\n  agent. This includes configuration of AlphaZero, MCTS, and underlying\n  networks.\n- `play_game` contains the logic to run an AlphaDev game. 
This includes the MCTS\nprocedure and the storage of the game.\n- `RepresentationNet` and `PredictionNet` contain the implementation of the\n  networks used in the AlphaZero algorithm. It uses a [MultiQuery\n  Transformer](https://arxiv.org/abs/1911.02150) to represent assembly\n  instructions.\n\n\nTo run the assembly test in `sort_functions_test.cc`, use the following command:\n`CC=clang bazel test :sort_functions_test`\n\n## Citing this work\n\n```bibtex\n@Article{AlphaDev2023,\n  author  = {Mankowitz, Daniel J. and Michi, Andrea and Zhernov, Anton and Gelmi, Marco and Selvi, Marco and Paduraru, Cosmin and Leurent, Edouard and Iqbal, Shariq and Lespiau, Jean-Baptiste and Ahern, Alex and Koppe, Thomas and Millikin, Kevin and Gaffney, Stephen and Elster, Sophie and Broshear, Jackson and Gamble, Chris and Milan, Kieran and Tung, Robert and Hwang, Minjae and Cemgil, Taylan and Barekatain, Mohammadamin and Li, Yujia and Mandhane, Amol and Hubert, Thomas and Schrittwieser, Julian and Hassabis, Demis and Kohli, Pushmeet and Riedmiller, Martin and Vinyals, Oriol and Silver, David},\n  journal = {Nature},\n  title   = {Faster sorting algorithms discovered using deep reinforcement learning},\n  year    = {2023},\n  volume  = {618},\n  number  = {7964},\n  pages   = {257--263},\n  doi     = {10.1038/s41586-023-06004-9}\n}\n```\n\n\n## License and disclaimer\n\nCopyright 2022 DeepMind Technologies Limited\n\nAll software is licensed under the Apache License, Version 2.0 (Apache 2.0);\nyou may not use this file except in compliance with the Apache 2.0 license.\nYou may obtain a copy of the Apache 2.0 license at:\nhttps://www.apache.org/licenses/LICENSE-2.0\n\nAll other materials are licensed under the Creative Commons Attribution 4.0\nInternational License (CC-BY). 
You may obtain a copy of the CC-BY license at:\nhttps://creativecommons.org/licenses/by/4.0/legalcode\n\nUnless required by applicable law or agreed to in writing, all software and\nmaterials distributed here under the Apache 2.0 or CC-BY licenses are\ndistributed on an \"AS IS\" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND,\neither express or implied. See the licenses for the specific language governing\npermissions and limitations under those licenses.\n\nThis is not an official Google product.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgoogle-deepmind%2Falphadev","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgoogle-deepmind%2Falphadev","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgoogle-deepmind%2Falphadev/lists"}