{"id":13528420,"url":"https://github.com/rlgraph/rlgraph","last_synced_at":"2025-10-03T21:19:07.207Z","repository":{"id":132364079,"uuid":"132180122","full_name":"rlgraph/rlgraph","owner":"rlgraph","description":"RLgraph: Modular computation graphs for deep reinforcement learning","archived":false,"fork":false,"pushed_at":"2019-11-05T16:33:55.000Z","size":8244,"stargazers_count":319,"open_issues_count":23,"forks_count":40,"subscribers_count":21,"default_branch":"master","last_synced_at":"2025-03-18T14:50:21.956Z","etag":null,"topics":["deep-learning","deep-reinforcement-learning","dqn","machine-learning","neural-networks","ppo","pytorch","reinforcement-learning","tensorflow"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/rlgraph.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2018-05-04T19:19:19.000Z","updated_at":"2025-03-15T13:17:05.000Z","dependencies_parsed_at":"2024-01-12T17:34:29.315Z","dependency_job_id":null,"html_url":"https://github.com/rlgraph/rlgraph","commit_stats":null,"previous_names":[],"tags_count":26,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rlgraph%2Frlgraph","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rlgraph%2Frlgraph/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rlgraph%2Frlgraph/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rlgraph%2Frlgraph/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub
/owners/rlgraph","download_url":"https://codeload.github.com/rlgraph/rlgraph/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246647720,"owners_count":20811368,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["deep-learning","deep-reinforcement-learning","dqn","machine-learning","neural-networks","ppo","pytorch","reinforcement-learning","tensorflow"],"created_at":"2024-08-01T07:00:18.932Z","updated_at":"2025-10-03T21:19:07.099Z","avatar_url":"https://github.com/rlgraph.png","language":"Python","funding_links":[],"categories":["Libraries","时间序列"],"sub_categories":["网络服务_其他"],"readme":"[![PyPI version](https://badge.fury.io/py/rlgraph.svg)](https://badge.fury.io/py/rlgraph)\n[![Python 3.5](https://img.shields.io/badge/python-3.5-orange.svg)](https://www.python.org/downloads/release/python-356/)\n[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://github.com/rlgraph/rlgraph/blob/master/LICENSE)\n[![Documentation Status](https://readthedocs.org/projects/rlgraph/badge/?version=latest)](https://rlgraph.readthedocs.io/en/latest/?badge=latest)\n[![Build Status](https://travis-ci.org/rlgraph/rlgraph.svg?branch=master)](https://travis-ci.org/rlgraph/rlgraph)\n\n# RLgraph\nModular computation graphs for deep reinforcement learning.\n\nRLgraph is a framework to quickly prototype, define and execute reinforcement learning\nalgorithms both in research and practice. 
RLgraph differs from most other libraries in that it supports both\nTensorFlow (or static graphs in general) and eager/define-by-run execution (PyTorch) through\na single component interface. An introductory blog post can be found here: [link](https://rlgraph.github.io/rlgraph/2019/01/04/introducing-rlgraph.html).\n \nRLgraph exposes a well-defined API for using agents, and offers a novel component concept\nfor testing and assembling machine learning models. By separating graph definition, compilation and execution,\nmultiple distributed backends and device execution strategies can be accessed without modifying\nagent definitions. This makes it especially suited for a smooth transition from applied use case prototypes\nto large-scale distributed training.\n\nThe current state of RLgraph in version 0.4.0 is alpha. The core engine is substantially complete\nand works for TensorFlow and PyTorch (1.0). Distributed execution on Ray is exemplified via Distributed\nPrioritized Experience Replay (Ape-X), which also supports multi-GPU mode and solves e.g. Atari-Pong in ~1 hour\non a single node. Algorithms like Ape-X or PPO can be used with both PyTorch and TensorFlow. Distributed TensorFlow can\nbe tested via the IMPALA agent. 
Please create an issue to discuss improvements or contributions.\n \nRLgraph currently implements the following algorithms:\n\n- DQN - ```dqn_agent``` -  [paper](https://www.cs.toronto.edu/~vmnih/docs/dqn.pdf)\n- Double-DQN - ```dqn_agent``` - via ```double_dqn``` flag -  [paper](https://www.aaai.org/ocs/index.php/AAAI/AAAI16/paper/download/12389/11847)\n- Dueling-DQN - ```dqn_agent``` - via ```dueling_dqn``` flag -  [paper](https://arxiv.org/abs/1509.06461)\n- Prioritized experience replay - via ```memory_spec``` option ```prioritized_replay``` - [paper](https://arxiv.org/abs/1511.05952)\n- Deep-Q learning from demonstration ```dqfd_agent``` - [paper](https://arxiv.org/abs/1704.03732)\n- Distributed prioritized experience replay (Ape-X) on Ray - via `apex_executor` - [paper](https://arxiv.org/abs/1803.00933)\n- Importance-weighted actor-learner architecture (IMPALA) on distributed TF/Multi-threaded single-node - ```impala_agents``` - [paper](https://arxiv.org/abs/1802.01561)\n- Proximal policy optimization with generalized advantage estimation - ```ppo_agent``` - [paper](https://arxiv.org/abs/1707.06347)\n- Soft Actor-Critic / SAC ```sac_agent``` - [paper](https://arxiv.org/abs/1801.01290)\n- Simple actor-critic for REINFORCE/A2C/A3C ```actor_critic_agent``` - [paper](https://arxiv.org/abs/1602.01783)\n\nThe ```SingleThreadedWorker``` implements high-performance environment vectorisation, and a ```RayWorker``` can execute\nray actor tasks in conjunction with a ```RayExecutor```. The ```examples``` folder contains simple scripts to \ntest these agents. There is also a very extensive test package including tests for virtually every component. Note\nthat we run tests on TensorFlow and have not reached full coverage/test compatibility with PyTorch. 
\n\nFor more detailed documentation on RLgraph and its API reference, please visit\n[our readthedocs page here](https://rlgraph.readthedocs.io).\n\nBelow we show some training results on gym tasks:\n\n![Learning results](https://user-images.githubusercontent.com/14904111/53730353-fab97800-3e77-11e9-974c-69b21f1522d0.png)\n\n**Left:** Soft Actor Critic on Pendulum-v0 (10 seeds). **Right:** Multi-GPU Ape-X on Pong-v0 (10 seeds).\n\n\n## Install\n\nThe simplest way to install RLgraph is via pip:\n\n```pip install rlgraph```\n\nNote that some backends (e.g. ray) need additional dependencies (see setup.py).\nFor example, to install dependencies for the distributed backend ray, enter:\n\n```pip install rlgraph[ray]```\n\nTo successfully run tests, please also install OpenAI gym, e.g.\n\n```pip install gym[all]```\n\nUpon calling RLgraph, a config JSON is created under ~/.rlgraph/rlgraph.json\nwhich can be used to change backend settings. The current default stable\nbackend is TensorFlow (\"tf\"). The PyTorch backend (\"pytorch\") does not support\nall utilities available in TF yet. In particular, device handling for PyTorch is incomplete,\nand we will likely wait until a stable PyTorch 1.0 release in the coming weeks.\n\n### Quickstart / example usage\n\nWe provide an example script for training the Ape-X algorithm on ALE using Ray in the [examples](examples) folder.\n\nFirst, you'll have to ensure that Ray is used as the distributed backend. RLgraph checks the file\n`~/.rlgraph/rlgraph.json` for this configuration. 
You can use this command to\nconfigure RLgraph to use TensorFlow as the backend and Ray as the distributed backend:\n\n```bash\necho '{\"BACKEND\":\"tf\",\"DISTRIBUTED_BACKEND\":\"ray\"}' \u003e $HOME/.rlgraph/rlgraph.json\n```\n\nThen you can run our Ape-X example:\n\n```bash\n# Start ray on the head machine\nray start --head --redis-port 6379\n# Optionally join this cluster from other machines with ray start --redis-address=...\n\n# Run script\npython apex_pong.py\n```\n\nYou can also train a simple DQN agent locally on OpenAI gym environments such as CartPole (this doesn't require Ray).\nThe following example script also contains a simple tf-summary switch for adding neural net variables to\nyour tensorboard reports (specify, as a Perl-style regular expression, the components whose variables you would like to see):\n\n```bash\npython dqn_cartpole_with_tf_summaries.py\n```\n\n\n## Import and use agents\n\nAgents can be imported and used as follows:\n\n```python\nfrom rlgraph.agents import DQNAgent\nfrom rlgraph.environments import OpenAIGymEnv\n\nenvironment = OpenAIGymEnv('CartPole-v0')\n\n# Create from .json file or dict, see agent API for all\n# possible configuration parameters.\nagent = DQNAgent.from_file(\n  \"configs/dqn_cartpole.json\",\n  state_space=environment.state_space,\n  action_space=environment.action_space\n)\n\n# Get an action, take a step, observe reward.\nstate = environment.reset()\naction, preprocessed_state = agent.get_action(\n  states=state,\n  extra_returns=\"preprocessed_states\"\n)\n\n# Execute step in environment.\nnext_state, reward, terminal, info = environment.step(action)\n\n# Observe result.\nagent.observe(\n    preprocessed_states=preprocessed_state,\n    actions=action,\n    internals=[],\n    next_states=next_state,\n    rewards=reward,\n    terminals=terminal\n)\n\n# Call update when desired:\nloss = agent.update()\n```\n\nFull examples can be found in the examples folder.\n\n## Cite\n\nIf you use RLgraph in your research, please cite the 
following paper: [link](https://arxiv.org/abs/1810.09028)\n\n\n```\n@InProceedings{Schaarschmidt2019,\n  author    = {Schaarschmidt, Michael and Mika, Sven and Fricke, Kai and Yoneki, Eiko},\n  title     = {{RLgraph: Modular Computation Graphs for Deep Reinforcement Learning}},\n  booktitle = {{Proceedings of the 2nd Conference on Systems and Machine Learning (SysML)}},\n  year      = {2019},\n  month     = apr,\n}\n```","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frlgraph%2Frlgraph","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frlgraph%2Frlgraph","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frlgraph%2Frlgraph/lists"}