{"id":13400393,"url":"https://github.com/tensorforce/tensorforce","last_synced_at":"2025-04-09T01:21:29.510Z","repository":{"id":38539111,"uuid":"85491050","full_name":"tensorforce/tensorforce","owner":"tensorforce","description":"Tensorforce: a TensorFlow library for applied reinforcement learning","archived":false,"fork":false,"pushed_at":"2024-07-31T20:26:54.000Z","size":28753,"stargazers_count":3298,"open_issues_count":43,"forks_count":530,"subscribers_count":141,"default_branch":"master","last_synced_at":"2024-10-29T15:02:49.742Z","etag":null,"topics":["control","deep-reinforcement-learning","reinforcement-learning","system-control","tensorflow","tensorflow-library","tensorforce"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/tensorforce.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":".github/FUNDING.yml","license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":"ROADMAP.md","authors":null,"dei":null,"publiccode":null,"codemeta":null},"funding":{"github":"AlexKuhnle","patreon":null,"open_collective":null,"ko_fi":null,"tidelift":null,"community_bridge":null,"liberapay":"TensorforceTeam","issuehunt":null,"otechie":null,"custom":null}},"created_at":"2017-03-19T16:24:22.000Z","updated_at":"2024-10-16T03:44:21.000Z","dependencies_parsed_at":"2023-11-09T13:05:49.273Z","dependency_job_id":"ed9feada-c030-40a6-9cf7-1ab2f5725756","html_url":"https://github.com/tensorforce/tensorforce","commit_stats":{"total_commits":1818,"total_committers":88,"mean_commits":20.65909090909091,"dds":0.6787678767876788,"last_synced_commit":"1bf4c3abb471062fb66f9fe52852437756fd527b"},"previous_names":["reinforceio/ten
sorforce"],"tags_count":26,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tensorforce%2Ftensorforce","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tensorforce%2Ftensorforce/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tensorforce%2Ftensorforce/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tensorforce%2Ftensorforce/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/tensorforce","download_url":"https://codeload.github.com/tensorforce/tensorforce/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246730344,"owners_count":20824399,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["control","deep-reinforcement-learning","reinforcement-learning","system-control","tensorflow","tensorflow-library","tensorforce"],"created_at":"2024-07-30T19:00:51.543Z","updated_at":"2025-04-02T00:17:47.197Z","avatar_url":"https://github.com/tensorforce.png","language":"Python","readme":"# Tensorforce: a TensorFlow library for applied reinforcement learning\n\n[![Docs](https://readthedocs.org/projects/tensorforce/badge)](http://tensorforce.readthedocs.io/en/latest/)\n[![Gitter](https://badges.gitter.im/tensorforce/community.svg)](https://gitter.im/tensorforce/community)\n[![Build Status](https://travis-ci.com/tensorforce/tensorforce.svg?branch=master)](https://travis-ci.com/tensorforce/tensorforce)\n[![pypi 
version](https://img.shields.io/pypi/v/tensorforce)](https://pypi.org/project/Tensorforce/)\n[![python version](https://img.shields.io/pypi/pyversions/tensorforce)](https://pypi.org/project/Tensorforce/)\n[![License](https://img.shields.io/badge/license-Apache%202.0-blue.svg)](https://github.com/tensorforce/tensorforce/blob/master/LICENSE)\n[![Donate](https://img.shields.io/badge/donate-GitHub_Sponsors-yellow)](https://github.com/sponsors/AlexKuhnle)\n[![Donate](https://img.shields.io/badge/donate-Liberapay-yellow)](https://liberapay.com/TensorforceTeam/donate)\n\n\n**This project is not maintained any longer!**\n\n\n#### Introduction\n\nTensorforce is an open-source deep reinforcement learning framework, with an emphasis on modularized flexible library design and straightforward usability for applications in research and practice. Tensorforce is built on top of [Google's TensorFlow framework](https://www.tensorflow.org/) and requires Python 3.\n\nTensorforce follows a set of high-level design choices which differentiate it from other similar libraries:\n\n- **Modular component-based design**: Feature implementations, above all, strive to be as generally applicable and configurable as possible, potentially at some cost of faithfully resembling details of the introducing paper.\n- **Separation of RL algorithm and application**: Algorithms are agnostic to the type and structure of inputs (states/observations) and outputs (actions/decisions), as well as the interaction with the application environment.\n- **Full-on TensorFlow models**: The entire reinforcement learning logic, including control flow, is implemented in TensorFlow, to enable portable computation graphs independent of application programming language, and to facilitate the deployment of models.\n\n\n\n#### Quicklinks\n\n- [Documentation](http://tensorforce.readthedocs.io) and [update notes](https://github.com/tensorforce/tensorforce/blob/master/UPDATE_NOTES.md)\n- 
[Contact](mailto:tensorforce.team@gmail.com) and [Gitter channel](https://gitter.im/tensorforce/community)\n- [Benchmarks](https://github.com/tensorforce/tensorforce/blob/master/benchmarks) and [projects using Tensorforce](https://github.com/tensorforce/tensorforce/blob/master/PROJECTS.md)\n- [Roadmap](https://github.com/tensorforce/tensorforce/blob/master/ROADMAP.md) and [contribution guidelines](https://github.com/tensorforce/tensorforce/blob/master/CONTRIBUTING.md)\n- [GitHub Sponsors](https://github.com/sponsors/AlexKuhnle) and [Liberapay](https://liberapay.com/TensorforceTeam/donate)\n\n\n\n#### Table of contents\n\n- [Installation](#installation)\n- [Quickstart example code](#quickstart-example-code)\n- [Command line usage](#command-line-usage)\n- [Features](#features)\n- [Environment adapters](#environment-adapters)\n- [Support, feedback and donating](#support-feedback-and-donating)\n- [Core team and contributors](#core-team-and-contributors)\n- [Cite Tensorforce](#cite-tensorforce)\n\n\n\n## Installation\n\nA stable version of Tensorforce is periodically updated on PyPI and installed as follows:\n\n```bash\npip3 install tensorforce\n```\n\nTo always use the latest version of Tensorforce, install the GitHub version instead:\n\n```bash\ngit clone https://github.com/tensorforce/tensorforce.git\npip3 install -e tensorforce\n```\n\n**Note on installation on M1 Macs:** At the moment TensorFlow, which is a core dependency of Tensorforce, cannot be installed on M1 Macs directly. Follow the [\"M1 Macs\" section](https://tensorforce.readthedocs.io/en/latest/basics/installation.html) in the documentation for a workaround.\n\nEnvironments require additional packages for which there are setup options available (`ale`, `gym`, `retro`, `vizdoom`, `carla`; or `envs` for all environments); however, some require additional tools to be installed separately (see [environments documentation](http://tensorforce.readthedocs.io)). 
Other setup options include `tfa` for [TensorFlow Addons](https://www.tensorflow.org/addons) and `tune` for [HpBandSter](https://github.com/automl/HpBandSter) required for the `tune.py` script.\n\n**Note on GPU usage:** Different from (un)supervised deep learning, RL does not always benefit from running on a GPU, depending on environment and agent configuration. In particular for environments with low-dimensional state spaces (i.e., no images), it is hence worth trying to run on CPU only.\n\n\n\n## Quickstart example code\n\n```python\nfrom tensorforce import Agent, Environment\n\n# Pre-defined or custom environment\nenvironment = Environment.create(\n    environment='gym', level='CartPole', max_episode_timesteps=500\n)\n\n# Instantiate a Tensorforce agent\nagent = Agent.create(\n    agent='tensorforce',\n    environment=environment,  # alternatively: states, actions, (max_episode_timesteps)\n    memory=10000,\n    update=dict(unit='timesteps', batch_size=64),\n    optimizer=dict(type='adam', learning_rate=3e-4),\n    policy=dict(network='auto'),\n    objective='policy_gradient',\n    reward_estimation=dict(horizon=20)\n)\n\n# Train for 300 episodes\nfor _ in range(300):\n\n    # Initialize episode\n    states = environment.reset()\n    terminal = False\n\n    while not terminal:\n        # Episode timestep\n        actions = agent.act(states=states)\n        states, terminal, reward = environment.execute(actions=actions)\n        agent.observe(terminal=terminal, reward=reward)\n\nagent.close()\nenvironment.close()\n```\n\n\n\n## Command line usage\n\nTensorforce comes with a range of [example configurations](https://github.com/tensorforce/tensorforce/tree/master/benchmarks/configs) for different popular reinforcement learning environments. 
For instance, to run Tensorforce's implementation of the popular [Proximal Policy Optimization (PPO) algorithm](https://arxiv.org/abs/1707.06347) on the [OpenAI Gym CartPole environment](https://gym.openai.com/envs/CartPole-v1/), execute the following line:\n\n```bash\npython3 run.py --agent benchmarks/configs/ppo.json --environment gym \\\n    --level CartPole-v1 --episodes 100\n```\n\nFor more information check out the [documentation](http://tensorforce.readthedocs.io).\n\n\n\n## Features\n\n- **Network layers**: Fully-connected, 1- and 2-dimensional convolutions, embeddings, pooling, RNNs, dropout, normalization, and more; *plus* support of Keras layers.\n- **Network architecture**: Support for multi-state inputs and layer (block) reuse, simple definition of directed acyclic graph structures via register/retrieve layer, plus support for arbitrary architectures.\n- **Memory types**: Simple batch buffer memory, random replay memory.\n- **Policy distributions**: Bernoulli distribution for boolean actions, categorical distribution for (finite) integer actions, Gaussian distribution for continuous actions, Beta distribution for range-constrained continuous actions, multi-action support.\n- **Reward estimation**: Configuration options for estimation horizon, future reward discount, state/state-action/advantage estimation, and for whether to consider terminal and horizon states.\n- **Training objectives**: (Deterministic) policy gradient, state-(action-)value approximation.\n- **Optimization algorithms**: Various gradient-based optimizers provided by TensorFlow like Adam/AdaDelta/RMSProp/etc, evolutionary optimizer, natural-gradient-based optimizer, plus a range of meta-optimizers.\n- **Exploration**: Randomized actions, sampling temperature, variable noise.\n- **Preprocessing**: Clipping, deltafier, sequence, image processing.\n- **Regularization**: L2 and entropy regularization.\n- **Execution modes**: Parallelized execution of multiple environments based on Python's 
`multiprocessing` and `socket`.\n- **Optimized act-only SavedModel extraction**.\n- **TensorBoard support**.\n\nBy combining these modular components in different ways, a variety of popular deep reinforcement learning models/features can be replicated:\n\n- Q-learning: [Deep Q-learning](https://www.nature.com/articles/nature14236), [Double-DQN](https://arxiv.org/abs/1509.06461), [Dueling DQN](https://arxiv.org/abs/1511.06581), [n-step DQN](https://arxiv.org/abs/1602.01783), [Normalised Advantage Function (NAF)](https://arxiv.org/abs/1603.00748)\n- Policy gradient: [vanilla policy-gradient / REINFORCE](http://www-anw.cs.umass.edu/~barto/courses/cs687/williams92simple.pdf), [Actor-critic and A3C](https://arxiv.org/abs/1602.01783), [Proximal Policy Optimization](https://arxiv.org/abs/1707.06347), [Trust Region Policy Optimization](https://arxiv.org/abs/1502.05477), [Deterministic Policy Gradient](https://arxiv.org/abs/1509.02971)\n\nNote that, in general, the replication is not 100% faithful: the models as described in the corresponding papers often involve additional minor tweaks and modifications which are hard to support in a modular design (and which are, arguably, neither important nor desirable to support). 
On the upside, these models are just a few examples from the multitude of module combinations supported by Tensorforce.\n\n\n\n## Environment adapters\n\n- [Arcade Learning Environment](https://github.com/mgbellemare/Arcade-Learning-Environment), a simple object-oriented framework that allows researchers and hobbyists to develop AI agents for Atari 2600 games.\n- [CARLA](https://github.com/carla-simulator/carla), an open-source simulator for autonomous driving research.\n- [OpenAI Gym](https://gym.openai.com/), a toolkit for developing and comparing reinforcement learning algorithms which supports teaching agents everything from walking to playing games like Pong or Pinball.\n- [OpenAI Retro](https://github.com/openai/retro), which lets you turn classic video games into Gym environments for reinforcement learning and comes with integrations for ~1000 games.\n- [OpenSim](http://osim-rl.stanford.edu/), reinforcement learning with musculoskeletal models.\n- [PyGame Learning Environment](https://github.com/ntasfi/PyGame-Learning-Environment/), a learning environment which allows a quick start to reinforcement learning in Python.\n- [ViZDoom](https://github.com/mwydmuch/ViZDoom), which allows developing AI bots that play Doom using only visual information.\n\n\n## Support, feedback and donating\n\nPlease get in touch via [mail](mailto:tensorforce.team@gmail.com) or on [Gitter](https://gitter.im/tensorforce/community) if you have questions, feedback, ideas for features/collaboration, or if you seek support for applying Tensorforce to your problem.\n\nIf you want to support the Tensorforce core team (see below), please also consider donating: [GitHub Sponsors](https://github.com/sponsors/AlexKuhnle) or [Liberapay](https://liberapay.com/TensorforceTeam/donate).\n\n\n\n## Core team and contributors\n\nTensorforce is currently developed and maintained by [Alexander Kuhnle](https://github.com/AlexKuhnle).\n\nEarlier versions of Tensorforce (\u003c= 0.4.2) were developed by [Michael 
Schaarschmidt](https://github.com/michaelschaarschmidt), [Alexander Kuhnle](https://github.com/AlexKuhnle) and [Kai Fricke](https://github.com/krfricke).\n\nThe advanced parallel execution functionality was originally contributed by Jean Rabault (@jerabaul29) and Vincent Belus (@vbelus). Moreover, the pretraining feature was largely developed in collaboration with Hongwei Tang (@thw1021) and Jean Rabault (@jerabaul29).\n\nThe CARLA environment wrapper is currently developed by Luca Anzalone (@luca96).\n\nWe are very grateful for our open-source contributors (listed according to Github, updated periodically):\n\nIslandman93, sven1977, Mazecreator, wassname, lefnire, daggertye, trickmeyer, mkempers,\nmryellow, ImpulseAdventure,\njanislavjankov, andrewekhalel,\nHassamSheikh, skervim,\nbeflix, coord-e,\nbenelot, tms1337, vwxyzjn, erniejunior,\nDeathn0t, petrbel, nrhodes, batu, yellowbee686, tgianko,\nAdamStelmaszczyk, BorisSchaeling, christianhidber, Davidnet, ekerazha, gitter-badger, kborozdin, Kismuz, mannsi, milesmcc, nagachika, neitzal, ngoodger, perara, sohakes, tomhennigan.\n\n\n\n## Cite Tensorforce\n\nPlease cite the framework as follows:\n\n```\n@misc{tensorforce,\n  author       = {Kuhnle, Alexander and Schaarschmidt, Michael and Fricke, Kai},\n  title        = {Tensorforce: a TensorFlow library for applied reinforcement learning},\n  howpublished = {Web page},\n  url          = {https://github.com/tensorforce/tensorforce},\n  year         = {2017}\n}\n```\n\nIf you use the [parallel execution functionality](https://github.com/tensorforce/tensorforce/tree/master/tensorforce/contrib), please additionally cite it as follows:\n\n```\n@article{rabault2019accelerating,\n  title        = {Accelerating deep reinforcement learning strategies of flow control through a multi-environment approach},\n  author       = {Rabault, Jean and Kuhnle, Alexander},\n  journal      = {Physics of Fluids},\n  volume       = {31},\n  number       = {9},\n  pages        = {094105},\n  
year         = {2019},\n  publisher    = {AIP Publishing}\n}\n```\n\nIf you use Tensorforce in your research, you may additionally consider citing the following paper:\n\n```\n@article{lift-tensorforce,\n  author       = {Schaarschmidt, Michael and Kuhnle, Alexander and Ellis, Ben and Fricke, Kai and Gessert, Felix and Yoneki, Eiko},\n  title        = {{LIFT}: Reinforcement Learning in Computer Systems by Learning From Demonstrations},\n  journal      = {CoRR},\n  volume       = {abs/1808.07903},\n  year         = {2018},\n  url          = {http://arxiv.org/abs/1808.07903},\n  archivePrefix = {arXiv},\n  eprint       = {1808.07903}\n}\n```\n","funding_links":["https://github.com/sponsors/AlexKuhnle","https://liberapay.com/TensorforceTeam","https://liberapay.com/TensorforceTeam/donate"],"categories":["The Data Science Toolbox","Uncategorized","Libraries","Sensor Processing","Python (144)","Reinforcement Learning","Python","强化学习","TensorFlow Models"],"sub_categories":["Deep Learning Packages","Uncategorized","Machine Learning","Others","Frameworks and Libraries","Reinforcement Learning"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftensorforce%2Ftensorforce","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftensorforce%2Ftensorforce","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftensorforce%2Ftensorforce/lists"}