{"id":13487119,"url":"https://github.com/tensorflow/agents","last_synced_at":"2025-05-15T00:06:31.964Z","repository":{"id":39862412,"uuid":"157936206","full_name":"tensorflow/agents","owner":"tensorflow","description":"TF-Agents: A reliable, scalable and easy to use TensorFlow library for Contextual Bandits and Reinforcement Learning.","archived":false,"fork":false,"pushed_at":"2025-04-30T16:58:13.000Z","size":13520,"stargazers_count":2902,"open_issues_count":206,"forks_count":735,"subscribers_count":77,"default_branch":"master","last_synced_at":"2025-05-07T23:41:34.805Z","etag":null,"topics":["bandits","contextual-bandits","dqn","multi-armed-bandits","reinforcement-learning","rl-algorithms","tensorflow","tf-agents"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/tensorflow.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2018-11-17T00:29:12.000Z","updated_at":"2025-04-30T16:58:17.000Z","dependencies_parsed_at":"2023-10-12T05:16:29.170Z","dependency_job_id":"f131b731-bf62-4059-811b-79db7b9593c6","html_url":"https://github.com/tensorflow/agents","commit_stats":{"total_commits":2172,"total_committers":156,"mean_commits":"13.923076923076923","dds":0.8342541436464088,"last_synced_commit":"a450b4bf33d77fcc31d89194579a7969077aa130"},"previous_names":[],"tags_count":17,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tensorflow%2Fagents","tags_url":"https://repo
s.ecosyste.ms/api/v1/hosts/GitHub/repositories/tensorflow%2Fagents/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tensorflow%2Fagents/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tensorflow%2Fagents/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/tensorflow","download_url":"https://codeload.github.com/tensorflow/agents/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254249199,"owners_count":22039029,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bandits","contextual-bandits","dqn","multi-armed-bandits","reinforcement-learning","rl-algorithms","tensorflow","tf-agents"],"created_at":"2024-07-31T18:00:55.595Z","updated_at":"2025-05-15T00:06:26.955Z","avatar_url":"https://github.com/tensorflow.png","language":"Python","readme":"# TF-Agents: A reliable, scalable and easy to use TensorFlow library for Contextual Bandits and Reinforcement Learning.\n\n[![PyPI tf-agents](https://badge.fury.io/py/tf-agents.svg)](https://badge.fury.io/py/tf-agents)\n![PyPI - Python Version](https://img.shields.io/pypi/pyversions/tf-agents)\n\n[TF-Agents](https://github.com/tensorflow/agents) makes implementing, deploying,\nand testing new Bandits and RL algorithms easier. It provides well tested and\nmodular components that can be modified and extended. It enables fast code\niteration, with good test integration and benchmarking.\n\nTo get started, we recommend checking out one of our Colab tutorials. 
If you\nneed an intro to RL (or a quick recap),\n[start here](docs/tutorials/0_intro_rl.ipynb). Otherwise, check out our\n[DQN tutorial](docs/tutorials/1_dqn_tutorial.ipynb) to get an agent up and\nrunning in the Cartpole environment. API documentation for the current stable\nrelease is on\n[tensorflow.org](https://www.tensorflow.org/agents/api_docs/python/tf_agents).\n\nTF-Agents is under active development and interfaces may change at any time.\nFeedback and comments are welcome.\n\n## Table of contents\n\n\u003ca href='#Agents'\u003eAgents\u003c/a\u003e\u003cbr\u003e\n\u003ca href='#Tutorials'\u003eTutorials\u003c/a\u003e\u003cbr\u003e\n\u003ca href='#Multi-Armed Bandits'\u003eMulti-Armed Bandits\u003c/a\u003e\u003cbr\u003e\n\u003ca href='#Examples'\u003eExamples\u003c/a\u003e\u003cbr\u003e\n\u003ca href='#Installation'\u003eInstallation\u003c/a\u003e\u003cbr\u003e\n\u003ca href='#Contributing'\u003eContributing\u003c/a\u003e\u003cbr\u003e\n\u003ca href='#Releases'\u003eReleases\u003c/a\u003e\u003cbr\u003e\n\u003ca href='#Principles'\u003ePrinciples\u003c/a\u003e\u003cbr\u003e\n\u003ca href='#Contributors'\u003eContributors\u003c/a\u003e\u003cbr\u003e\n\u003ca href='#Citation'\u003eCitation\u003c/a\u003e\u003cbr\u003e\n\u003ca href='#Disclaimer'\u003eDisclaimer\u003c/a\u003e\u003cbr\u003e\n\n\u003ca id='Agents'\u003e\u003c/a\u003e\n\n## Agents\n\nIn TF-Agents, the core elements of RL algorithms are implemented as `Agents`. 
An\nagent encompasses two main responsibilities: defining a Policy to interact with\nthe Environment, and learning/training that Policy from collected experience.\n\nCurrently, the following algorithms are available under TF-Agents:\n\n*   [DQN: __Human level control through deep reinforcement learning__ Mnih et\n    al., 2015](https://deepmind.com/research/dqn/)\n*   [DDQN: __Deep Reinforcement Learning with Double Q-learning__ Hasselt et\n    al., 2015](https://arxiv.org/abs/1509.06461)\n*   [DDPG: __Continuous control with deep reinforcement learning__ Lillicrap et\n    al., 2015](https://arxiv.org/abs/1509.02971)\n*   [TD3: __Addressing Function Approximation Error in Actor-Critic Methods__\n    Fujimoto et al., 2018](https://arxiv.org/abs/1802.09477)\n*   [REINFORCE: __Simple Statistical Gradient-Following Algorithms for\n    Connectionist Reinforcement Learning__ Williams,\n    1992](https://www-anw.cs.umass.edu/~barto/courses/cs687/williams92simple.pdf)\n*   [PPO: __Proximal Policy Optimization Algorithms__ Schulman et al., 2017](https://arxiv.org/abs/1707.06347)\n*   [SAC: __Soft Actor Critic__ Haarnoja et al., 2018](https://arxiv.org/abs/1812.05905)\n\n\u003ca id='Tutorials'\u003e\u003c/a\u003e\n\n## Tutorials\n\nSee [`docs/tutorials/`](docs/tutorials) for tutorials on the major components\nprovided.\n\n\u003ca id='Multi-Armed Bandits'\u003e\u003c/a\u003e\n\n## Multi-Armed Bandits\n\nThe TF-Agents library contains a comprehensive Multi-Armed Bandits suite,\nincluding Bandits environments and agents. RL agents can also be used on Bandit\nenvironments. 
There is a tutorial in\n[`bandits_tutorial.ipynb`](https://github.com/tensorflow/agents/tree/master/docs/tutorials/bandits_tutorial.ipynb)\nand ready-to-run examples in\n[`tf_agents/bandits/agents/examples/v2`](https://github.com/tensorflow/agents/tree/master/tf_agents/bandits/agents/examples/v2).\n\n\u003ca id='Examples'\u003e\u003c/a\u003e\n\n## Examples\n\nEnd-to-end examples that train agents can be found under each agent directory,\ne.g.:\n\n*   DQN:\n    [`tf_agents/agents/dqn/examples/v2/train_eval.py`](https://github.com/tensorflow/agents/tree/master/tf_agents/agents/dqn/examples/v2/train_eval.py)\n\n\u003ca id='Installation'\u003e\u003c/a\u003e\n\n## Installation\n\nTF-Agents publishes nightly and stable builds. For a list of releases, read the\n\u003ca href='#Releases'\u003eReleases\u003c/a\u003e section. The commands below cover installing\nTF-Agents stable and nightly from [pypi.org](https://pypi.org) as well as from a\nGitHub clone.\n\n\u003e :warning: If using Reverb (replay buffer), which is very common,\nTF-Agents will only work on Linux.\n\n\u003e Note: Python 3.11 requires pygame 2.1.3+.\n\n### Stable\n\nRun the commands below to install the most recent stable release. 
API\ndocumentation for the release is on\n[tensorflow.org](https://www.tensorflow.org/agents/api_docs/python/tf_agents).\n\n```shell\n$ pip install --user tf-agents[reverb]\n\n# Use keras-2\n$ export TF_USE_LEGACY_KERAS=1\n# Use this tag to get the matching examples and colabs.\n$ git clone https://github.com/tensorflow/agents.git\n$ cd agents\n$ git checkout v0.18.0\n```\n\nIf you want to install TF-Agents with versions of TensorFlow or\n[Reverb](https://github.com/deepmind/reverb) that are flagged as not compatible\nby the pip dependency check, use the pattern below at your own risk.\n\n```shell\n$ pip install --user tensorflow\n$ pip install --user tf-keras\n$ pip install --user dm-reverb\n$ pip install --user tf-agents\n```\n\nIf you want to use TF-Agents with TensorFlow 1.15 or 2.0, install version 0.3.0:\n\n```shell\n# Newer versions of tensorflow-probability require newer versions of TensorFlow.\n$ pip install tensorflow-probability==0.8.0\n$ pip install tf-agents==0.3.0\n```\n\n### Nightly\n\nNightly builds include newer features, but may be less stable than the versioned\nreleases. The nightly build is pushed as `tf-agents-nightly`. 
We suggest\ninstalling nightly versions of TensorFlow (`tf-nightly`) and TensorFlow\nProbability (`tfp-nightly`), as those are the versions TF-Agents nightly is\ntested against.\n\nTo install the nightly build version, run the following:\n\n```shell\n# Use keras-2\n$ export TF_USE_LEGACY_KERAS=1\n\n# `--force-reinstall` helps guarantee the right versions.\n$ pip install --user --force-reinstall tf-nightly\n$ pip install --user --force-reinstall tf-keras-nightly\n$ pip install --user --force-reinstall tfp-nightly\n$ pip install --user --force-reinstall dm-reverb-nightly\n\n# Installing with the `--upgrade` flag ensures you'll get the latest version.\n$ pip install --user --upgrade tf-agents-nightly\n```\n\n### From GitHub\n\nAfter cloning the repository, the dependencies can be installed by running `pip\ninstall -e .[tests]`. TensorFlow needs to be installed independently: `pip\ninstall --user tf-nightly`.\n\n\u003ca id='Contributing'\u003e\u003c/a\u003e\n\n## Contributing\n\nWe're eager to collaborate with you! See [`CONTRIBUTING.md`](CONTRIBUTING.md)\nfor a guide on how to contribute. This project adheres to TensorFlow's\n[code of conduct](CODE_OF_CONDUCT.md). By participating, you are expected to\nuphold this code.\n\n\u003ca id='Releases'\u003e\u003c/a\u003e\n\n## Releases\n\nTF-Agents has stable and nightly releases. The nightly releases are often fine\nbut can have issues due to upstream libraries being in flux. 
The table below\nlists the version(s) of TensorFlow that align with each TF-Agents release.\nRelease versions of interest:\n\n  * 0.19.0 supports tensorflow-2.15.0.\n  * 0.18.0 dropped Python 3.8 support.\n  * 0.16.0 is the first version to support Python 3.11.\n  * 0.15.0 is the last release compatible with Python 3.7.\n  * If using numpy \u003c 1.19, then use TF-Agents 0.15.0 or earlier.\n  * 0.9.0 is the last release compatible with Python 3.6.\n  * 0.3.0 is the last release compatible with Python 2.x.\n\nRelease | Branch / Tag                                               | TensorFlow Version | dm-reverb Version\n------- | ---------------------------------------------------------- | ------------------ | -----------\nNightly | [master](https://github.com/tensorflow/agents)             | tf-nightly         | dm-reverb-nightly\n0.19.0  | [v0.19.0](https://github.com/tensorflow/agents/tree/v0.19.0) | 2.15.0           | 0.14.0\n0.18.0  | [v0.18.0](https://github.com/tensorflow/agents/tree/v0.18.0) | 2.14.0           | 0.13.0\n0.17.0  | [v0.17.0](https://github.com/tensorflow/agents/tree/v0.17.0) | 2.13.0           | 0.12.0\n0.16.0  | [v0.16.0](https://github.com/tensorflow/agents/tree/v0.16.0) | 2.12.0           | 0.11.0\n0.15.0  | [v0.15.0](https://github.com/tensorflow/agents/tree/v0.15.0) | 2.11.0           | 0.10.0\n0.14.0  | [v0.14.0](https://github.com/tensorflow/agents/tree/v0.14.0) | 2.10.0           | 0.9.0\n0.13.0  | [v0.13.0](https://github.com/tensorflow/agents/tree/v0.13.0) | 2.9.0            | 0.8.0\n0.12.0  | [v0.12.0](https://github.com/tensorflow/agents/tree/v0.12.0) | 2.8.0            | 0.7.0\n0.11.0  | [v0.11.0](https://github.com/tensorflow/agents/tree/v0.11.0) | 2.7.0            | 0.6.0\n0.10.0  | [v0.10.0](https://github.com/tensorflow/agents/tree/v0.10.0) | 2.6.0            |\n0.9.0   | [v0.9.0](https://github.com/tensorflow/agents/tree/v0.9.0) | 2.6.0              |\n0.8.0   | [v0.8.0](https://github.com/tensorflow/agents/tree/v0.8.0) | 
2.5.0              |\n0.7.1   | [v0.7.1](https://github.com/tensorflow/agents/tree/v0.7.1) | 2.4.0              |\n0.6.0   | [v0.6.0](https://github.com/tensorflow/agents/tree/v0.6.0) | 2.3.0              |\n0.5.0   | [v0.5.0](https://github.com/tensorflow/agents/tree/v0.5.0) | 2.2.0              |\n0.4.0   | [v0.4.0](https://github.com/tensorflow/agents/tree/v0.4.0) | 2.1.0              |\n0.3.0   | [v0.3.0](https://github.com/tensorflow/agents/tree/v0.3.0) | 1.15.0 and 2.0.0   |\n\n\u003ca id='Principles'\u003e\u003c/a\u003e\n\n## Principles\n\nThis project adheres to [Google's AI principles](PRINCIPLES.md). By\nparticipating in, using, or contributing to this project, you are expected to adhere\nto these principles.\n\n\n\u003ca id='Contributors'\u003e\u003c/a\u003e\n\n## Contributors\n\n\nWe would like to recognize the following individuals for their code\ncontributions, discussions, and other work on the TF-Agents library.\n\n* James Davidson\n* Ethan Holly\n* Toby Boyd\n* Summer Yue\n* Robert Ormandi\n* Kuang-Huei Lee\n* Alexa Greenberg\n* Amir Yazdanbakhsh\n* Yao Lu\n* Gaurav Jain\n* Christof Angermueller\n* Mark Daoust\n* Adam Wood\n\n\n\u003ca id='Citation'\u003e\u003c/a\u003e\n\n## Citation\n\nIf you use this code, please cite it as:\n\n```\n@misc{TFAgents,\n  title = {{TF-Agents}: A library for Reinforcement Learning in TensorFlow},\n  author = {Sergio Guadarrama and Anoop Korattikara and Oscar Ramirez and\n     Pablo Castro and Ethan Holly and Sam Fishman and Ke Wang and\n     Ekaterina Gonina and Neal Wu and Efi Kokiopoulou and Luciano Sbaiz and\n     Jamie Smith and Gábor Bartók and Jesse Berent and Chris Harris and\n     Vincent Vanhoucke and Eugene Brevdo},\n  howpublished = {\\url{https://github.com/tensorflow/agents}},\n  url = \"https://github.com/tensorflow/agents\",\n  year = 2018,\n  note = \"[Online; accessed 25-June-2019]\"\n}\n```\n\n\u003ca id='Disclaimer'\u003e\u003c/a\u003e\n\n## Disclaimer\n\nThis is not an official Google 
product.\n","funding_links":[],"categories":["The Data Science Toolbox","Uncategorized","Reinforcement Learning (RL) and Deep Reinforcement Learning (DRL)","Python","Libraries","Reinforcement Learning","Sensor Processing","时间序列","TensorFlow Models","Industry Strength Reinforcement Learning","强化学习","Technologies"],"sub_categories":["Deep Learning Packages","Uncategorized","RL/DRL Algorithm Implementations and Software Frameworks","Others","Machine Learning","网络服务_其他","Reinforcement Learning","NLP"],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftensorflow%2Fagents","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftensorflow%2Fagents","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftensorflow%2Fagents/lists"}