{"id":17132043,"url":"https://github.com/dayyass/rllib","last_synced_at":"2025-09-01T14:34:14.865Z","repository":{"id":44383523,"uuid":"508455163","full_name":"dayyass/rllib","owner":"dayyass","description":"Reinforcement Learning Library.","archived":false,"fork":false,"pushed_at":"2022-08-16T10:55:15.000Z","size":57,"stargazers_count":28,"open_issues_count":3,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-04-11T02:14:17.337Z","etag":null,"topics":["data-science","deep-learning","machine-learning","python","reinforcement-learning"],"latest_commit_sha":null,"homepage":"https://pypi.org/project/pytorch-rllib/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/dayyass.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2022-06-28T21:06:59.000Z","updated_at":"2025-03-24T22:44:21.000Z","dependencies_parsed_at":"2022-07-14T13:47:38.568Z","dependency_job_id":null,"html_url":"https://github.com/dayyass/rllib","commit_stats":null,"previous_names":[],"tags_count":3,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dayyass%2Frllib","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dayyass%2Frllib/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dayyass%2Frllib/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/dayyass%2Frllib/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/dayyass","download_url":"https://codeload.github.com/dayyass/rllib/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https
://github.com","kind":"github","repositories_count":248674674,"owners_count":21143760,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["data-science","deep-learning","machine-learning","python","reinforcement-learning"],"created_at":"2024-10-14T19:25:50.166Z","updated_at":"2025-04-13T06:32:13.080Z","avatar_url":"https://github.com/dayyass.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"[![tests](https://github.com/dayyass/rllib/actions/workflows/tests.yml/badge.svg)](https://github.com/dayyass/rllib/actions/workflows/tests.yml)\n[![linter](https://github.com/dayyass/rllib/actions/workflows/linter.yml/badge.svg)](https://github.com/dayyass/rllib/actions/workflows/linter.yml)\n[![codecov](https://codecov.io/gh/dayyass/rllib/branch/main/graph/badge.svg?token=45O5NRAD8G)](https://codecov.io/gh/dayyass/rllib)\n\n[![python 3.7](https://img.shields.io/badge/python-3.7-blue.svg)](https://github.com/dayyass/rllib#requirements)\n[![release (latest by date)](https://img.shields.io/github/v/release/dayyass/rllib)](https://github.com/dayyass/rllib/releases/latest)\n[![license](https://img.shields.io/github/license/dayyass/rllib?color=blue)](https://github.com/dayyass/rllib/blob/main/LICENSE)\n\n[![pre-commit](https://img.shields.io/badge/pre--commit-enabled-black)](https://github.com/dayyass/rllib/blob/main/.pre-commit-config.yaml)\n[![code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/psf/black)\n\n[![pypi 
version](https://img.shields.io/pypi/v/pytorch-rllib)](https://pypi.org/project/pytorch-rllib)\n[![pypi downloads](https://img.shields.io/pypi/dm/pytorch-rllib)](https://pypi.org/project/pytorch-rllib)\n\n# rllib\nReinforcement Learning Library\n\n## Installation\n```\npip install pytorch-rllib\n```\n\n## Usage\nImplemented agents:\n- [ ] CrossEntropy\n- [ ] Value / Policy Iteration\n- [x] Q-Learning\n- [x] Expected Value SARSA\n- [x] Approximate Q-Learning\n- [x] DQN\n- [ ] Rainbow\n- [ ] REINFORCE\n- [ ] A2C\n\n```python3\nimport gym\nimport numpy as np\nimport torch\n\nfrom rllib.qlearning import ApproximateQLearningAgent\nfrom rllib.trainer import TrainerTorch as Trainer\nfrom rllib.utils import set_global_seed\n\ndevice = torch.device(\"cuda\" if torch.cuda.is_available() else \"cpu\")\n\n# init environment\nenv = gym.make(\"CartPole-v0\")\nset_global_seed(seed=42, env=env)\n\nn_actions = env.action_space.n\nn_state = env.observation_space.shape[0]\n\n# init torch model\nmodel = torch.nn.Sequential()\nmodel.add_module(\"layer1\", torch.nn.Linear(n_state, 128))\nmodel.add_module(\"relu1\", torch.nn.ReLU())\nmodel.add_module(\"layer2\", torch.nn.Linear(128, 64))\nmodel.add_module(\"relu2\", torch.nn.ReLU())\nmodel.add_module(\"values\", torch.nn.Linear(64, n_actions))\nmodel = model.to(device)\n\n# init agent\nagent = ApproximateQLearningAgent(\n    model=model,\n    alpha=0.5,\n    epsilon=0.5,\n    discount=0.99,\n    n_actions=n_actions,\n)\n\n# train\noptimizer = torch.optim.Adam(model.parameters(), lr=1e-4)\n\ntrainer = Trainer(env=env)\n\ntrain_rewards = trainer.train(\n    agent=agent,\n    optimizer=optimizer,\n    n_epochs=20,\n    n_sessions=100,\n)\n\n# train results\nprint(f\"Mean train reward: {np.mean(train_rewards[-10:])}\")  # reward: 120.318\n\n# inference\ninference_reward = trainer.play_session(\n    agent=agent,\n    t_max=10**4,\n)\n\n# inference results\nprint(f\"Inference reward: {inference_reward}\")  # reward: 171.0\n```\n\nMore examples 
can be found [here](https://github.com/dayyass/rllib/tree/main/examples).\n\n## Requirements\nPython \u003e= 3.7\n\n## Citation\nIf you use **rllib** in a scientific publication, we would appreciate a citation of the following BibTeX entry:\n```bibtex\n@misc{dayyass2022rllib,\n    author       = {El-Ayyass, Dani},\n    title        = {Reinforcement Learning Library},\n    howpublished = {\\url{https://github.com/dayyass/rllib}},\n    year         = {2022}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdayyass%2Frllib","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdayyass%2Frllib","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdayyass%2Frllib/lists"}