{"id":24194885,"url":"https://github.com/willclarktech/policy-gradient-implementations","last_synced_at":"2026-02-19T06:32:00.753Z","repository":{"id":37216589,"uuid":"252439207","full_name":"willclarktech/policy-gradient-implementations","owner":"willclarktech","description":"Implementing reinforcement learning algorithms based on policy gradients.","archived":false,"fork":false,"pushed_at":"2024-12-20T19:11:35.000Z","size":29575,"stargazers_count":1,"open_issues_count":0,"forks_count":1,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-10-18T07:06:41.530Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/willclarktech.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-04-02T11:42:47.000Z","updated_at":"2024-12-20T19:11:38.000Z","dependencies_parsed_at":"2025-01-13T18:54:16.469Z","dependency_job_id":null,"html_url":"https://github.com/willclarktech/policy-gradient-implementations","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/willclarktech/policy-gradient-implementations","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/willclarktech%2Fpolicy-gradient-implementations","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/willclarktech%2Fpolicy-gradient-implementations/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/willclarktech%2Fpolicy-gradient-implementations/releases","manife
sts_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/willclarktech%2Fpolicy-gradient-implementations/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/willclarktech","download_url":"https://codeload.github.com/willclarktech/policy-gradient-implementations/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/willclarktech%2Fpolicy-gradient-implementations/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29604789,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-19T05:11:50.834Z","status":"ssl_error","status_checked_at":"2026-02-19T05:11:38.921Z","response_time":117,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-01-13T18:39:58.953Z","updated_at":"2026-02-19T06:32:00.735Z","avatar_url":"https://github.com/willclarktech.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Policy Gradient Implementations\n\nImplementing reinforcement learning algorithms based on policy gradients.\n\n## Prerequisites\n\n-   Python3.6\n\n## Installation\n\nUsing poetry:\n\n```sh\npoetry install\n```\n\nUsing pip:\n\n```sh\npip install -r requirements.txt\n```\n\nUsing pip to install development dependencies too:\n\n```sh\npip install -r 
requirements.dev.txt\n```\n\nOn Google Colab to avoid conflicts with preinstalled packages:\n\n```sh\npip install -r requirements.colab.txt\n```\n\n## Running experiments\n\n### CLI\n\nAn executable is provided in `./bin`. From the root directory run:\n\n```sh\n./bin/policy_gradients \u003calgorithm\u003e\n```\n\nTo see the full list of options, including available algorithms:\n\n```sh\n./bin/policy_gradients --help\n```\n\nSeveral pre-trained models are provided in `./models`. For example, to view a pre-trained SAC agent operate in the `InvertedPendulumBulletEnv-v0` environment you can run:\n\n```sh\n./bin/policy_gradients sac -n 1 --env InvertedPendulumBulletEnv-v0 --eval --render --load_dir ./models\n```\n\n### Programmatic API\n\nUse the exposed `run` function with an options dictionary. This will be combined with a set of default hyperparameters for the relevant algorithm. For example:\n\n```py\nimport policy_gradients\n\npolicy_gradients.run({\n    \"algorithm\": \"sac\",\n    \"env_name\": \"LunarLanderContinuous-v2\",\n    \"n_episodes\": 250,\n    \"log_period\": 10,\n    \"save_dir\": \"./models\",\n    \"seed\": 123456,\n})\n```\n\nRefer to `parser.py` for the full list of available options as well as the `hyperparameters.py` file for the relevant algorithm to see which hyperparameters apply.\n\n### Notebooks\n\nA set of notebooks is provided in `./notebooks` which demonstrates how to train each algorithm for an appropriate environment using the programmatic API. Each notebook provides a link to open the notebook in Google Colab. To run locally start a Jupyter notebook server and open the relevant notebook in the browser window which should open automatically:\n\n```sh\njupyter notebook\n```\n\n## Development\n\nThe following scripts assume the requirements have been installed. 
If using Poetry, either run `poetry shell` first or prefix each command with `poetry run`.\n\n### Lint\n\n```sh\npylint ./policy_gradients\n```\n\n### Typecheck\n\n```sh\nmypy\n```\n\n### Format\n\n```sh\nblack ./policy_gradients\n```\n\n### Generating requirements files\n\n```sh\n./scripts/generate_requirements.sh\n```\n\n## Troubleshooting\n\n### Poetry\n\nI had trouble installing `gym` with Poetry because of its `Pillow` dependency and something to do with `zlib`. Setting `PKG_CONFIG_PATH=\"/usr/local/opt/zlib/lib/pkgconfig\"` fixed this problem.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwillclarktech%2Fpolicy-gradient-implementations","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fwillclarktech%2Fpolicy-gradient-implementations","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fwillclarktech%2Fpolicy-gradient-implementations/lists"}