{"id":15136386,"url":"https://github.com/openai/epg","last_synced_at":"2025-03-17T14:16:00.934Z","repository":{"id":52112313,"uuid":"122554700","full_name":"openai/EPG","owner":"openai","description":"Code for the paper \"Evolved Policy Gradients\"","archived":false,"fork":false,"pushed_at":"2018-11-22T06:05:34.000Z","size":468,"stargazers_count":249,"open_issues_count":7,"forks_count":56,"subscribers_count":15,"default_branch":"master","last_synced_at":"2025-03-03T22:55:35.280Z","etag":null,"topics":["continuous-control","evolutionary-strategy","machine-learning","meta-learning","paper","reinforcement-learning"],"latest_commit_sha":null,"homepage":"https://arxiv.org/abs/1802.04821","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/openai.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2018-02-23T00:54:49.000Z","updated_at":"2025-02-10T07:54:21.000Z","dependencies_parsed_at":"2022-09-06T07:52:17.090Z","dependency_job_id":null,"html_url":"https://github.com/openai/EPG","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/openai%2FEPG","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/openai%2FEPG/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/openai%2FEPG/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/openai%2FEPG/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/openai","download_url":"https://codeload.github.com/openai/EPG/tar.gz/refs/heads/master","host":{"name"
:"GitHub","url":"https://github.com","kind":"github","repositories_count":241753017,"owners_count":20014250,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["continuous-control","evolutionary-strategy","machine-learning","meta-learning","paper","reinforcement-learning"],"created_at":"2024-09-26T06:21:30.715Z","updated_at":"2025-03-03T22:55:41.633Z","avatar_url":"https://github.com/openai.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"**Status:** Archive (code is provided as-is, no updates expected)\n\n# Evolved Policy Gradients (EPG)\n\nThe paper is located at https://arxiv.org/abs/1802.04821. A demonstration video can be found at https://youtu.be/-Z-ieH6w0LA.\n\n\u003e Houthooft, R., Chen, R. Y., Isola, P., Stadie, B. C., Wolski, F., Ho, J., Abbeel, P. (2018). Evolved Policy\nGradients. arXiv preprint arXiv:1802.04821.\n\n### Installation\n\nInstall Anaconda:\n```\ncurl -o /tmp/miniconda.sh https://repo.continuum.io/miniconda/Miniconda3-latest-MacOSX-x86_64.sh\nbash /tmp/miniconda.sh\nconda create -n epg python=3.6.1\nsource activate epg\n```\n\nInstall necessary OSX packages for MPI:\n```\nbrew install open-mpi\n```\n\nInstall necessary Python packages:\n```\npip install mpi4py==3.0.0 scipy \\\npandas tqdm joblib cloudpickle==0.5.2 \\\nprogressbar2 opencv-python 'flask\u003e=0.11.1' matplotlib pytest cython \\\nchainer pathos mujoco_py 'gym[all]'\n```\n\n\n### Running\nFirst go to the EPG code folder:\n```\ncd \u003cpath_to_EPG_folder\u003e\n```\nThen launch the entry script:\n```\nPYTHONPATH=. 
python epg/launch_local.py\n```\nExperiment data is saved in `\u003chome_dir\u003e/EPG_experiments/\u003cmonth\u003e-\u003cday\u003e/\u003cexperiment_name\u003e`.\n\n### Testing\n\nFirst, set `theta_load_path = '\u003cpath_to_theta.npy\u003e/theta.npy'` in `launch_local.py` according to the `theta.npy` obtained after running the `launch_local.py` script. This file should be located in `/\u003chome_dir\u003e/EPG_experiments/\u003cmonth\u003e-\u003cday\u003e/\u003cexperiment_name\u003e/thetas/`.\n\nThen run:\n```\nPYTHONPATH=. python epg/launch_local.py --test true\n```\n\n### Visualizing experiment data\n\nAssuming the experiment data is saved in `\u003chome_dir\u003e/EPG_experiments/\u003cmonth\u003e-\u003cday\u003e/\u003cexperiment_name\u003e`, run:\n```\nPYTHONPATH=. python epg/viskit/frontend.py \u003chome_dir\u003e/EPG_experiments/\u003cmonth\u003e-\u003cday\u003e/\u003cexperiment_name\u003e\n```\nThen go to `http://0.0.0.0:5000` in your browser.\n\nViskit sourced from\n\n\u003e Duan, Y., Chen, X., Houthooft, R., Schulman, J., Abbeel, P. \"Benchmarking Deep Reinforcement Learning for Continuous Control\". Proceedings of the 33rd International Conference on Machine Learning (ICML), 2016.\n\n### BibTeX entry\n\n```\n@article{Houthooft18Evolved,\nauthor = {Houthooft, Rein and Chen, Richard Y. and Isola, Phillip and Stadie, Bradly C. and Wolski, Filip and Ho, Jonathan and Abbeel, Pieter},\ntitle = {Evolved Policy Gradients},\njournal={arXiv preprint arXiv:1802.04821},\nyear = {2018}}\n```","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fopenai%2Fepg","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fopenai%2Fepg","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fopenai%2Fepg/lists"}