https://github.com/openai/imitation
Code for the paper "Generative Adversarial Imitation Learning"
- Host: GitHub
- URL: https://github.com/openai/imitation
- Owner: openai
- License: mit
- Created: 2016-06-10T20:40:03.000Z (over 8 years ago)
- Default Branch: master
- Last Pushed: 2018-11-22T00:39:47.000Z (almost 6 years ago)
- Last Synced: 2024-10-01T10:05:47.704Z (about 1 month ago)
- Topics: paper
- Language: Python
- Homepage: https://arxiv.org/abs/1606.03476
- Size: 29.7 MB
- Stars: 687
- Watchers: 181
- Forks: 191
- Open Issues: 9
Metadata Files:
- Readme: README.rst
- License: LICENSE
Awesome Lists containing this project
- awesome-GAN - imitation - An implementation of the paper [Generative Adversarial Imitation Learning](https://arxiv.org/abs/1606.03476) (Codes)
README
**Status:** Archive (code is provided as-is, no updates expected)
=========================================
Generative Adversarial Imitation Learning
=========================================
-----------------------------------------
Jonathan Ho and Stefano Ermon
-----------------------------------------

Contains an implementation of Trust Region Policy Optimization (Schulman et al., 2015).
Dependencies:
* OpenAI Gym >= 0.1.0, mujoco_py >= 0.4.0
* numpy >= 1.10.4, scipy >= 0.17.0, theano >= 0.8.2
* h5py, pytables, pandas, matplotlib

Provided files:
* ``expert_policies/*`` are the expert policies, trained by TRPO (``scripts/run_rl_mj.py``) on the true costs
* ``scripts/im_pipeline.py`` is the main training and evaluation pipeline. This script is responsible for sampling data from experts to generate training data, running the training code (``scripts/imitate_mj.py``), and evaluating the resulting policies.
* ``pipelines/*`` are the experiment specifications provided to ``scripts/im_pipeline.py``
* ``results/*`` contain evaluation data for the learned policies
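
The three stages that ``scripts/im_pipeline.py`` is described as running (sample expert data, train, evaluate) can be illustrated with a toy sketch. Nothing below comes from this repository: the function names, the one-dimensional dynamics, and the evaluation metric are all hypothetical, and the training step is a plain least-squares behavioral-cloning stand-in rather than GAIL or TRPO. It is meant only to show how the stages hand data to one another.

```python
import random


def sample_trajectories(policy, num_trajs, horizon, rng):
    """Stage 1 (hypothetical): roll out a policy and record (state, action) pairs."""
    trajs = []
    for _ in range(num_trajs):
        state = rng.uniform(-1.0, 1.0)
        traj = []
        for _ in range(horizon):
            action = policy(state)
            traj.append((state, action))
            # Toy dynamics (an assumption): the action nudges the state.
            state = 0.9 * state + 0.1 * action
        trajs.append(traj)
    return trajs


def train_imitator(trajs):
    """Stage 2 stand-in: fit action = w * state by least squares.

    This is behavioral cloning, used here only as a placeholder for the
    training step; it is NOT the GAIL algorithm.
    """
    pairs = [pair for traj in trajs for pair in traj]
    num = sum(s * a for s, a in pairs)
    den = sum(s * s for s, _ in pairs)
    w = num / den if den else 0.0
    return lambda s: w * s


def evaluate(policy, expert, num_trajs, horizon, rng):
    """Stage 3 stand-in: mean squared action gap to the expert on fresh rollouts."""
    trajs = sample_trajectories(policy, num_trajs, horizon, rng)
    gaps = [(a - expert(s)) ** 2 for traj in trajs for s, a in traj]
    return sum(gaps) / len(gaps)


def run_pipeline(seed=0):
    """Chain the three stages: sample expert data, train, evaluate."""
    rng = random.Random(seed)
    expert = lambda s: -2.0 * s  # hypothetical linear expert policy
    data = sample_trajectories(expert, num_trajs=5, horizon=20, rng=rng)
    learner = train_imitator(data)
    return evaluate(learner, expert, num_trajs=5, horizon=20, rng=rng)


if __name__ == "__main__":
    print("mean squared action gap:", run_pipeline())
```

Because the toy expert is linear, the least-squares fit recovers it almost exactly and the reported gap is close to zero. The actual pipeline evaluates policies in MuJoCo environments, which this sketch does not attempt to reproduce.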