Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/takuseno/d4rl-pybullet

Datasets for data-driven deep reinforcement learning with PyBullet environments
https://github.com/takuseno/d4rl-pybullet

data-driven-reinforcement-learning dataset deep-reinforcement-learning

Last synced: about 1 month ago
JSON representation

Datasets for data-driven deep reinforcement learning with PyBullet environments

Host: GitHub
URL: https://github.com/takuseno/d4rl-pybullet
Owner: takuseno
License: mit
Created: 2020-06-27T23:26:57.000Z (over 4 years ago)
Default Branch: master
Last Pushed: 2021-03-19T14:34:02.000Z (almost 4 years ago)
Last Synced: 2024-10-23T04:06:09.003Z (2 months ago)
Topics: data-driven-reinforcement-learning, dataset, deep-reinforcement-learning
Language: Python
Homepage:
Size: 77.1 KB
Stars: 143
Watchers: 5
Forks: 11
Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md

Awesome Lists containing this project

README

        ![format check](https://github.com/takuseno/d4rl-pybullet/workflows/format%20check/badge.svg)

![test](https://github.com/takuseno/d4rl-pybullet/workflows/test/badge.svg)

![MIT](https://img.shields.io/badge/license-MIT-blue)

[![Gitter](https://img.shields.io/gitter/room/d3rlpy/d4rl-pybullet)](https://gitter.im/d3rlpy/d4rl-pybullet)

# d4rl-pybullet

Datasets for Data-Driven Deep Reinforcement Learning with Pybullet environments.

This work is intending to provide datasets for data-driven deep reinforcement

learning with open-sourced bullet simulator, which encourages more people to

join this community.

This repository is built on top of [d4rl](https://github.com/rail-berkeley/d4rl).

However, currently, it is impossible to import d4rl without checking MuJoCo

activation keys, which fails the program.

Thus, `d4rl_pybullet.offline_env` is directly copied from [d4rl repository](https://github.com/rail-berkeley/d4rl/blob/1899859e3ebdac8f587abbe9cb1663761be69141/d4rl/offline_env.py).

## install

```

$ pip install git+https://github.com/takuseno/d4rl-pybullet

```

## usage

The API is mostly identical to the original d4rl.

```py

import gym

import d4rl_pybullet

# dataset will be automatically downloaded into ~/.d4rl/datasets

env = gym.make('hopper-bullet-mixed-v0')

# interaction with its environment

env.reset()

env.step(env.action_space.sample())

# access to the dataset

dataset = env.get_dataset()

dataset['observations'] # observation data in N x dim_observation

dataset['actions'] # action data in N x dim_action

dataset['rewards'] # reward data in N x 1

dataset['terminals'] # terminal flags in N x 1 

```

## available datasets

- `random` denotes datasets sampled with a randomly initialized policy.

- `medium` denotes datasets sampled with a medium-level policy.

- `mixed` denotes datasets collected during policy training.

| id | task | mean reward | std reward | max reward | min reward | samples |

|:-|:-:|:-|:-|:-|:-|:-|

| hopper-bullet-random-v0 | HopperBulletEnv-v0 | 18.64 | 3.04 | 53.21 | -8.58 | 1000000 |

| hopper-bullet-medium-v0 | HopperBulletEnv-v0 | 1078.36 | 325.52 | 1238.9569 | 220.23 | 1000000 |

| hopper-bullet-mixed-v0 | HopperBulletEnv-v0 | 139.08 | 147.62 | 1019.94 | 9.15 | 59345 |

| halfcheetah-bullet-random-v0 | HalfCheetahBulletEnv-v0 | -1304.49 | 99.30 | -945.29 | -1518.58 | 1000000 |

| halfcheetah-bullet-medium-v0 | HalfCheetahBulletEnv-v0 | 787.35 | 104.31 | 844.91 | -522.57 | 1000000 |

| halfcheetah-bullet-mixed-v0 | HalfCheetahBulletEnv-v0 | 453.12 | 498.19 | 801.02 | -1428.22 | 178178 |

| ant-bullet-random-v0 | AntBulletEnv-v0 | 10.35 | 0.31 | 13.04 | 9.82 | 1000000 |

| ant-bullet-medium-v0 | AntBulletEnv-v0 | 570.80 | 104.82 | 816.79 | 70.87 | 1000000 |

| ant-bullet-mixed-v0 | AntBulletEnv-v0 | 255.40 | 196.22 | 609.66 | -32.74 | 53572 |

| walker2d-bullet-random-v0 | Walker2DBulletEnv-v0 | 14.98 | 2.94 | 66.90 | 5.73 | 1000000 |

| walker2d-bullet-medium-v0 | Walker2DBulletEnv-v0 | 1106.68 | 417.79 | 1394.38 | 16.00 | 1000000 |

| walker2d-bullet-mixed-v0 | Walker2DBulletEnv-v0 | 181.51 | 277.71 | 1363.94 | 9.45 | 89772 |

## train policy

You can train Soft Actor-Critic policy on your own machine.

```

# giving -g option to choose GPU device

$ ./scripts/train -e HopperBulletEnv-v0 -g 0 -n 1

```

## data collection

You can collect datasets with the trained policy.

```

$ ./scripts/collect -e HopperBulletEnv-v0 -g -n 1

```

## data collection with randomly initialized policy

You can collect datasets with the random policy.

```

$ ./scripts/random_collect -e HopperBulletEnv-v0 -g -n 1

```

## contribution

Any contributions will be welcomed!!

### coding style

This repository is formatted with [yapf](https://github.com/google/yapf).

You can format the entire repository (excluding `offline_env.py`) as follows:

```

$ ./scripts/format

```

## acknowledgement

This work is supported by Information-technology Promotion Agency, Japan

(IPA), Exploratory IT Human Resources Project (MITOU Program) in the fiscal

year 2020.