# Synchronous Gym

A wrapper for OpenAI's `gym` module which enables use of multiple environments in synchrony

- [Example](#example)
- [Resetting](#resetting)
- [Rendering](#rendering)
- [Cloning and Restoring](#cloning-and-restoring)
- [Other](#other)
  * [Closing](#closing)
  * [Setting the random seed](#setting-the-random-seed)

## Example

To use the wrapper, simply create a gym environment as normal, then reassign the variable storing this environment to a `MultiGymWrapper` object as follows:
```python
import gym
from wrapper import MultiGymWrapper

# Make a standard gym environment
env = gym.make("Qbert-v0")

# Wrap the environment inside the multi-environment wrapper object
# The parameter n specifies the number of simultaneous simulations
env = MultiGymWrapper(env, n = 8)

# Run a random episode on all 8 environments simultaneously
states = env.reset()

while True:
    actions = env.action_space.sample()
    states, rewards, terminals, infos = env.step(actions)
    env.render()
    if any(terminals):
        break

# Close all 8 open simulations
env.close()
```

When writing your implementation with the wrapper, keep in mind that most of the standard `gym` methods will return a list, with one result for each environment. For example, sampling actions or taking a step yields a list. Conveniently, the wrapper also overrides some components of the `gym` module, enabling methods such as `Env.step()` to take a list of actions as input.
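As a minimal sketch of this list-based interface (reusing the wrapped `env` with `n = 8` from the example above):
```python
# Every call yields one result per open simulation
actions = env.action_space.sample()  # a list of 8 actions, one per simulation
states, rewards, terminals, infos = env.step(actions)

print(len(rewards))  # 8
print(rewards[0])    # the reward earned in the first simulation only
```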

## Resetting

This is self-explanatory. Calling `.reset()` on your `MultiGymWrapper` object automatically iterates through the simulations, calling `Env.reset()` on each of them. Whereas the `gym` implementation returns a single initial observation (an RGB array for Atari environments), the wrapper returns the initial observation of each open simulation:
```python
env = gym.make("Qbert-v0")
env = MultiGymWrapper(env, n = 8)
states = env.reset()
print(len(states), states[0].shape)
>> 8 (210, 160, 3)
```

## Rendering

With the wrapper, you can choose to either render just **one** environment, or all of them.

You can select either behavior by passing the appropriate value for the `which` argument of the `render` method:
- `which = 'one'`: Renders solely the first of the set of open simulations (default)
- `which = 'all'`: Renders all of the open simulations; this can get a bit messy

Here's an example, assuming `env` is a `MultiGymWrapper` object:
```python
# Renders solely the first of the set of open simulations
env.render(mode = 'human')

# Render all of the open simulations
env.render(mode = 'human', which = 'all')

# Rather than displaying the rendering of the first simulation,
# this will return an RGB array of the frame which would otherwise
# be displayed
rgb_frame = env.render(mode = 'rgb_array')

# This will return a list of RGB arrays, one for the current frame of each simulation
rgb_frames = env.render(mode = 'rgb_array', which = 'all')
```

Here's a code sample:
```python
env = gym.make("Qbert-v0")
env = MultiGymWrapper(env, n = 8)
states = env.reset()
env.render() # default is which='one'
```
*(Screenshot: the rendered Qbert window)*

Note that if you run the sample above on its own, the window will close immediately, because the script ends right after rendering and no further frames are drawn.
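To keep the window visible, just keep stepping and rendering. A minimal sketch, reusing the random-action loop from the first example:
```python
states = env.reset()

# Render a frame after every step so the window stays open
for _ in range(1000):
    actions = env.action_space.sample()
    states, rewards, terminals, infos = env.step(actions)
    env.render()
```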

## Cloning and Restoring

All of this is very intuitive; most methods simply return a list of the results you would expect, one for each simulation. Cloning and restoring work the same way:

- `clone_full_states` and `clone_states` return a list of cloned states, one for each environment, as follows:
```python
def clone_full_states(self):
    return [env.env.clone_full_state() for env in self.envs]

def clone_states(self):
    return [env.env.clone_state() for env in self.envs]
```

- `restore_full_states` and `restore_states` work analogously; they take a list of states, one per environment, as input:
```python
def restore_full_states(self, states):
    for state, env in zip(states, self.envs):
        env.env.restore_full_state(state)

def restore_states(self, states):
    for state, env in zip(states, self.envs):
        env.env.restore_state(state)
```

Note that the only difference between the wrapper class and the standard `gym` module is how you call these functions. In `gym`:
- `Env.env.clone_full_state()`
- `Env.env.clone_state()`
- `Env.env.restore_full_state()`
- `Env.env.restore_state()`

In the wrapper, you call the method directly on the wrapper object, and the method name is plural:
- `Env.clone_full_states()`
- `Env.clone_states()`
- `Env.restore_full_states()`
- `Env.restore_states()`
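A short sketch contrasting the two call styles (here `single_env` is a plain `gym` environment and `multi_env` is a `MultiGymWrapper`; both names are just for illustration):
```python
single_state = single_env.env.clone_full_state()  # one emulator state
multi_states = multi_env.clone_full_states()      # a list of n emulator states

single_env.env.restore_full_state(single_state)
multi_env.restore_full_states(multi_states)
```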

Here's a code sample:
```python
env = gym.make("Qbert-v0")
env = MultiGymWrapper(env, n = 8)
env.reset()
rams = env.clone_states()
print(rams)
>> [array([247, 60, 6, ..., 0, 0, 0], dtype=uint8), array([247, 60, 6, ..., 0, 0, 0], dtype=uint8), array([247, 60, 6, ..., 0, 0, 0], dtype=uint8), array([247, 60, 6, ..., 0, 0, 0], dtype=uint8), array([247, 60, 6, ..., 0, 0, 0], dtype=uint8), array([247, 60, 6, ..., 0, 0, 0], dtype=uint8), array([247, 60, 6, ..., 0, 0, 0], dtype=uint8), array([247, 60, 6, ..., 0, 0, 0], dtype=uint8)]
env.restore_states(rams)
```

## Other

Essentially all other functions work as you'd expect. Here are a few more examples:

### Closing

To close all of the simulations, simply call `.close()` on the `MultiGymWrapper` object:
```python
env.close()
```

### Setting the random seed

As in the standard `gym`, you specify a single non-negative integer. The wrapper applies it to all of the open simulations:
```python
seed = 42
env.seed(seed)
```
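Internally, this presumably just forwards the seed to every simulation, mirroring the pattern of the restore methods above. A sketch under that assumption:
```python
def seed(self, seed):
    # Assumed implementation: apply the same seed to every open simulation
    for env in self.envs:
        env.seed(seed)
```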