https://github.com/ryanrudes/synchronous-gym
A wrapper for OpenAI gym which enables use of multiple environments in synchrony
- Host: GitHub
- URL: https://github.com/ryanrudes/synchronous-gym
- Owner: ryanrudes
- Created: 2020-10-04T23:57:29.000Z (about 5 years ago)
- Default Branch: main
- Last Pushed: 2020-10-06T00:59:21.000Z (about 5 years ago)
- Last Synced: 2025-09-02T03:55:26.345Z (about 1 month ago)
- Topics: gym-environment, gym-environments, machine-learning, reinforcement-learning, reinforcement-learning-environments
- Language: Python
- Homepage:
- Size: 47.9 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Synchronous Gym
A wrapper for OpenAI's `gym` module which enables use of multiple environments in synchrony
- [Example](#example)
- [Resetting](#resetting)
- [Rendering](#rendering)
- [Cloning and Restoring](#cloning-and-restoring)
- [Other](#other)
  * [Closing](#closing)
  * [Setting the random seed](#setting-the-random-seed)
## Example

To use the wrapper, create a gym environment as usual, then reassign the variable storing that environment to a `MultiGymWrapper` object as follows:
```python
import gym
from wrapper import MultiGymWrapper

# Make a standard gym environment
env = gym.make("Qbert-v0")

# Wrap the environment inside the multi-agent wrapper object
# The parameter n specifies the number of simultaneous simulations
env = MultiGymWrapper(env, n = 8)

# Run a random episode on all 8 environments simultaneously
states = env.reset()
while True:
    actions = env.action_space.sample()
    states, rewards, terminals, infos = env.step(actions)
    env.render()
    if any(terminals):
        break

# Close all 8 open simulations
env.close()
```

When writing your implementation with the wrapper, keep in mind that most of the standard `gym` functions will instead return a list, with one result per environment. For example, sampling actions and taking steps both return lists. The wrapper also overrides some components of the `gym` module, so methods such as `Env.step()` accept a list of actions as input.
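As a sketch of this pattern (a hypothetical minimal re-implementation for illustration, not the repository's actual code), a synchronous wrapper can hold a list of environments and regroup the per-environment results into parallel lists:

```python
class ToyEnv:
    """Stand-in for a gym.Env: counts steps and terminates after 5."""
    def __init__(self):
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t

    def step(self, action):
        self.t += 1
        return self.t, float(action), self.t >= 5, {}

class SyncWrapper:
    """Hypothetical sketch of a synchronous multi-environment wrapper."""
    def __init__(self, make_env, n):
        self.envs = [make_env() for _ in range(n)]

    def reset(self):
        # One initial state per environment
        return [env.reset() for env in self.envs]

    def step(self, actions):
        # One action per environment; results are regrouped into parallel lists
        results = [env.step(a) for env, a in zip(self.envs, actions)]
        states, rewards, terminals, infos = map(list, zip(*results))
        return states, rewards, terminals, infos

envs = SyncWrapper(ToyEnv, n = 4)
states = envs.reset()
states, rewards, terminals, infos = envs.step([1, 0, 1, 0])
print(states)   # [1, 1, 1, 1]
print(rewards)  # [1.0, 0.0, 1.0, 0.0]
```

The key design point is that `step` takes a list of actions and returns four lists, which keeps the call signature shaped like single-environment `gym` code.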
## Resetting

This is self-explanatory: calling `.reset()` on your `MultiGymWrapper` object iterates through the simulations, calling `Env.reset()` on each of them. Where the `gym` implementation returns a single initial observation (an RGB array for Atari environments), this returns the initial observation for every open simulation:
```python
env = gym.make("Qbert-v0")
env = MultiGymWrapper(env, n = 8)
states = env.reset()
print (len(states), states[0].shape)
>> 8 (210, 160, 3)
```

## Rendering
With the wrapper, you can choose to render either just **one** environment or all of them. Specify the appropriate value for the `which` argument of the `render` method:
- `which = 'one'`: Renders solely the first of the set of open simulations (default)
- `which = 'all'`: Renders all of the open simulations; this can get a bit messy

Here's an example, assuming `env` is a `MultiGymWrapper` object:
```python
# Renders solely the first of the set of open simulations
env.render(mode = 'human')

# Render all of the open simulations
env.render(mode = 'human', which = 'all')

# Rather than displaying the rendering of the first simulation,
# this returns an RGB array of the frame which would otherwise
# be displayed
rgb_frame = env.render(mode = 'rgb_array')

# This returns a list of RGB arrays, one for the current frame of each simulation
rgb_frames = env.render(mode = 'rgb_array', which = 'all')
```

Here's a code sample:
```python
env = gym.make("Qbert-v0")
env = MultiGymWrapper(env, n = 8)
states = env.reset()
env.render() # default is which='one'
```

Of course, the rendering window would close immediately if you ran the sample above, since the script ends right afterwards with no frame to replace it.
## Cloning and Restoring

All of this is intuitive; most methods simply return a list of the results you would expect, one for each simulation.

- `clone_full_states` and `clone_states` return a list of cloned states, one for each environment, as follows:
```python
def clone_full_states(self):
    return [env.env.clone_full_state() for env in self.envs]

def clone_states(self):
    return [env.env.clone_state() for env in self.envs]
```
- `restore_full_states` and `restore_states` work identically. They take a list of states as input.
```python
def restore_full_states(self, states):
    for state, env in zip(states, self.envs):
        env.env.restore_full_state(state)

def restore_states(self, states):
    for state, env in zip(states, self.envs):
        env.env.restore_state(state)
```
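To see the batched snapshot-and-rollback pattern in a runnable form (using a toy stand-in class, not Atari's actual `clone_state`/`restore_state`), the plural methods amount to this:

```python
class CounterEnv:
    """Toy stand-in with clone/restore semantics similar to ALE's."""
    def __init__(self):
        self.t = 0

    def clone_state(self):
        return self.t

    def restore_state(self, state):
        self.t = state

    def step(self):
        self.t += 1

envs = [CounterEnv() for _ in range(3)]

# Snapshot every simulation at once (like clone_states())
snapshots = [env.clone_state() for env in envs]

# Advance every simulation
for env in envs:
    env.step()
print([env.t for env in envs])  # [1, 1, 1]

# Roll every simulation back to its snapshot (like restore_states())
for env, state in zip(envs, snapshots):
    env.restore_state(state)
print([env.t for env in envs])  # [0, 0, 0]
```

This save/advance/rollback loop is the typical use case for cloning, e.g. when evaluating several candidate actions from the same starting state.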
Note that the only difference between the wrapper class and the standard `gym` module is how you call these functions. In `gym`:
- `Env.env.clone_full_state()`
- `Env.env.clone_state()`
- `Env.env.restore_full_state()`
- `Env.env.restore_state()`

In the wrapper, you simply call the function on the object (and the function name is plural):
- `Env.clone_full_states()`
- `Env.clone_states()`
- `Env.restore_full_states()`
- `Env.restore_states()`

Here's a code sample:
```python
env = gym.make("Qbert-v0")
env = MultiGymWrapper(env, n = 8)
env.reset()
rams = env.clone_states()
print (rams)
>> [array([247, 60, 6, ..., 0, 0, 0], dtype=uint8), array([247, 60, 6, ..., 0, 0, 0], dtype=uint8), array([247, 60, 6, ..., 0, 0, 0], dtype=uint8), array([247, 60, 6, ..., 0, 0, 0], dtype=uint8), array([247, 60, 6, ..., 0, 0, 0], dtype=uint8), array([247, 60, 6, ..., 0, 0, 0], dtype=uint8), array([247, 60, 6, ..., 0, 0, 0], dtype=uint8), array([247, 60, 6, ..., 0, 0, 0], dtype=uint8)]
env.restore_states(rams)
```

## Other
Essentially all other functions work as you'd expect. Here are some other examples:
### Closing
To close all of the simulations, simply call `.close()` on the `MultiGymWrapper` object:
```python
env.close()
```

### Setting the random seed
In the standard `gym`, `Env.seed()` takes a non-negative integer. With `MultiGymWrapper`, you likewise pass a single non-negative integer, and it is applied to every open simulation.
```python
seed = 42
env.seed(seed)
```
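The broadcast behind that call can be sketched as follows (a hypothetical implementation using a toy simulation class, not the repository's code):

```python
class ToySim:
    """Minimal stand-in for a simulation with gym's seed() interface."""
    def __init__(self):
        self.seed_value = None

    def seed(self, seed = None):
        # gym's Env.seed() returns a list of the seeds used
        self.seed_value = seed
        return [seed]

def seed_all(sims, seed):
    # Apply the same non-negative integer seed to every open simulation
    return [sim.seed(seed) for sim in sims]

sims = [ToySim() for _ in range(4)]
print(seed_all(sims, 42))  # [[42], [42], [42], [42]]
```

Note that giving every simulation the same seed makes stochastic environments behave identically; that appears to be the behavior described above.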