https://github.com/mimoralea/gym-walk
Random walk OpenAI Gym environment.
- Host: GitHub
- URL: https://github.com/mimoralea/gym-walk
- Owner: mimoralea
- License: mit
- Created: 2018-06-14T01:16:09.000Z (almost 7 years ago)
- Default Branch: master
- Last Pushed: 2024-12-16T17:23:23.000Z (5 months ago)
- Last Synced: 2025-03-28T12:21:19.346Z (about 2 months ago)
- Topics: environment, reinforcement-learning, reinforcement-learning-excercises
- Language: Python
- Homepage:
- Size: 35.2 KB
- Stars: 20
- Watchers: 3
- Forks: 11
- Open Issues: 6
Metadata Files:
- Readme: README.md
- License: LICENSE
# gym-walk
## Installation
```bash
git clone https://github.com/mimoralea/gym-walk.git
cd gym-walk
pip install .
```

or:
```bash
pip install git+https://github.com/mimoralea/gym-walk#egg=gym-walk
```

## Use
```python
import gym
import gym_walk
import numpy as np

env = gym.make('WalkFive-v0')
pi = lambda x: np.random.randint(2)  # uniformly random policy

def td(pi, env, gamma=1.0, alpha=0.01, n_episodes=100000):
    # TD(0) prediction: estimate the state-value function V of policy pi.
    V = np.zeros(env.observation_space.n)
    for t in range(n_episodes):
        state, done = env.reset(), False
        while not done:
            action = pi(state)
            next_state, reward, done, _ = env.step(action)
            # Bootstrap from V[next_state]; zero it out at terminal states.
            td_target = reward + gamma * V[next_state] * (not done)
            td_error = td_target - V[state]
            V[state] = V[state] + alpha * td_error
            state = next_state
    return V

V = td(pi, env)
V
```
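The TD(0) estimates above can be sanity-checked against the chain's known solution without installing the package. The sketch below hard-codes the walk dynamics as an assumption about what `WalkFive-v0` does (states 0–6, terminals at both ends, start in the middle, +1 reward only at the right terminal); under the uniformly random policy the true value of nonterminal state `s` is `s/6`:

```python
import numpy as np

# Stand-in for the WalkFive-v0 dynamics (an assumption, not the package's
# implementation): action 1 moves right, action 0 moves left; reward 1
# is given only when the right terminal (state 6) is entered.
def step(state, action):
    next_state = state + (1 if action == 1 else -1)
    reward = float(next_state == 6)
    done = next_state in (0, 6)
    return next_state, reward, done

def td(pi, gamma=1.0, alpha=0.01, n_episodes=5000, seed=0):
    rng = np.random.default_rng(seed)
    V = np.zeros(7)
    for _ in range(n_episodes):
        state, done = 3, False  # every episode starts in the middle
        while not done:
            action = pi(state, rng)
            next_state, reward, done = step(state, action)
            td_target = reward + gamma * V[next_state] * (not done)
            V[state] += alpha * (td_target - V[state])
            state = next_state
    return V

pi = lambda s, rng: rng.integers(2)  # uniformly random policy
V = td(pi)
print(np.round(V[1:6], 2))  # should approach [1/6, 2/6, 3/6, 4/6, 5/6]
```

With a small step size the estimates settle close to the analytic values, which is a quick way to confirm the update rule is wired correctly.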