Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/ucla-rlcourse/RLexample
Some basic examples of playing with RL
https://github.com/ucla-rlcourse/RLexample
Last synced: 3 months ago
JSON representation
Some basic examples of playing with RL
- Host: GitHub
- URL: https://github.com/ucla-rlcourse/RLexample
- Owner: ucla-rlcourse
- Created: 2019-01-09T03:25:23.000Z (almost 6 years ago)
- Default Branch: master
- Last Pushed: 2023-10-11T04:55:34.000Z (about 1 year ago)
- Last Synced: 2024-06-04T14:38:57.793Z (5 months ago)
- Language: Python
- Size: 17.9 MB
- Stars: 1,192
- Watchers: 20
- Forks: 301
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# Some basic examples for reinforcement learning
## Installing Anaconda and Gymnasium
* Download and install Anaconda [here](https://www.anaconda.com/download)
* Create conda env for managing dependencies and activate the conda env
```
conda create -n conda_env
conda activate conda_env
```
* Install gymnasium (Dependencies installed by pip will also go to the conda env)
```
pip install gymnasium[all]
pip install gymnasium[accept-rom-license]# Try the next line if box2d-py fails to install.
conda install swig
```
* Install ai2thor if you want to run navigation_agent.py
```
pip install ai2thor==2.4.10
```
* Install torch with either conda or pip
```
conda install pytorch torchvision torchaudio pytorch-cuda=12.1 -c pytorch -c nvidia
```
```
pip install torch torchvision torchaudio
```
* Install other dependencies
```
pip install numpy pandas matplotlib
```## Examples
* Play with the environment and visualize the agent behaviour
```
import gymnasium as gym
render = True # switch if visualize the agent
if render:
env = gym.make('CartPole-v0', render_mode='human')
else:
env = gym.make('CartPole-v0')
env.reset(seed=0)
for _ in range(1000):
env.step(env.action_space.sample()) # take a random action
env.close()
```* Random play with ```CartPole-v0```
```
import gymnasium as gym
env = gym.make('CartPole-v0')
for i_episode in range(20):
observation = env.reset()
for t in range(100):
print(observation)
action = env.action_space.sample()
observation, reward, terminated, truncated, info = env.step(action)
done = np.logical_or(terminated, truncated)
env.close()
```* Example code for random playing (```Pong-ram-v0```,```Acrobot-v1```,```Breakout-v0```)
```
python my_random_agent.py Pong-ram-v0
```* Very naive learnable agent playing ```CartPole-v0``` or ```Acrobot-v1```
```
python my_learning_agent.py CartPole-v0```
* Playing Pong on CPU (with a great [blog](http://karpathy.github.io/2016/05/31/rl/)). One pretrained model is ```pong_model_bolei.p```(after training 20,000 episodes), which you can load in by replacing [save_file](https://github.com/metalbubble/RLexample/blob/master/pg-pong.py#L15) in the script.
```
python pg-pong.py```
* Random navigation agent in [AI2THOR](https://github.com/allenai/ai2thor)
```
python navigation_agent.py
```