https://github.com/manchery/iql-pytorch

Unofficial PyTorch implementation (replicating paper results) of Implicit Q-Learning (In-sample Q-Learning) for offline RL
implicit-q-learning offline-reinforcement-learning pytorch reinforcement-learning

Host: GitHub
URL: https://github.com/manchery/iql-pytorch
Owner: Manchery
Created: 2022-03-17T02:25:29.000Z (about 4 years ago)
Default Branch: master
Last Pushed: 2024-11-04T15:09:34.000Z (over 1 year ago)
Last Synced: 2025-04-13T21:06:29.918Z (about 1 year ago)
Topics: implicit-q-learning, offline-reinforcement-learning, pytorch, reinforcement-learning
Language: Python
Homepage:
Size: 261 KB
Stars: 23
Watchers: 1
Forks: 1
Open Issues: 0
Metadata Files:
- Readme: README.md

# IQL Implementation in PyTorch

## IQL

This repo is an unofficial implementation of **Implicit Q-Learning (In-sample Q-Learning)** in PyTorch.

```
@inproceedings{
    kostrikov2022offline,
    title={Offline Reinforcement Learning with Implicit Q-Learning},
    author={Ilya Kostrikov and Ashvin Nair and Sergey Levine},
    booktitle={International Conference on Learning Representations},
    year={2022},
    url={https://openreview.net/forum?id=68n2s9ZJWF8}
}
```

**Note**: Reward standardization (_We standardize MuJoCo locomotion task rewards by dividing by the difference of returns of the best and worst trajectories in each dataset_), which is used in the [official implementation](https://github.com/ikostrikov/implicit_q_learning/blob/09d700248117881a75cb21f0adb95c6c8a694cb2/train_offline.py#L51C18-L51C18), is not included in this implementation. It is straightforward to add if needed.
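The reward standardization described in the note could be sketched roughly as below. This is a minimal NumPy sketch and not code from this repo or the official one: the function name `standardize_rewards`, the `episode_ends` boolean array (marking the last step of each trajectory), and the optional `scale` factor are all assumptions made for illustration.

```python
import numpy as np


def standardize_rewards(rewards, episode_ends, scale=1.0):
    """Divide rewards by (best trajectory return - worst trajectory return).

    rewards:      1-D array of per-step rewards for the whole dataset.
    episode_ends: 1-D boolean array, True on the last step of each trajectory
                  (hypothetical input; derive it from your dataset's
                  terminal/timeout flags).
    scale:        optional constant multiplier applied afterwards (assumption).
    """
    rewards = np.asarray(rewards, dtype=np.float64)
    episode_ends = np.asarray(episode_ends, dtype=bool)
    if rewards.shape != episode_ends.shape:
        raise ValueError("rewards and episode_ends must have the same shape")

    # Accumulate the undiscounted return of every trajectory.
    returns = []
    running = 0.0
    for reward, end in zip(rewards, episode_ends):
        running += reward
        if end:
            returns.append(running)
            running = 0.0
    # Count a trailing trajectory that was cut off without an end flag.
    if rewards.size > 0 and not episode_ends[-1]:
        returns.append(running)

    if not returns:
        raise ValueError("dataset contains no trajectories")
    span = max(returns) - min(returns)
    if span == 0:
        raise ValueError("best and worst returns are equal; cannot standardize")
    return rewards / span * scale
```

The returned array would replace the dataset rewards before they are loaded into the replay buffer.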

## Train

### Gym-MuJoCo

```
python main_iql.py --env halfcheetah-medium-v2 --expectile 0.7 --temperature 3.0 --eval_freq 5000 --eval_episodes 10 --normalize
```

### AntMaze

```
python main_iql.py --env antmaze-medium-play-v2 --expectile 0.9 --temperature 10.0 --eval_freq 50000 --eval_episodes 100
```
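
The `--expectile` and `--temperature` flags presumably correspond to the expectile τ and inverse temperature β in the IQL paper. As a rough illustration of what those two hyperparameters control, here is a NumPy sketch of the paper's expectile loss and advantage-based policy weights. The function names are hypothetical and the code is not taken from this repo; the `max_weight` clip is an assumption added for numerical stability.

```python
import numpy as np


def expectile_loss(diff, expectile=0.7):
    """Asymmetric squared loss |tau - 1(u < 0)| * u^2, averaged over the batch.

    diff is u = Q(s, a) - V(s). With expectile > 0.5, positive differences
    are penalized more, which pushes V toward an upper expectile of Q.
    """
    diff = np.asarray(diff, dtype=np.float64)
    weight = np.where(diff < 0, 1.0 - expectile, expectile)
    return float(np.mean(weight * diff ** 2))


def advantage_weights(q, v, temperature=3.0, max_weight=100.0):
    """Per-sample weights exp(beta * (Q - V)) for the weighted policy loss.

    max_weight is an assumed clip that keeps the exponential bounded.
    """
    advantage = np.asarray(q, dtype=np.float64) - np.asarray(v, dtype=np.float64)
    return np.minimum(np.exp(temperature * advantage), max_weight)
```

A larger expectile makes the value estimate more optimistic, and a larger temperature concentrates the policy update on the highest-advantage dataset actions, which is consistent with the higher values used in the AntMaze command.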

## Results

![mujoco_results](imgs/mujoco_results.png)

![antmaze_results](imgs/antmaze_results.png)

## Acknowledgement

This repo borrows heavily from [sfujim/TD3_BC](https://github.com/sfujim/TD3_BC) and [ikostrikov/implicit_q_learning](https://github.com/ikostrikov/implicit_q_learning).
