https://github.com/XanderJC/attention-based-credit
Code for the paper: Dense Reward for Free in Reinforcement Learning from Human Feedback (ICML 2024) by Alex J. Chan, Hao Sun, Samuel Holt, and Mihaela van der Schaar
- Host: GitHub
- URL: https://github.com/XanderJC/attention-based-credit
- Owner: XanderJC
- License: mit
- Created: 2024-02-01T16:30:43.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2024-08-11T10:42:35.000Z (8 months ago)
- Last Synced: 2024-10-15T08:02:47.125Z (6 months ago)
- Language: Python
- Homepage:
- Size: 4.64 MB
- Stars: 18
- Watchers: 4
- Forks: 1
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-RLHF - official
README
# [Dense Reward for Free in Reinforcement Learning from Human Feedback](https://openreview.net/forum?id=eyxVRMrZ4m)
### Alex J. Chan, Hao Sun, Samuel Holt, and Mihaela van der Schaar
### International Conference on Machine Learning (ICML) 2024
[MIT License](https://opensource.org/licenses/MIT)
Last Updated: 18 July 2024
Primary Code Author: Alex J. Chan ([email protected])
This repo is pip-installable: clone it, optionally create a virtual environment, and install it:
```shell
git clone https://github.com/XanderJC/attention-based-credit.git
cd attention-based-credit
pip install -r requirements.txt
pip install -e .
```
The PPO implementation used is a small update to, and inherits from, the [TRL](https://github.com/huggingface/trl) implementation. Pay close attention to the version used: TRL is very actively developed, and breaking changes may have been introduced since.
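Because behaviour depends on the exact TRL release, it can be worth confirming that your installed version matches the pin in `requirements.txt` before running anything. A minimal sketch (the helper function and the version numbers shown are illustrative, not part of this repo):

```python
# Illustrative helper for reading an exact '==' pin out of a
# requirements-file string; the pins below are made up for the example.
import re

def pinned_version(requirements_text: str, package: str):
    """Return the '==' pin for `package`, or None if it is not pinned."""
    for line in requirements_text.splitlines():
        m = re.match(rf"{re.escape(package)}==(\S+)", line.strip())
        if m:
            return m.group(1)
    return None

example_requirements = "trl==0.7.4\ntransformers==4.36.0\n"  # hypothetical pins
print(pinned_version(example_requirements, "trl"))  # -> 0.7.4
```

You can then compare the result against `importlib.metadata.version("trl")` in your environment and reinstall if they differ.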
Scripts used to run the algorithms are in `experiments/scripts`. Each experiment in the paper has a corresponding script in `experiments/bash` that runs everything needed to compile its results. For example, to reproduce the experiment in Figure 3, run:
```shell
bash experiments/bash/IMDb_base.sh
```
Note: You will need to update some paths in the bash scripts, as well as the WandB entities used for experiment tracking and result loading. The experiments were run on a machine with a single NVIDIA A6000 Ada card with 48GB VRAM, so a different setup may also require adjustments.

You can then generate the results and plots using:
```shell
python experiments/plotting/IMDb.py
```
Note: These can be run without re-doing the experiments, as cached results are saved in `results/numerics`, which the plotting scripts will read when passed `--use_wandb false`.

### Citing
If you use this software, please cite it as follows:
```
@inproceedings{chan2024dense,
title={Dense Reward for Free in Reinforcement Learning from Human Feedback},
author={Alex James Chan and Hao Sun and Samuel Holt and Mihaela van der Schaar},
booktitle={International Conference on Machine Learning},
year={2024},
url={https://openreview.net/forum?id=eyxVRMrZ4m}
}
```