https://github.com/general-preference/general-preference-model

Official implementation of ICML 2025 paper "Beyond Bradley-Terry Models: A General Preference Model for Language Model Alignment" (https://arxiv.org/abs/2410.02197)

alignment large-language-models preference-modeling preference-optimization rlhf



Host: GitHub
URL: https://github.com/general-preference/general-preference-model
Owner: general-preference
License: apache-2.0
Created: 2024-10-02T08:54:42.000Z (almost 2 years ago)
Default Branch: main
Last Pushed: 2025-08-15T09:25:28.000Z (11 months ago)
Last Synced: 2025-08-15T11:24:12.466Z (11 months ago)
Topics: alignment, large-language-models, preference-modeling, preference-optimization, rlhf
Language: Python
Homepage: https://arxiv.org/abs/2410.02197
Size: 382 KB
Stars: 25
Watchers: 2
Forks: 3
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE.txt


# General Preference Model (GPM)

[![arXiv](https://img.shields.io/badge/arXiv-2410.02197-b31b1b.svg)](https://arxiv.org/abs/2410.02197) 

[![Website](https://img.shields.io/badge/Project-Website-green)](https://general-preference.github.io/general-preference-model) 

[![Huggingface Paper](https://img.shields.io/badge/Huggingface-Papers-blue)](https://huggingface.co/papers/2410.02197) 

[ICML 2025] Beyond Bradley-Terry Models: A General Preference Model for Language Model Alignment, [https://arxiv.org/abs/2410.02197](https://arxiv.org/abs/2410.02197). Featured in Hugging Face Daily Papers: [https://huggingface.co/papers/2410.02197](https://huggingface.co/papers/2410.02197)

![](GPM.png)

## Introduction

This repository is designed for training and evaluating the General Preference embedding model (GPM). It includes the following:

* Training code for both GPM and BT reward models.

* Evaluation code adapted from [RewardBench](https://github.com/allenai/reward-bench) for evaluating GPM and BT RM.

* Code for LLM Alignment using GPM + GPO/SPPO: `./LLM-Alignment`.
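
For context on the two model families named above, here is a minimal sketch of why a scalar Bradley-Terry (BT) reward cannot express cyclic preferences while a skew-symmetric score over embeddings can. This is an illustration only, not the repository's implementation: `bt_score`, `skew_score`, `unit`, and the 2-D rotation operator are hypothetical choices made for this example.

```python
import math

def bt_score(r_i, r_j):
    """Bradley-Terry: the log-odds of i over j is a difference of scalar rewards."""
    return r_i - r_j

def skew_score(v_i, v_j):
    """Illustrative 2-D skew-symmetric score: v_i^T R v_j with R = [[0, -1], [1, 0]]."""
    return v_i[1] * v_j[0] - v_i[0] * v_j[1]

def unit(deg):
    """Unit vector at the given angle in degrees."""
    rad = math.radians(deg)
    return (math.cos(rad), math.sin(rad))

# Three embeddings spaced 120 degrees apart (hypothetical values).
a, b, c = unit(0), unit(-120), unit(-240)

# Each score is positive, so a beats b, b beats c, and c beats a.
cycle = [skew_score(a, b), skew_score(b, c), skew_score(c, a)]
print(cycle)

# BT scores around any cycle sum to zero, so they cannot all be positive.
print(bt_score(1.0, 2.0) + bt_score(2.0, 3.0) + bt_score(3.0, 1.0))
```

The skew-symmetric form also guarantees `skew_score(x, y) == -skew_score(y, x)`, so swapping the two responses flips the sign of the score.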

## Key Components

`scripts/run_train_rm_general_preference_single.sh`: Run training for reward models on a single node.

`scripts/run_train_rm_general_preference_multi.sh`: Run training for reward models across multiple nodes.

`rewardbench_eval/run_rm_rewardbench.sh`: Run RewardBench evaluations for reward models.

`eval/batch_inference_rm_general_preference.sh`: Run evaluations on a custom-defined dataset.

`general_preference`: Useful code for training, heavily based on [OpenRLHF](https://github.com/OpenRLHF/OpenRLHF).

## Quick Start

### Installation

`rewardbench==0.1.2` depends on `transformers==4.44.0`, so install `rewardbench` first to avoid dependency conflicts.

```bash
git clone https://github.com/general-preference/general-preference-model
cd general-preference-model
pip install rewardbench==0.1.2
pip install -e .
```

Then reinstall `deepspeed` with its ops prebuilt (`DS_BUILD_OPS=1`) and the sparse-attention and Evoformer-attention ops disabled:

```bash
export DS_BUILD_SPARSE_ATTN=0; export DS_BUILD_EVOFORMER_ATTN=0; DS_BUILD_OPS=1 pip install deepspeed==0.13.5 --no-cache --force-reinstall
```

If the installation fails, install torch before other packages.

```bash
pip install torch==2.3.0
pip install -e .
```
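
After installing, it can help to confirm that the pinned versions actually ended up in the environment. The following is a minimal sketch using only the standard library; `EXPECTED` and `check_versions` are hypothetical names introduced here, and the `torch` pin only applies if you used the fallback path above.

```python
from importlib import metadata

# Pins copied from the installation notes; torch is pinned only in the fallback path.
EXPECTED = {
    "rewardbench": "0.1.2",
    "transformers": "4.44.0",
    "deepspeed": "0.13.5",
    "torch": "2.3.0",
}

def check_versions(expected, lookup=metadata.version):
    """Return {package: (expected, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, want in expected.items():
        try:
            have = lookup(name)
        except metadata.PackageNotFoundError:
            have = None
        # Ignore local build suffixes such as "+cu121" when comparing.
        if have is None or have.split("+")[0] != want:
            mismatches[name] = (want, have)
    return mismatches

if __name__ == "__main__":
    for name, (want, have) in check_versions(EXPECTED).items():
        print(f"{name}: expected {want}, found {have or 'not installed'}")
```

An empty result means every listed package matches its pin; any printed line points at the package to reinstall.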

## Example Usage of the GPM

Below is an example code snippet (see `./gpm_example_usage.py`):

```python
# This excerpt assumes GPMPipeline and generate_high_dim_result_with_prompt
# are already defined, as in ./gpm_example_usage.py.

prompt_text = "Describe the importance of reading books in today's digital age."
response1 = "Books remain crucial in the digital era, offering in-depth knowledge and fostering critical thinking. They provide a unique, immersive experience that digital media can't replicate, contributing significantly to personal and intellectual growth."
response2 = "Books are still useful for learning new things. They help you relax and can be a good break from screens."

# Each context is a chat-style conversation: the shared prompt plus one response.
context1 = [
    {"role": "user", "content": prompt_text},
    {"role": "assistant", "content": response1}
]
context2 = [
    {"role": "user", "content": prompt_text},
    {"role": "assistant", "content": response2}
]

rm = GPMPipeline("general-preference/GPM-Llama-3.1-8B", value_head_dim=6)

# Embed both responses; the prompt hidden state is requested once, with the first call.
reward1, prompt_hidden_state = rm([context1], return_prompt=True)
reward2 = rm([context2])

# Score response1 against response2, conditioned on the prompt.
result = generate_high_dim_result_with_prompt(rm.model, rm.value_head_dim, reward1, reward2, prompt_hidden_state)
result_batch = result.float().cpu().detach().numpy().tolist()

# 1 where the score is positive, 0 otherwise.
results = [1 if score > 0 else 0 for score in result_batch]

print(result_batch)
```
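
The raw scores printed by the example can be turned into explicit decisions. Below is a minimal sketch that assumes, following the paper's formulation rather than anything stated in this README, that a score is read as the log-odds of the first response being preferred, so a logistic sigmoid maps it to a probability. `summarize_scores` is a hypothetical helper, not part of the repository.

```python
import math

def summarize_scores(scores):
    """Map raw preference scores to (probability, label) pairs.

    label is 1 when the score is positive (first response preferred), else 0.
    """
    out = []
    for s in scores:
        # Numerically stable logistic sigmoid for large positive or negative s.
        if s >= 0:
            p = 1.0 / (1.0 + math.exp(-s))
        else:
            e = math.exp(s)
            p = e / (1.0 + e)
        out.append((p, 1 if s > 0 else 0))
    return out

# Example with made-up scores; in practice pass result_batch from the snippet above.
for prob, label in summarize_scores([2.0, -0.5]):
    print(f"P(first preferred) = {prob:.3f}, label = {label}")
```

A score of exactly zero maps to a probability of 0.5 and, matching the `> 0` rule in the example, to label 0.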

## Citations

If you use the General Preference embedding Model (GPM) or General Preference Optimization (GPO) and find them useful, please cite the paper and star this repo. For questions, contact zhangge19951114@gmail.com or yifanzhangresearch@gmail.com, or open an issue.

```
@inproceedings{zhang2024beyond,
  title={Beyond Bradley-Terry Models: A General Preference Model for Language Model Alignment},
  author={Zhang, Yifan and Zhang, Ge and Wu, Yue and Xu, Kangping and Gu, Quanquan},
  booktitle={Forty-second International Conference on Machine Learning},
  year={2025}
}
```
