Building synthetic data for preference tuning
- Host: GitHub
- URL: https://github.com/dannylee1020/openpo
- Owner: dannylee1020
- License: apache-2.0
- Created: 2024-10-28T22:09:54.000Z (12 months ago)
- Default Branch: master
- Last Pushed: 2024-12-26T23:02:32.000Z (10 months ago)
- Last Synced: 2025-09-10T07:40:58.706Z (about 1 month ago)
- Topics: ai, ai-feedback, dpo, evaluation, finetuning, huggingface, llm, llm-evaluation, python, rlaif, rlhf, synthetic-data, synthetic-data-generation
- Language: Python
- Homepage: https://docs.openpo.dev
- Size: 10.7 MB
- Stars: 27
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# OpenPO 🐼
[PyPI](https://pypi.org/project/openpo/)
[License: Apache 2.0](https://opensource.org/licenses/Apache-2.0)
[Documentation](https://docs.openpo.dev)

OpenPO simplifies building synthetic datasets with AI feedback and state-of-the-art evaluation methods.
| Resources | Notebooks |
|----------|----------|
| Building dataset with OpenPO and PairRM | 📔 [Notebook](https://colab.research.google.com/drive/1G1T-vOTXjIXuRX3h9OlqgnE04-6IpwIf?usp=sharing) |
| Using OpenPO with Prometheus 2 | 📔 [Notebook](https://colab.research.google.com/drive/1dro0jX1MOfSg0srfjA_DZyeWIWKOuJn2?usp=sharing) |
| Evaluating with LLM Judge | 📔 [Notebook](https://colab.research.google.com/drive/1_QrmejW2Ym8yzP5RLJbLpVNA_FsEt2ZG?usp=sharing) |
| Building dataset using vLLM | 📔 [Notebook](https://colab.research.google.com/drive/1GKHpOv4jRaWhwSDKCEZpl_kIfIyHGs73?usp=sharing) |

## Key Features
- 🤖 **Multiple LLM Support**: Collect a diverse set of outputs from 200+ LLMs
- ⚡ **High Performance Inference**: Native vLLM support for optimized inference
- 🚀 **Scalable Processing**: Built-in batch processing capabilities for efficient large-scale data generation
- 📊 **Research-Backed Evaluation Methods**: Support for state-of-the-art evaluation methods for data synthesis
- 💾 **Flexible Storage**: Out-of-the-box storage providers for HuggingFace and S3
## Installation
### Install from PyPI (recommended)
OpenPO uses pip for installation. Run the following command in the terminal to install OpenPO:

```bash
pip install openpo

# to use vllm
pip install openpo[vllm]

# for running evaluation models
pip install openpo[eval]
```

### Install from source
Clone the repository first, then run the following commands:
```bash
cd openpo
poetry install
```

## Getting Started
Set your environment variables first:

```bash
# for completions
export HF_API_KEY=
export OPENROUTER_API_KEY=

# for evaluations
export OPENAI_API_KEY=
export ANTHROPIC_API_KEY=
```

### Completion
To get started with collecting LLM responses, simply pass in a list of model names of your choice.

> [!NOTE]
> OpenPO requires the provider name to be prepended to the model identifier.

```python
import os
from openpo import OpenPO

client = OpenPO()
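
# PROMPT and MESSAGE are illustrative placeholders, not defined by the library:
PROMPT = "You are a helpful assistant."
MESSAGE = "Explain preference tuning in two sentences."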
response = client.completion.generate(
models = [
"huggingface/Qwen/Qwen2.5-Coder-32B-Instruct",
"huggingface/mistralai/Mistral-7B-Instruct-v0.3",
"huggingface/microsoft/Phi-3.5-mini-instruct",
],
messages=[
{"role": "system", "content": PROMPT},
{"role": "system", "content": MESSAGE},
],
)
```

You can also call models with OpenRouter.
```python
# make request to OpenRouter
client = OpenPO()

response = client.completion.generate(
models = [
"openrouter/qwen/qwen-2.5-coder-32b-instruct",
"openrouter/mistralai/mistral-7b-instruct-v0.3",
"openrouter/microsoft/phi-3.5-mini-128k-instruct",
],
messages=[
{"role": "system", "content": PROMPT},
{"role": "system", "content": MESSAGE},
],)
```

OpenPO takes default model parameters as a dictionary. Take a look at the documentation for more detail.
```python
response = client.completion.generate(
models = [
"huggingface/Qwen/Qwen2.5-Coder-32B-Instruct",
"huggingface/mistralai/Mistral-7B-Instruct-v0.3",
"huggingface/microsoft/Phi-3.5-mini-instruct",
],
messages=[
{"role": "system", "content": PROMPT},
{"role": "system", "content": MESSAGE},
],
params={
"max_tokens": 500,
"temperature": 1.0,
}
)
```
### Evaluation
OpenPO offers various ways to synthesize your dataset.

#### LLM-as-a-Judge
To use a single judge to evaluate your response data, use `evaluate.eval`:

```python
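# `questions` and `responses` are illustrative placeholders (shapes are assumptions):
# one question per row, with a list of candidate responses per question.
questions = ["What causes rain?"]
responses = [["Rain forms when water vapor condenses ...", "Water evaporates and ..."]]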
openpo = OpenPO()

res = openpo.evaluate.eval(
models=['openai/gpt-4o'],
questions=questions,
responses=responses,
)
```

To use multiple judges, pass multiple judge models:
```python
res_a, res_b = openpo.evaluate.eval(
models=["openai/gpt-4o", "anthropic/claude-sonnet-3-5-latest"],
questions=questions,
responses=responses,
)

# get consensus for multi-judge responses
result = openpo.evaluate.get_consensus(
eval_A=res_a,
eval_B=res_b,
)
```

OpenPO supports batch processing for evaluating large datasets in a cost-effective way.
> [!NOTE]
> Batch processing is an asynchronous operation and could take up to 24 hours (it usually completes much faster).

```python
info = openpo.batch.eval(
models=["openai/gpt-4o", "anthropic/claude-sonnet-3-5-latest"],
questions=questions,
responses=responses,
)

# check status
status = openpo.batch.check_status(batch_id=info.id)
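
# A minimal polling sketch; the `status` field name below is an assumption,
# check the docs for the actual object returned by check_status:
import time

while openpo.batch.check_status(batch_id=info.id).status not in ("completed", "failed"):
    time.sleep(60)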
```

For multi-judge with batch processing:
```python
batch_a, batch_b = openpo.batch.eval(
models=["openai/gpt-4o", "anthropic/claude-sonnet-3-5-latest"],
questions=questions,
responses=responses,
)

# `batch_a_result` and `batch_b_result` are the retrieved outputs of the completed batch jobs
result = openpo.batch.get_consensus(
batch_A=batch_a_result,
batch_B=batch_b_result,
)
```

#### Pre-trained Models
You can use pre-trained open source evaluation models. OpenPO currently supports two types of models: `PairRM` and `Prometheus2`.

> [!NOTE]
> Appropriate hardware with a GPU and sufficient memory is required to run inference with pre-trained models.

To use PairRM to rank responses:
```python
from openpo import PairRM

pairrm = PairRM()
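
# `prompts` and `responses` are illustrative placeholders (shapes are assumptions):
# one prompt per row, with a list of candidate responses for each prompt.
prompts = ["Summarize the plot of Hamlet in one sentence."]
responses = [["Hamlet avenges his father ...", "A prince in Denmark ..."]]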
res = pairrm.eval(prompts, responses)
```

To use Prometheus2:
```python
from openpo import Prometheus2

pm = Prometheus2(model="prometheus-eval/prometheus-7b-v2.0")
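
# Illustrative placeholders (shapes are assumptions): one instruction per row,
# with a pair of responses to compare against each other.
instructions = ["Explain why the sky is blue."]
response_A = ["The sky appears blue because of Rayleigh scattering ..."]
response_B = ["It reflects the color of the ocean ..."]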
feedback = pm.eval_relative(
instructions=instructions,
responses_A=response_A,
responses_B=response_B,
rubric='reasoning',
)
```

### Storing Data
Use the out-of-the-box storage classes to easily upload and download data:

```python
from openpo.storage import HuggingFaceStorage
hf_storage = HuggingFaceStorage()

# push data to repo
preference = {"prompt": "text", "preferred": "response1", "rejected": "response2"}
hf_storage.push_to_repo(repo_id="my-hf-repo", data=preference)

# load data from repo
data = hf_storage.load_from_repo(path="my-hf-repo")
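
# OpenPO also ships an S3 storage provider (per the feature list above); the names
# below are assumptions rather than confirmed API, so check the docs before using:
# from openpo.storage import S3Storage
# s3_storage = S3Storage(bucket="my-bucket")
# s3_storage.push_to_s3(data=preference, key="preference.json")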
```

## Contributing
Contributions are what make open source amazingly special! Here's how you can help:

### Development Setup
1. Clone the repository
```bash
git clone https://github.com/yourusername/openpo.git
cd openpo
```

2. Install Poetry (dependency management tool)
```bash
curl -sSL https://install.python-poetry.org | python3 -
```

3. Install dependencies
```bash
poetry install
```

### Development Workflow
1. Create a new branch for your feature
```bash
git checkout -b feature-name
```

2. Submit a Pull Request
- Write a clear description of your changes
- Reference any related issues