https://github.com/dotpyu/llm-rewards
A lightweight package for composing and stacking rewards for foundation model alignment.
- Host: GitHub
- URL: https://github.com/dotpyu/llm-rewards
- Owner: dotpyu
- Created: 2025-01-26T16:10:44.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-02-05T22:27:30.000Z (over 1 year ago)
- Last Synced: 2025-12-15T17:57:17.934Z (6 months ago)
- Language: Python
- Size: 17.6 KB
- Stars: 3
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# LLM Rewards
Lean, modular reward functions for RLHF training with LLMs. The design is framework-agnostic, with built-in support for trlx, trl, and custom training loops.
## Install
```bash
pip install llm-rewards
```
## Quick Start
```python
from llm_rewards import RewardModel, SimpleThinkReward, LengthReward, XMLReward, create_reward_fn
# Create reward stack
rewards = [
    LengthReward(target_length=1024, weight=0.1),
    XMLReward(weight=0.5, partial_credit=True),
    RewardModel("your/reward/model", weight=1.0),
    SimpleThinkReward(weight=0.5),
]
# Get framework-agnostic reward function
reward_fn = create_reward_fn(rewards, normalize=True)
# Use with trlx (trlx exposes a top-level train() function rather than a Trainer class)
import trlx
trainer = trlx.train(
    reward_fn=reward_fn,
    # model path, prompts, and config go here
)
```
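The README does not spell out how `create_reward_fn` combines the stack or what `normalize=True` does. As a rough mental model only, the sketch below shows one plausible scheme: a weighted sum of per-component scores, with each component optionally z-scored across the batch. The names `stack_rewards`, `length_score`, and `exclaim_score` are hypothetical and not part of `llm_rewards`, and the normalization scheme is an assumption.

```python
from statistics import mean, pstdev
from typing import Callable, List, Sequence, Tuple

# Hypothetical stand-in: a component maps a batch of texts to one score per text.
ScoreFn = Callable[[Sequence[str]], List[float]]


def stack_rewards(
    components: Sequence[Tuple[ScoreFn, float]], normalize: bool = True
) -> ScoreFn:
    """Combine (score_fn, weight) pairs into a single batch-level reward function."""

    def reward_fn(texts: Sequence[str]) -> List[float]:
        if not texts:
            return []
        totals = [0.0] * len(texts)
        for score_fn, weight in components:
            scores = score_fn(texts)
            if normalize:
                # Assumed scheme: z-score each component across the batch so that
                # components on different scales contribute comparably.
                mu, sigma = mean(scores), pstdev(scores)
                scores = [(s - mu) / sigma if sigma > 0 else 0.0 for s in scores]
            totals = [t + weight * s for t, s in zip(totals, scores)]
        return totals

    return reward_fn


# Toy components, for illustration only.
def length_score(texts: Sequence[str]) -> List[float]:
    return [float(len(t)) for t in texts]


def exclaim_score(texts: Sequence[str]) -> List[float]:
    return [1.0 if t.endswith("!") else 0.0 for t in texts]


combined = stack_rewards([(length_score, 0.1), (exclaim_score, 1.0)], normalize=False)
```

Without normalization, a component with a large raw scale (such as character length) can dominate the sum unless its weight is small, which may be why the Quick Start gives `LengthReward` a weight of 0.1.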
## Key Features
- Transformer reward models
- Reasoning validation (ThinkingReward)
- Length, format, XML validation
- Reference similarity
- Prompt relevance
- Framework adapters
- Batched inference
- Reward normalization
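To make the format-validation feature concrete, here is a minimal sketch of what an XML check with partial credit could look like. It is not the package's `XMLReward` implementation; the function name `xml_format_score` and the default tag names `think` and `answer` are assumptions chosen for illustration.

```python
import re
from typing import Sequence


def xml_format_score(
    text: str,
    tags: Sequence[str] = ("think", "answer"),  # hypothetical expected tags
    partial_credit: bool = True,
) -> float:
    """Return a score in [0, 1] based on how many expected tag pairs are present."""
    if not tags:
        return 0.0
    found = 0
    for tag in tags:
        # A tag counts only if an opening tag is later followed by its closing tag.
        pattern = rf"<{re.escape(tag)}>.*?</{re.escape(tag)}>"
        if re.search(pattern, text, flags=re.DOTALL):
            found += 1
    if partial_credit:
        # Fraction of expected tag pairs that were found.
        return found / len(tags)
    # All-or-nothing: full score only when every expected pair is present.
    return 1.0 if found == len(tags) else 0.0
```

With partial credit, a completion that contains only one of the two expected tag pairs scores 0.5 instead of 0, which gives the policy a denser learning signal early in training.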
## Example Training Script
See `example/train_example.py` for a full Qwen-2.5 0.5B training example.
## Custom Rewards
```python
import torch
from llm_rewards import RewardFunction, RewardOutput

class MyReward(RewardFunction):
    def compute(self, texts, **kwargs) -> RewardOutput:
        # score() is a placeholder for your own per-text scoring function
        rewards = [score(text) for text in texts]
        return RewardOutput(values=torch.tensor(rewards))
```
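The `score(text)` call in `MyReward.compute` is left for you to define. As one hypothetical choice, the sketch below scores a text by its fraction of distinct words, a crude signal against repetitive output. This is an illustrative placeholder, not something shipped with `llm_rewards`.

```python
def score(text: str) -> float:
    """Fraction of distinct words in the text; 0.0 for empty input."""
    words = text.lower().split()
    if not words:
        return 0.0
    return len(set(words)) / len(words)
```

Any function that maps a string to a float would work in its place, as long as it returns one value per input text so the resulting list converts cleanly to a tensor.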
## License
MIT