https://github.com/carperai/autocrit

A repository for transformer critique learning and generation

Host: GitHub
URL: https://github.com/carperai/autocrit
Owner: CarperAI
Created: 2023-04-07T13:57:34.000Z (over 3 years ago)
Default Branch: main
Last Pushed: 2023-12-07T17:58:14.000Z (over 2 years ago)
Last Synced: 2023-12-10T19:33:29.247Z (over 2 years ago)
Language: Python
Size: 1.12 MB
Stars: 68
Watchers: 6
Forks: 12
Open Issues: 6
Metadata Files:
- Readme: README.md

README

# AutoCrit
A repository for transformer critique learning and generation.

## Scalar reward models
Train [OpenLLaMA-13B](https://github.com/openlm-research/open_llama) as a scalar reward model on the [Helpful and Harmless dataset](https://github.com/anthropics/hh-rlhf):

```bash
accelerate launch --config_file configs/accelerate/zero2.yaml \
    train_reward_model.py \
    --model_path openlm-research/open_llama_13b \
    --dataset pvduy/rm_oa_hh \
    --batch_size 1 \
    --eval_interval 1000 \
    --lr 0.00001 \
    --weight_decay 0 \
    --num_unfrozen_layers 12 \
    --gradient_checkpointing \
    --checkpoint_dir checkpoints \
    --calibration_datasets reciprocate/vicuna-fair-eval
```
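
The command above does not show the training objective. As a rough illustration only: scalar reward models trained on preference pairs commonly use a pairwise ranking loss that pushes the score of the preferred reply above the score of the rejected one. The sketch below assumes that objective. `pairwise_loss` is a hypothetical helper, not a function from this repository, and `train_reward_model.py` may implement something different.

```python
import math


def pairwise_loss(chosen_reward: float, rejected_reward: float) -> float:
    """Hypothetical helper: -log(sigmoid(chosen - rejected)) for one preference pair."""
    margin = chosen_reward - rejected_reward
    # softplus(-margin) equals -log(sigmoid(margin)); this form stays stable for large |margin|
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


# The loss shrinks as the preferred reply is scored further above the rejected one.
for chosen, rejected in [(1.0, -1.0), (0.0, 0.0), (-1.0, 1.0)]:
    loss = pairwise_loss(chosen, rejected)
    print(f"chosen={chosen:+.1f} rejected={rejected:+.1f} loss={loss:.4f}")
```

Under this assumed objective only the difference between the two rewards matters, so the absolute scale of the scores carries no meaning by itself.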

Usage:
```python
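from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Reward-model checkpoint; 4-bit loading relies on the bitsandbytes package
ckpt = "reciprocate/openllama-13b_rm_oasst-hh"
model = AutoModelForSequenceClassification.from_pretrained(ckpt, load_in_4bit=True)
tokenizer = AutoTokenizer.from_pretrained(ckpt)

# The first output is the logits tensor; its single value is the scalar reward
inputs = tokenizer("ASSISTANT: This sentence is a lie.", return_tensors="pt")
reward = model(**inputs)[0].item()
print(reward)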
```

Output:
```python
-1.626953125
```
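
A single scalar like the one above is mostly useful relative to the scores of other replies. The following is a minimal sketch of ranking candidate replies by reward. `rank_replies` and `toy_score` are hypothetical names introduced here for illustration; `toy_score` is a placeholder, not the reward model, and in practice the scorer would wrap the model call from the usage example.

```python
from typing import Callable, List, Tuple


def rank_replies(replies: List[str], score: Callable[[str], float]) -> List[Tuple[float, str]]:
    """Return (reward, reply) pairs sorted from highest to lowest reward."""
    scored = [(score(reply), reply) for reply in replies]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


def toy_score(text: str) -> float:
    # Placeholder scorer (assumption): penalises length so the example runs without a model
    return -float(len(text))


candidates = [
    "ASSISTANT: Sure, here is a much longer answer.",
    "ASSISTANT: Sure.",
]
for reward, reply in rank_replies(candidates, toy_score):
    print(f"{reward:8.1f}  {reply}")
```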
