https://github.com/rucaibox/figa
- Host: GitHub
- URL: https://github.com/rucaibox/figa
- Owner: RUCAIBox
- Created: 2024-03-01T06:15:15.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-05-05T14:58:18.000Z (over 1 year ago)
- Last Synced: 2024-05-05T15:25:01.747Z (over 1 year ago)
- Language: Python
- Size: 5.66 MB
- Stars: 3
- Watchers: 1
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.md
# FIGA
This repository is the official implementation of the ICLR 2024 paper **[Beyond Imitation: Leveraging Fine-grained Quality Signals for Alignment](https://arxiv.org/pdf/2311.04072.pdf)**.
## Quick Start
Because a modified version of `transformers` will be installed, we recommend creating a new conda environment:
```bash
conda create -n FIGA python=3.8
conda activate FIGA
conda install pytorch==2.1.0 torchvision==0.16.0 torchaudio==2.1.0 pytorch-cuda=12.1 -c pytorch -c nvidia
```
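After creating the environment, it can help to confirm that the pinned versions were actually installed. The following is a minimal sketch, not part of the FIGA repository: the helper name `check_versions` is hypothetical, and it assumes the conda packages expose their metadata under the names `torch`, `torchvision`, and `torchaudio`.

```python
from importlib import metadata

# Versions taken from the conda install command above.
EXPECTED = {"torch": "2.1.0", "torchvision": "0.16.0", "torchaudio": "2.1.0"}


def check_versions(expected):
    """Map each package name to (installed_version_or_None, matches_expected)."""
    report = {}
    for name, wanted in expected.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Prefix match, because some builds append a local suffix such as "+cu121".
        ok = installed is not None and installed.startswith(wanted)
        report[name] = (installed, ok)
    return report


if __name__ == "__main__":
    for name, (installed, ok) in check_versions(EXPECTED).items():
        status = "OK" if ok else "MISMATCH"
        print(f"{name}: installed={installed} expected={EXPECTED[name]} [{status}]")
```

Run it inside the activated `FIGA` environment. A `MISMATCH` line suggests the environment should be recreated before continuing.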
Next, clone the FIGA repository and install its dependencies:
```bash
git clone https://github.com/RUCAIBox/FIGA.git && cd FIGA
pip install -r requirements.txt
```
After this, you need to replace the `trainer_utils.py` and `modeling_llama.py` files in the installed `transformers` library with the corresponding files from this repository. This step is required for fine-tuning with the FIGA method.
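The file replacement can be scripted. The sketch below is one possible approach, not a script shipped with the repository: the function name `replace_transformers_files` and the `.orig` backup suffix are hypothetical, and `source_dir` must point to wherever the modified files live in your checkout, which this README does not specify. The target paths follow the standard `transformers` package layout.

```python
import shutil
from pathlib import Path

# Locations of the two files relative to the installed transformers package root.
TARGETS = {
    "trainer_utils.py": Path("trainer_utils.py"),
    "modeling_llama.py": Path("models") / "llama" / "modeling_llama.py",
}


def replace_transformers_files(source_dir, package_dir, backup_suffix=".orig"):
    """Copy the modified files over the installed ones, keeping a one-time backup."""
    source_dir, package_dir = Path(source_dir), Path(package_dir)
    replaced = []
    for name, relative_path in TARGETS.items():
        src = source_dir / name
        dst = package_dir / relative_path
        if not src.is_file():
            raise FileNotFoundError(f"modified file not found: {src}")
        if not dst.is_file():
            raise FileNotFoundError(f"installed file not found: {dst}")
        backup = dst.with_name(dst.name + backup_suffix)
        if not backup.exists():  # never overwrite an earlier backup
            shutil.copy2(dst, backup)
        shutil.copy2(src, dst)
        replaced.append(dst)
    return replaced


# Example call (assumes the modified files sit in the current directory):
#   import transformers
#   replace_transformers_files(".", Path(transformers.__file__).parent)
```

Keeping a backup makes it possible to restore the stock `transformers` files later without reinstalling the package.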
## SPA Dataset
You can download the SPA dataset from https://huggingface.co/datasets/RUCAIBox/SPA.
In the publicly available SPA dataset, the `output` field is the ground-truth response, the `original_output` field contains the response generated by the alpaca-7b model, and the `revised_output` field contains the response as revised by a more powerful model (i.e., ChatGPT-3.5). For a detailed description of how the SPA dataset was constructed, please refer to our paper.
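To get a feel for how `original_output` and `revised_output` relate, one can compare them word by word. The sketch below uses Python's standard `difflib` for illustration only. It is not the alignment procedure from the paper, the helper `token_diff` is hypothetical, and the record is invented to mirror the three fields described above rather than taken from the dataset.

```python
from difflib import SequenceMatcher


def token_diff(original, revised):
    """Return (removed, added) word lists between two whitespace-tokenized responses."""
    a, b = original.split(), revised.split()
    removed, added = [], []
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            removed.extend(a[i1:i2])
        if tag in ("replace", "insert"):
            added.extend(b[j1:j2])
    return removed, added


# Invented record with the same field names as the SPA dataset (not real data).
record = {
    "output": "Paris is the capital of France.",
    "original_output": "The capital of France is Lyon.",
    "revised_output": "The capital of France is Paris.",
}

removed, added = token_diff(record["original_output"], record["revised_output"])
print("removed:", removed)  # words present only in the original response
print("added:", added)      # words present only in the revised response
```

Real records can be read the same way once the dataset is downloaded, for example with the Hugging Face `datasets` library.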
## Instruction Tuning
After setting up the environment, you can fine-tune the model with the FIGA method by running:
```bash
bash bash/run_7b.sh > output.log 2>&1
```
## Acknowledgment
Please cite the following paper if you find our code or data helpful.
```
@article{guo2023beyond,
  title={Beyond imitation: Leveraging fine-grained quality signals for alignment},
  author={Guo, Geyang and Zhao, Ranchi and Tang, Tianyi and Zhao, Wayne Xin and Wen, Ji-Rong},
  journal={arXiv preprint arXiv:2311.04072},
  year={2023}
}
```