https://github.com/molereddy/alternate-preference-optimization
[COLING 2025] Official implementation for "Alternate Preference Optimization for Unlearning Factual Knowledge in Large Language Models".
large-language-models unlearning
- Host: GitHub
- URL: https://github.com/molereddy/alternate-preference-optimization
- Owner: molereddy
- Created: 2024-12-12T20:41:31.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-01-14T03:21:53.000Z (over 1 year ago)
- Last Synced: 2025-07-20T15:40:05.200Z (11 months ago)
- Topics: large-language-models, unlearning
- Language: Python
- Homepage: https://arxiv.org/abs/2409.13474
- Size: 4 MB
- Stars: 4
- Watchers: 2
- Forks: 0
- Open Issues: 0
# Alternate Preference Optimization for Unlearning Knowledge
Implementation for ["Alternate Preference Optimization for Unlearning Factual Knowledge in Large Language Models"](https://arxiv.org/abs/2409.13474) (COLING 2025).

All our experiments rely on [TOFU](https://github.com/locuslab/tofu) checkpoints and eval logs (in the `data` folder). For Llama3.2, we train our own models with the parameters given in the paths and configs.
## Installation
```script
conda create -n tofu python=3.12
conda activate tofu
pip install -r requirements.txt
pip install flash-attn --no-build-isolation
```
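After installing, it can help to confirm that the key packages are visible in the active environment before starting a long run. The snippet below is a minimal sketch of such a check. The module list is an assumption based on the commands and code in this README, not the contents of `requirements.txt`, and `missing_modules` is a hypothetical helper that is not part of this repository.

```python
from importlib.util import find_spec

# Assumed top-level module names (hypothetical list); requirements.txt may pin others.
REQUIRED_MODULES = ["torch", "transformers", "flash_attn"]


def missing_modules(names):
    """Return the names that cannot be located in the current environment."""
    # find_spec only locates a module; it does not import or execute it.
    return [name for name in names if find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Not found:", ", ".join(missing))
else:
    print("All checked modules were found.")
```

A module being found does not guarantee that it imports cleanly (for example, `flash_attn` may still fail on an incompatible CUDA setup), so treat this only as a first sanity check.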
## Quick Start
### Generate Alternate Dataset
```script
python generate.py dataset_config.dataset_kwargs.name=forget10
python generate.py dataset_config.dataset_kwargs.name=forget05
python generate.py dataset_config.dataset_kwargs.name=forget01
```
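The three commands above differ only in the forget split name, so they can also be driven from a small loop. The following is a sketch under the assumption that it is run from the repository root where `generate.py` lives. `generate_commands` and `run_all` are hypothetical helpers, not part of this repository.

```python
import subprocess
import sys

# Forget splits taken from the commands above.
FORGET_SPLITS = ["forget10", "forget05", "forget01"]


def generate_commands(splits=FORGET_SPLITS):
    """Build one generate.py invocation (as an argument list) per split."""
    return [
        [sys.executable, "generate.py", f"dataset_config.dataset_kwargs.name={split}"]
        for split in splits
    ]


def run_all(splits=FORGET_SPLITS):
    """Run the invocations one after another, stopping at the first failure."""
    for cmd in generate_commands(splits):
        subprocess.run(cmd, check=True)
```

Calling `run_all()` executes the three generations sequentially; `check=True` raises `subprocess.CalledProcessError` if any of them exits with a non-zero status.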
### AltPO
```script
python forget.py --config-name=unlearn_llama2.yaml forget_loss=subdpo beta=0.1 retain_wt=1 seed=0 lr=5e-05 num_epochs=2 augment_k=5 batch_size=5
```
Unlearned model weights for Llama2-7b can be found [here](https://huggingface.co/Dornavineeth/Llama2-7b-tofu_unlearn-forget10-altpo). They can be loaded with `transformers`:
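```script
import os

import torch
from transformers import AutoModelForCausalLM

model_kwargs = {
    'attn_implementation': 'flash_attention_2',  # requires the flash-attn package
    'torch_dtype': torch.bfloat16,
    'trust_remote_code': True,
    # Expand "~" explicitly; os.getenv does not expand the fallback path
    'cache_dir': os.path.expanduser(os.getenv('HF_HOME', '~/.cache/huggingface')),
}
model_path = "Dornavineeth/Llama2-7b-tofu_unlearn-altpo"
model = AutoModelForCausalLM.from_pretrained(model_path, **model_kwargs)
```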
### NPO
```script
python forget.py --config-name=unlearn_llama2.yaml forget_loss=npo beta=0.05 retain_wt=2 seed=0 lr=2e-05 num_epochs=10 batch_size=5
```
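For readers unfamiliar with the `beta` override, the sketch below shows the NPO forget objective as it is commonly stated in the literature: the mean of `-(2 / beta) * log sigmoid(-beta * (log pi_theta - log pi_ref))` over forget samples. This is an illustration in plain Python, not code from this repository. The function names and the use of per-sequence log-probabilities as inputs are assumptions, and the retain term weighted by `retain_wt` is left out.

```python
import math


def log_sigmoid(z):
    """Numerically stable log(sigmoid(z))."""
    if z >= 0:
        return -math.log1p(math.exp(-z))
    return z - math.log1p(math.exp(z))


def npo_loss(policy_logps, ref_logps, beta=0.05):
    """Mean NPO loss over forget samples.

    policy_logps / ref_logps: sequence log-probabilities of the forget answers
    under the model being unlearned and under the frozen reference model.
    """
    losses = [
        -(2.0 / beta) * log_sigmoid(-beta * (p - r))
        for p, r in zip(policy_logps, ref_logps)
    ]
    return sum(losses) / len(losses)
```

The loss decreases as the model assigns lower probability to the forget answers relative to the reference model, and `beta` controls how quickly it saturates.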
### IdkDPO
```script
python forget.py --config-name=unlearn_llama2.yaml forget_loss=idkdpo beta=0.1 retain_wt=1 seed=0 lr=2e-05 num_epochs=10 batch_size=5
```
You can find the stored results in `paper_models////`
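The three `forget.py` invocations above share the same config file and differ only in their `key=value` overrides. When sweeping seeds or learning rates, it may be convenient to build them from presets. The sketch below only constructs the command strings; the preset values are copied from the commands in this README, while `PRESETS` and `forget_command` are hypothetical names that do not exist in this repository.

```python
import shlex

# Override values copied from the AltPO, NPO and IdkDPO commands above.
PRESETS = {
    "altpo": {"forget_loss": "subdpo", "beta": "0.1", "retain_wt": "1", "seed": "0",
              "lr": "5e-05", "num_epochs": "2", "augment_k": "5", "batch_size": "5"},
    "npo": {"forget_loss": "npo", "beta": "0.05", "retain_wt": "2", "seed": "0",
            "lr": "2e-05", "num_epochs": "10", "batch_size": "5"},
    "idkdpo": {"forget_loss": "idkdpo", "beta": "0.1", "retain_wt": "1", "seed": "0",
               "lr": "2e-05", "num_epochs": "10", "batch_size": "5"},
}


def forget_command(method, config_name="unlearn_llama2.yaml", **overrides):
    """Return the forget.py command line for a preset, with optional overrides."""
    params = dict(PRESETS[method])
    # Overrides replace preset values in place; new keys are appended at the end.
    params.update({key: str(value) for key, value in overrides.items()})
    args = ["python", "forget.py", f"--config-name={config_name}"]
    args += [f"{key}={value}" for key, value in params.items()]
    return shlex.join(args)


print(forget_command("altpo", seed=1))
```

For example, `forget_command("npo")` reproduces the NPO command shown above, and passing `seed=1` changes only the seed override.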
## Citing Our Work
If you find this repository or our method beneficial, please cite our work:
```
@article{mekala2024alternate,
title={Alternate preference optimization for unlearning factual knowledge in large language models},
author={Mekala, Anmol and Dorna, Vineeth and Dubey, Shreya and Lalwani, Abhishek and Koleczek, David and Rungta, Mukund and Hasan, Sadid and Lobo, Elita},
journal={arXiv preprint arXiv:2409.13474},
year={2024}
}
```