https://github.com/molereddy/alternate-preference-optimization

[COLING 2025] Official implementation for "Alternate Preference Optimization for Unlearning Factual Knowledge in Large Language Models".

Host: GitHub
URL: https://github.com/molereddy/alternate-preference-optimization
Owner: molereddy
Created: 2024-12-12T20:41:31.000Z (over 1 year ago)
Default Branch: main
Last Pushed: 2025-01-14T03:21:53.000Z (over 1 year ago)
Last Synced: 2025-07-20T15:40:05.200Z (11 months ago)
Topics: large-language-models, unlearning
Language: Python
Homepage: https://arxiv.org/abs/2409.13474
Size: 4 MB
Stars: 4
Watchers: 2
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

# Alternate Preference Optimization for Unlearning Knowledge

Implementation for ["Alternate Preference Optimization for Unlearning Factual Knowledge in Large Language Models", COLING 2025](https://arxiv.org/abs/2409.13474).

![SVG Image](./assets/AltPO.svg)

All our experiments rely on [TOFU](https://github.com/locuslab/tofu) checkpoints and eval logs (in the `data` folder). For Llama3.2, we train our own models with the parameters given in the paths and configs.

## Installation

```script
conda create -n tofu python=3.12
conda activate tofu
pip install -r requirements.txt
pip install flash-attn --no-build-isolation
```
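After installing, it can help to confirm that the key packages are visible from the `tofu` environment. The check below is a minimal sketch: the package list is an assumption (`torch` and `transformers` are presumed to come from `requirements.txt`, `flash_attn` from the separate install), and `missing_packages` is a hypothetical helper, not part of this repository.

```python
import importlib.util

# Assumed package names; adjust to match requirements.txt.
REQUIRED = ("torch", "transformers", "flash_attn")


def missing_packages(names=REQUIRED):
    """Return the top-level package names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    print("missing packages:", ", ".join(missing) if missing else "none")
```

An empty result means every listed package is discoverable; it does not verify that `flash_attn` was built against the installed CUDA and PyTorch versions.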

## Quick Start

### Generate Alternate Dataset

```script
python generate.py dataset_config.dataset_kwargs.name=forget10
python generate.py dataset_config.dataset_kwargs.name=forget05
python generate.py dataset_config.dataset_kwargs.name=forget01
```
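The three invocations differ only in the forget split name, so they can be driven from a small loop. This is a sketch under the assumption that it is run from the repository root; `generate_command` and `run_all` are hypothetical helpers, and only the override key and split names are taken from the commands above.

```python
import subprocess
import sys

# Split names as they appear in the commands above.
SPLITS = ("forget10", "forget05", "forget01")


def generate_command(split):
    """Build the generate.py invocation for one forget split."""
    return [sys.executable, "generate.py", f"dataset_config.dataset_kwargs.name={split}"]


def run_all(splits=SPLITS, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for split in splits:
        cmd = generate_command(split)
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops at the first split that fails.
            subprocess.run(cmd, check=True)
```

Calling `run_all()` only prints the commands; pass `dry_run=False` to actually run them.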

### AltPO

```script
python forget.py --config-name=unlearn_llama2.yaml forget_loss=subdpo beta=0.1 retain_wt=1 seed=0 lr=5e-05 num_epochs=2 augment_k=5 batch_size=5
```
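To give a feel for what `beta` controls, here is a sketch of the standard DPO loss for a single preference pair. It is an illustration only, not the repository's implementation: it assumes that `forget_loss=subdpo` follows the usual DPO form with a generated alternate answer as the preferred response and the original answer as the rejected one. The `retain_wt` flag suggests a separately weighted retain term, which this sketch omits; see the paper for the actual objective.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one pair of sequence log-probabilities.

    Assumption for illustration: "chosen" is the alternate answer and
    "rejected" is the original answer to be forgotten.
    """
    margin = beta * ((policy_chosen_logp - ref_chosen_logp)
                     - (policy_rejected_logp - ref_rejected_logp))
    # -log(sigmoid(margin)) equals softplus(-margin); compute it stably.
    if margin > 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))
```

When the policy matches the reference model the margin is zero and the loss is log 2; raising the alternate answer's likelihood relative to the original answer lowers it.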

Unlearned model weights for Llama2-7b are available [here](https://huggingface.co/Dornavineeth/Llama2-7b-tofu_unlearn-forget10-altpo) and can be loaded as follows:

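```python
import os

import torch
from transformers import AutoModelForCausalLM

# Loading options; flash_attention_2 needs the flash-attn package from Installation.
model_kwargs = {
    'attn_implementation': 'flash_attention_2',
    'torch_dtype': torch.bfloat16,
    'trust_remote_code': True,
    # Expand "~" so the fallback cache path resolves to the home directory.
    'cache_dir': os.path.expanduser(os.getenv('HF_HOME', '~/.cache/huggingface')),
}
model_path = "Dornavineeth/Llama2-7b-tofu_unlearn-altpo"
model = AutoModelForCausalLM.from_pretrained(model_path, **model_kwargs)
```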

### NPO

```script
python forget.py --config-name=unlearn_llama2.yaml forget_loss=npo beta=0.05 retain_wt=2 seed=0 lr=2e-05 num_epochs=10 batch_size=5
```
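For reference, the sketch below shows the per-example loss from Negative Preference Optimization as it is commonly written, assuming that is what `forget_loss=npo` refers to. It is not taken from this repository, and it again leaves out any retain term implied by `retain_wt`.

```python
import math


def npo_loss(policy_logp, ref_logp, beta=0.05):
    """NPO loss for one forget example: (2 / beta) * log(1 + (pi_theta / pi_ref) ** beta).

    Inputs are sequence log-probabilities of the answer to be forgotten
    under the current policy and the frozen reference model.
    """
    z = beta * (policy_logp - ref_logp)
    # softplus(z) = log(1 + exp(z)), computed stably for either sign of z.
    softplus = z + math.log1p(math.exp(-z)) if z > 0 else math.log1p(math.exp(z))
    return (2.0 / beta) * softplus
```

The loss falls as the policy assigns lower likelihood to the forget answer than the reference model does, and it is bounded below by zero.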

### IdkDPO

```script
python forget.py --config-name=unlearn_llama2.yaml forget_loss=idkdpo beta=0.1 retain_wt=1 seed=0 lr=2e-05 num_epochs=10 batch_size=5
```

You can find the stored results in `paper_models////`
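The three `forget.py` commands above share a config file and differ only in their overrides, so they can be kept in one table and rendered on demand. This is a convenience sketch: `RUNS` and `forget_command` are hypothetical names, while the override values are copied from the commands above.

```python
import shlex

# Overrides per method, copied from the Quick Start commands.
RUNS = {
    "AltPO": {"forget_loss": "subdpo", "beta": "0.1", "retain_wt": "1", "seed": "0",
              "lr": "5e-05", "num_epochs": "2", "augment_k": "5", "batch_size": "5"},
    "NPO": {"forget_loss": "npo", "beta": "0.05", "retain_wt": "2", "seed": "0",
            "lr": "2e-05", "num_epochs": "10", "batch_size": "5"},
    "IdkDPO": {"forget_loss": "idkdpo", "beta": "0.1", "retain_wt": "1", "seed": "0",
               "lr": "2e-05", "num_epochs": "10", "batch_size": "5"},
}


def forget_command(method, config_name="unlearn_llama2.yaml"):
    """Return the forget.py argument list for one unlearning method."""
    overrides = [f"{key}={value}" for key, value in RUNS[method].items()]
    return ["python", "forget.py", f"--config-name={config_name}", *overrides]


if __name__ == "__main__":
    for method in RUNS:
        print(shlex.join(forget_command(method)))
```

Keeping the values as strings preserves the exact spelling used above (for example `5e-05`), so the rendered commands match the originals.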

## Citing Our Work

If you find this repository or our method beneficial, please cite our work:

```
@article{mekala2024alternate,
  title={Alternate preference optimization for unlearning factual knowledge in large language models},
  author={Mekala, Anmol and Dorna, Vineeth and Dubey, Shreya and Lalwani, Abhishek and Koleczek, David and Rungta, Mukund and Hasan, Sadid and Lobo, Elita},
  journal={arXiv preprint arXiv:2409.13474},
  year={2024}
}
```
