# 🤖 Unlearning_LLM
This repo contains code and data for the ACL 2024 paper "[Machine Unlearning of Pre-trained Large Language Models](https://arxiv.org/abs/2402.15159)".

[Paper](https://arxiv.org/pdf/2402.15159.pdf) | [Dataset](https://huggingface.co/datasets/llmunlearn/unlearn_dataset)
## 🌟 Abstract
This study investigates the concept of the "right to be forgotten" within the context of large language models (LLMs). We explore machine unlearning as a pivotal solution, with a focus on pre-trained models--a notably under-researched area. Our research delineates a comprehensive framework for machine unlearning in pre-trained LLMs, encompassing a critical analysis of seven diverse unlearning methods. Through rigorous evaluation using curated datasets from arXiv, books, and GitHub, we establish a robust benchmark for unlearning performance, demonstrating that these methods are over $10^5$ times more computationally efficient than retraining. Our results show that integrating gradient ascent with gradient descent on in-distribution data improves hyperparameter robustness. We also provide detailed guidelines for efficient hyperparameter tuning in the unlearning process. Our findings advance the discourse on ethical AI practices, offering substantive insights into the mechanics of machine unlearning for pre-trained LLMs and underscoring the potential for responsible AI development.
## 📊 Dataset
We collect and provide the **unlearn_dataset**, which serves as a benchmark for evaluating unlearning methodologies in pre-trained large language models across diverse domains, including arXiv and GitHub. Access our **unlearn_dataset** directly on [Hugging Face](https://huggingface.co/datasets/llmunlearn/unlearn_dataset).

### 🔍 Loading the datasets
To load the dataset:
```python
from datasets import load_dataset
dataset = load_dataset("llmunlearn/unlearn_dataset", name="arxiv", split="forget")
```
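To get a quick feel for the data, the following sketch loads every configuration/split combination listed below and prints its size and column names (it makes no assumption about the record fields themselves):
```python
from datasets import load_dataset

# Configuration names and their splits, as listed below.
configs = {
    "arxiv": ["forget", "approximate", "retain"],
    "github": ["forget", "approximate", "retain"],
    "general": ["evaluation", "retain"],
}

for name, splits in configs.items():
    for split in splits:
        ds = load_dataset("llmunlearn/unlearn_dataset", name=name, split=split)
        print(f"{name}/{split}: {len(ds)} examples, columns: {ds.column_names}")
```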
* Available configuration names and corresponding splits:
- `arxiv`: `forget, approximate, retain`
- `github`: `forget, approximate, retain`
- `general`: `evaluation, retain`

## ✈️ How to run
### Environment Setup
```
git clone https://github.com/yaojin17/Unlearning_LLM.git
cd Unlearning_LLM
conda install pytorch torchvision torchaudio cudatoolkit=11.8 -c pytorch
pip install -e .
pip install -r requirements.txt
```
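Optionally, verify that PyTorch can see your GPUs before launching the multi-GPU commands below:
```python
# Quick environment sanity check: PyTorch version, CUDA availability, GPU count.
import torch

print(torch.__version__)
print("CUDA available:", torch.cuda.is_available())
print("GPU count:", torch.cuda.device_count())
```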
### Download Yi-6B model
```
mkdir models
cd models
git lfs install
git clone https://huggingface.co/01-ai/Yi-6B
```
### Prepare tokenized datasets
```
cd utils
python save_tokenized_dataset.py --tokenizer_name_or_path ../../models/Yi-6B
python ascent_plus_descent_tokenizer.py --tokenizer_name_or_path ../../models/Yi-6B
```
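These scripts cache tokenized versions of the dataset splits for training. Conceptually, the step looks roughly like the following sketch (not the repo's actual code; the `text` column name and the sequence length are assumptions):
```python
# Conceptual sketch of dataset tokenization (not the repo's script).
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("../../models/Yi-6B")  # path as used above

raw = load_dataset("llmunlearn/unlearn_dataset", name="github", split="forget")

def tokenize(batch):
    # Assumes the raw records expose a "text" field; truncate to a fixed length.
    return tokenizer(batch["text"], truncation=True, max_length=2048)

tokenized = raw.map(tokenize, batched=True, remove_columns=raw.column_names)
tokenized.save_to_disk("tokenized_github_forget")
```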
### Unlearning experiments
Remember to replace `` in the [run_unlearn.py](llm_unlearn/run_unlearn.py#L90), [run_eval.py](llm_unlearn/run_eval.py#L84), and [run_mia.py](llm_unlearn/run_mia.py#L85) files with your own key.
```
# Make sure you are under the llm_unlearn dir
torchrun --nproc_per_node=8 --master_port=20001 run_unlearn.py \
--target_model_name_or_path ../../models/Yi-6B \
--per_device_train_batch_size 1 \
--do_unlearn \
--output_dir ./output \
--overwrite_output_dir \
--num_train_epochs 1 \
--logging_steps 1 \
--learning_rate 2e-5 \
--warmup_ratio 0.03 \
--overwrite_cache \
--save_total_limit 1 \
--fsdp "full_shard auto_wrap" \
--fsdp_transformer_layer_cls_to_wrap 'LlamaDecoderLayer' \
--bf16 True \
--tf32 True \
--weight_decay 0. \
--lr_scheduler_type "cosine" \
--domain github \
--gradient_accumulation_steps 85 \
--unlearn_method gradient_ascent
```
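For intuition, the core update behind `gradient_ascent` (and its `ascent_plus_descent` variant, listed below) can be sketched as follows. This is a conceptual illustration only, not the repo's implementation, and the equal weighting of the two loss terms is an assumption:
```python
# Conceptual sketch of the unlearning objectives (not the repo's implementation).
# gradient_ascent: maximize the causal-LM loss on the forget set,
# i.e. step along the negated loss gradient.
# ascent_plus_descent: additionally take a normal descent step on retain data.

def unlearn_step(model, optimizer, forget_batch, retain_batch=None):
    optimizer.zero_grad()
    forget_loss = model(**forget_batch, labels=forget_batch["input_ids"]).loss
    loss = -forget_loss  # gradient ascent on the forget data
    if retain_batch is not None:  # ascent_plus_descent
        retain_loss = model(**retain_batch, labels=retain_batch["input_ids"]).loss
        loss = loss + retain_loss
    loss.backward()
    optimizer.step()
    return loss.item()
```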
- Available domains with corresponding arguments:
- `--domain arxiv --gradient_accumulation_steps 60 `
- `--domain github --gradient_accumulation_steps 85 `
- Available methods with corresponding arguments:
- `--unlearn_method gradient_ascent `
- `--unlearn_method random_label --completely_random True` (named Fine-tuning with Random Labels in the paper)
- `--unlearn_method random_label --top_k 1 --rm_groundtruth True ` (named Unlearning with Adversarial Samples in the paper)
- `--unlearn_method ascent_plus_descent`
- `--unlearn_method ascent_plus_kl_divergence`
- `--unlearn_method ascent_plus_descent --general True`
- `--unlearn_method ascent_plus_kl_divergence --general True`

### Eval unlearned model
```
torchrun --nproc_per_node=8 --master_port=20001 run_eval.py \
--model_name_or_path ./output/github/Yi-6B/8_gpu_bs_1_gas_85_lr_2.0e_5_epoch1/unlearn/gradient_ascent \
--per_device_eval_batch_size 1 \
--do_eval \
--output_dir ./output/github/Yi-6B-eval \
--overwrite_output_dir \
--overwrite_cache \
--tf32 True \
--domain github
```
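As a rough illustration of what an unlearning evaluation measures (not necessarily the exact metrics `run_eval.py` reports), one common check is perplexity on the forget versus retain splits: an unlearned model should show increased perplexity on the forgotten data while staying close to the original model on retained data.
```python
# Illustrative perplexity computation over a tokenized split (not run_eval.py itself).
import math
import torch

@torch.no_grad()
def perplexity(model, dataloader, device="cuda"):
    model.eval()
    total_loss, total_batches = 0.0, 0
    for batch in dataloader:
        input_ids = batch["input_ids"].to(device)
        # Standard causal-LM loss with labels equal to the inputs.
        total_loss += model(input_ids=input_ids, labels=input_ids).loss.item()
        total_batches += 1
    return math.exp(total_loss / total_batches)
```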
### Membership inference attack
```
torchrun --nproc_per_node=8 --master_port=20001 run_mia.py \
--model_name_or_path ./output/github/Yi-6B/8_gpu_bs_1_gas_85_lr_2.0e_5_epoch1general/unlearn/ascent_plus_kl_divergence \
--per_device_eval_batch_size 1 \
--do_eval \
--output_dir ./output/arxiv/Yi-6B-mia \
--overwrite_output_dir \
--overwrite_cache \
--tf32 True \
--domain github
```

## ⭐ Citation Information
If you find this code or dataset useful, please consider citing our paper:
```bib
@article{yao2024machine,
title={Machine Unlearning of Pre-trained Large Language Models},
author={Yao, Jin and Chien, Eli and Du, Minxin and Niu, Xinyao and Wang, Tianhao and Cheng, Zezhou and Yue, Xiang},
journal={arXiv preprint arXiv:2402.15159},
year={2024}
}
```

### Contact
Feel free to reach out if you have any questions. [Jin Yao](mailto:[email protected]), [Eli Chien](mailto:[email protected]), [Xiang Yue](mailto:[email protected])