KnowUnDo
To Forget or Not? Towards Practical Knowledge Unlearning for LLMs
[GitHub](https://github.com/zjunlp/KnowUnDo)
[License: MIT](https://opensource.org/licenses/MIT)
---
🔔 Overview • 📊 Load Datasets • 🚀 How to Run • 📖 Citation
## 🔔 Overview
We provide **KnowUnDo** (EMNLP 2024 Findings), a benchmark covering copyrighted content and user privacy domains, to evaluate whether the unlearning process inadvertently erases essential knowledge. You can access **KnowUnDo** directly on [Hugging Face](https://huggingface.co/datasets/zjunlp/KnowUnDo).
To address this, we propose a simple yet effective method, **MemFlex**, which utilizes gradient information to precisely target and unlearn sensitive parameters.
## 📊 Load Datasets
You can easily load the datasets as shown below.
```python
from datasets import load_dataset
dataset = load_dataset("zjunlp/KnowUnDo", name='copyright', split='unlearn')
```
* Available configuration names and corresponding splits:
- `copyright`: `unlearn`, `retention`;
- `privacy`: `unlearn`, `retention`;
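For a quick look at the whole benchmark, the sketch below loops over every configuration and split listed above and prints their sizes and column names (illustrative only; it requires network access to Hugging Face):
```python
from datasets import load_dataset

# Iterate over every KnowUnDo configuration and split listed above.
for name in ("copyright", "privacy"):
    for split in ("unlearn", "retention"):
        ds = load_dataset("zjunlp/KnowUnDo", name=name, split=split)
        print(f"{name}/{split}: {len(ds)} examples, columns = {ds.column_names}")
```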
## 🚀 How to Run
### Environment Setup
```bash
git clone https://github.com/zjunlp/KnowUnDo.git
cd KnowUnDo
conda create -n KnowUnDo python==3.10
conda activate KnowUnDo
pip install -e .
pip install -r requirements.txt
cd llm_unlearn/apex
pip install -v --no-cache-dir ./
```
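After installation, an optional sanity check (a small sketch, assuming a CUDA-capable GPU) confirms that PyTorch sees the device and that the `apex` extension built in the last step is importable:
```python
import torch

# The unlearning scripts assume a GPU is available.
print("CUDA available:", torch.cuda.is_available())

try:
    import apex  # built from llm_unlearn/apex in the step above
    print("apex imported:", apex.__name__)
except ImportError as err:
    print("apex is not importable:", err)
```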
### Download Large Language Models (LLMs)
```bash
# directory: KnowUnDo
mkdir models
cd models
git lfs install
git clone https://huggingface.co/meta-llama/Llama-2-7b-chat-hf
git clone https://huggingface.co/Qwen/Qwen1.5-7B-Chat
```
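If you prefer not to use `git lfs`, the same checkpoints can be fetched with the `huggingface_hub` Python API (a sketch; note that Llama-2 is gated, so you need to be authenticated, e.g. via `huggingface-cli login`):
```python
from huggingface_hub import snapshot_download

# Download both chat models into the models/ directory created above.
for repo_id in ("meta-llama/Llama-2-7b-chat-hf", "Qwen/Qwen1.5-7B-Chat"):
    snapshot_download(
        repo_id=repo_id,
        local_dir=f"models/{repo_id.split('/')[-1]}",  # e.g. models/Llama-2-7b-chat-hf
    )
```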
### Pretrain LLMs in Our Setting
```bash
# directory: pretrain
bash run_finetune_lora.sh
```
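The script fine-tunes with LoRA adapters. For orientation, here is a minimal `peft` sketch of what such a setup looks like; the rank, alpha, and target modules below are illustrative assumptions, not the repository's exact configuration:
```python
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

# Load the base model downloaded in the previous step.
model = AutoModelForCausalLM.from_pretrained("models/Llama-2-7b-chat-hf")

# Attach LoRA adapters; hyperparameters here are illustrative only.
lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```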
### Knowledge Localization (Optional)
We have released the localized knowledge region. You can perform the localization yourself as follows.
```bash
# directory: pretrain
bash run_localization.sh
```
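Conceptually, gradient-based localization scores each parameter group by the gradient magnitude of the loss on the to-be-forgotten examples; the highest-scoring regions are treated as the knowledge region. A simplified sketch of that idea (not the repository's implementation) is:
```python
import torch

def localize_by_gradient(model, forget_batch, top_k=5):
    """Rank parameter groups by gradient magnitude on unlearn-split data.

    A conceptual sketch: the actual localization script may aggregate
    gradients differently and over many batches.
    """
    model.zero_grad()
    outputs = model(**forget_batch)   # batch holds input_ids, attention_mask, labels
    outputs.loss.backward()           # gradients w.r.t. the "forget" loss

    scores = {
        name: param.grad.detach().abs().mean().item()
        for name, param in model.named_parameters()
        if param.grad is not None
    }
    # The highest-scoring regions are candidates for targeted unlearning.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
```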
### Prepare Tokenized Datasets
```bash
# directory: llm_unlearn
cd utils
bash tokenize_datasets.sh
```
+ `--val` to tokenize the `val` split of the dataset.
+ `--prompt` to concatenate `direct_prompt` before the `question` in the datasets.
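As a rough illustration of what the `--prompt` option does, the sketch below prepends the `direct_prompt` field to the `question` before tokenization; the exact field handling is an assumption based on the flags above, not the script's logic:
```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("models/Llama-2-7b-chat-hf")

def tokenize_example(example, use_prompt=True):
    # With --prompt, direct_prompt is concatenated before the question.
    text = example["question"]
    if use_prompt:
        text = example["direct_prompt"] + text
    return tokenizer(text, truncation=True, max_length=512)
```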
### Unlearning Experiments
```bash
# directory: llm_unlearn
bash run_baselines_lora.sh
bash run_ours_lora.sh
```
- Available methods with corresponding arguments:
- `--unlearn_method gradient_ascent `
- `--unlearn_method random_label --completely_random True` (named Fine-tuning with Random Labels in the paper)
- `--unlearn_method random_label --top_k 1 --rm_groundtruth True` (named Unlearning with Adversarial Samples in the paper)
- `--unlearn_method ascent_plus_descent`
- `--unlearn_method ascent_plus_kl_divergence`
- `--unlearn_method ascent_plus_descent --general True`
- `--unlearn_method ascent_plus_kl_divergence --general True`
- `--unlearn_method memflex` (**MemFlex**, the method proposed in our paper)
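For intuition, the simplest of these methods, gradient ascent, just maximizes the language-modeling loss on the forget set, and MemFlex additionally restricts the update to the localized sensitive parameters. A hedged sketch of both ideas (illustrative only, not the repository's training loop):
```python
import torch

def unlearn_step(model, optimizer, forget_batch, sensitive_params=None):
    """One gradient-ascent unlearning step.

    sensitive_params: optional set of parameter names to update (a rough
    stand-in for MemFlex-style localized updates); None updates everything.
    """
    model.zero_grad()
    outputs = model(**forget_batch)
    loss = -outputs.loss          # negate the LM loss => gradient ascent on the forget set
    loss.backward()

    if sensitive_params is not None:
        # Zero out gradients outside the localized region so that only the
        # sensitive parameters are modified.
        for name, param in model.named_parameters():
            if name not in sensitive_params and param.grad is not None:
                param.grad.zero_()

    optimizer.step()
    return -loss.item()           # the (positive) LM loss on the forget batch
```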
### Eval Unlearned Model
You can evaluate multiple unlearned models together by running our script **only once**.
```bash
# directory: llm_unlearn
bash run_eval_baselines_lora.sh
```
+ `--direct_prompt=True` concatenates `direct_prompt` before the `question` in the datasets.
## 🎉 Acknowledgement
We would like to express our sincere gratitude to the excellent works [Unlearning LLM](https://github.com/yaojin17/Unlearning_LLM), [TOFU](https://github.com/locuslab/tofu), [LLaMA](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf), and [Qwen](https://github.com/QwenLM/Qwen2?tab=readme-ov-file).
## 📖 Citation
If you use or extend our work, please cite the paper as follows:
```bibtex
@article{tian2024forget,
  title={To Forget or Not? Towards Practical Knowledge Unlearning for Large Language Models},
  author={Tian, Bozhong and Liang, Xiaozhuan and Cheng, Siyuan and Liu, Qingbin and Wang, Mengru and Sui, Dianbo and Chen, Xi and Chen, Huajun and Zhang, Ningyu},
  journal={arXiv preprint arXiv:2407.01920},
  year={2024}
}
```