To Forget or Not? Towards Practical Knowledge Unlearning for Large Language Models
- Host: GitHub
- URL: https://github.com/zjunlp/KnowUnDo
- Owner: zjunlp
- License: mit
- Created: 2024-06-18T06:46:45.000Z (11 months ago)
- Default Branch: main
- Last Pushed: 2024-08-20T06:28:15.000Z (9 months ago)
- Last Synced: 2024-08-21T07:59:39.946Z (9 months ago)
- Topics: artificial-intelligence, benchmark, dataset, knowledge-editing, knowledge-unlearning, knowundo, large-language-models, localization, memflex, model-editing, natural-language-processing, unlearning
- Language: Python
- Homepage:
- Size: 1.64 MB
- Stars: 15
- Watchers: 5
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-llm-unlearning - GitHub
README
# KnowUnDo
To Forget or Not? Towards Practical Knowledge Unlearning for LLMs
[GitHub Repository](https://github.com/zjunlp/KnowUnDo) · [MIT License](https://opensource.org/licenses/MIT)

---
🔔 Overview • 📊 Load Datasets • 🚀 How to Run • 📖 Citation

## 🔔 Overview
We provide **KnowUnDo** (EMNLP 2025 Findings), a benchmark covering copyrighted content and user privacy domains, designed to evaluate whether the unlearning process inadvertently erases essential knowledge. Access **KnowUnDo** directly on [Hugging Face](https://huggingface.co/datasets/zjunlp/KnowUnDo).
To address this, we propose a simple yet effective method, **MemFlex**, which utilizes gradient information to precisely target and unlearn sensitive parameters.
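The core intuition behind gradient-based localization is to compare, per parameter, the gradient of a loss on the forget data against the gradient of a loss on the retain data, and to flag parameters that matter for the former but not the latter. The following is a minimal, hypothetical sketch of that idea, not the repository's actual MemFlex implementation; the thresholds, loss inputs, and function name are assumptions for illustration:

```python
import torch

def localize_sensitive_params(model, forget_loss, retain_loss,
                              forget_thresh=1e-3, retain_thresh=1e-4):
    """Flag parameters whose gradients are large for the forget objective
    but small for the retain objective (thresholds are illustrative)."""
    names, params = zip(*[(n, p) for n, p in model.named_parameters()
                          if p.requires_grad])

    # Per-parameter gradients for each objective; retain_graph keeps the
    # forget graph alive in case both losses share activations.
    forget_grads = torch.autograd.grad(forget_loss, params,
                                       retain_graph=True, allow_unused=True)
    retain_grads = torch.autograd.grad(retain_loss, params, allow_unused=True)

    masks = {}
    for name, g_f, g_r in zip(names, forget_grads, retain_grads):
        if g_f is None or g_r is None:
            continue  # parameter not involved in one of the losses
        # "Sensitive region": strongly tied to the knowledge being removed,
        # weakly tied to the knowledge that should be kept.
        masks[name] = (g_f.abs() > forget_thresh) & (g_r.abs() < retain_thresh)
    return masks
```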
## 📊 Load Datasets
You can easily load the datasets as shown below.
```python
from datasets import load_dataset

dataset = load_dataset("zjunlp/KnowUnDo", name='copyright', split='unlearn')
```
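If you want to pull every configuration and split at once (the available names are listed just below), a small loop along these lines works; this is a convenience sketch, not part of the repository:

```python
from datasets import load_dataset

# Configuration names and splits provided by KnowUnDo (see list below).
configs = {"copyright": ["unlearn", "retention"],
           "privacy": ["unlearn", "retention"]}

datasets = {
    (name, split): load_dataset("zjunlp/KnowUnDo", name=name, split=split)
    for name, splits in configs.items()
    for split in splits
}
print({key: len(ds) for key, ds in datasets.items()})
```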
* Available configuration names and corresponding splits:
- `copyright`: `unlearn`, `retention`;
- `privacy`: `unlearn`, `retention`;
## 🚀 How to Run
### Environment Setup
```bash
git clone https://github.com/zjunlp/KnowUnDo.git
cd KnowUnDo
conda create -n KnowUnDo python==3.10
conda activate KnowUnDo
pip install -e .
pip install -r requirements.txt
cd llm_unlearn/apex
pip install -v --no-cache-dir ./
```
### Download Large Language Models (LLMs)
```bash
# directory: KnowUnDo
mkdir models
cd models
git lfs install
git clone https://huggingface.co/meta-llama/Llama-2-7b-chat-hf
git clone https://huggingface.co/Qwen/Qwen1.5-7B-Chat
```
### Pretrain LLMs in Our Setting
```bash
# directory: pretrain
bash run_finetune_lora.sh
```
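The `run_finetune_lora.sh` script fine-tunes the downloaded base models with LoRA adapters. For orientation, a LoRA setup with the `peft` library looks roughly like the sketch below; the hyperparameters and target modules are illustrative assumptions, not the script's actual configuration:

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Load one of the base models downloaded into models/ above.
model = AutoModelForCausalLM.from_pretrained("models/Llama-2-7b-chat-hf")

lora_config = LoraConfig(
    r=8,                                   # illustrative rank
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],   # illustrative module choice
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```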
### Knowledge Localization (Optional)
We have released the localized knowledge region. You can perform the localization yourself as follows.
```bash
# directory: pretrain
bash run_localization.sh
```
### Prepare tokenized datasets
```bash
# directory: llm_unlearn
cd utils
bash tokenize_datasets.sh
```
+ `--val` for the `val` split of the dataset.
+ `--prompt` for concatenating `direct_prompt` before the `question` in the datasets.
### Unlearning Experiments
```bash
# directory: llm_unlearn
bash run_baselines_lora.sh
bash run_ours_lora.sh
```
- Available methods with corresponding arguments:
- `--unlearn_method gradient_ascent`
- `--unlearn_method random_label --completely_random True` (named Fine-tuning with Random Labels in the paper)
- `--unlearn_method random_label --top_k 1 --rm_groundtruth True` (named Unlearning with Adversarial Samples in the paper)
- `--unlearn_method ascent_plus_descent`
- `--unlearn_method ascent_plus_kl_divergence`
- `--unlearn_method ascent_plus_descent --general True`
- `--unlearn_method ascent_plus_kl_divergence --general True`
- `--unlearn_method memflex` (the strong baseline proposed by us)
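For intuition, the `ascent_plus_descent` variants listed above combine gradient ascent on the forget split with ordinary descent on the retain split. A hypothetical sketch of one training step under that combined objective follows; the weighting, function name, and the Hugging Face-style `model(**batch).loss` interface are assumptions, not the scripts' actual code:

```python
def ascent_plus_descent_step(model, optimizer, forget_batch, retain_batch,
                             alpha=1.0):
    """One update: rise on the forget loss, descend on the retain loss."""
    optimizer.zero_grad()

    # Standard LM loss on examples whose knowledge should be kept.
    retain_loss = model(**retain_batch).loss
    # LM loss on examples to unlearn; subtracting it performs gradient ascent.
    forget_loss = model(**forget_batch).loss

    loss = retain_loss - alpha * forget_loss
    loss.backward()
    optimizer.step()
    return retain_loss.item(), forget_loss.item()
```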
### Eval Unlearned Models
You can evaluate multiple unlearned models together by running our script **only once**.
```bash
# directory: llm_unlearn
bash run_eval_baselines_lora.sh
```
+ `--direct_prompt=True` means concatenating `direct_prompt` before the `question` in the datasets.
## 🎉 Acknowledgement
We would like to express our sincere gratitude to the excellent works [Unlearning LLM](https://github.com/yaojin17/Unlearning_LLM), [TOFU](https://github.com/locuslab/tofu), [LLaMA](https://huggingface.co/meta-llama/Llama-2-7b-chat-hf), and [Qwen](https://github.com/QwenLM/Qwen2?tab=readme-ov-file).
## 📖 Citation
If you use or extend our work, please cite the paper as follows:
```bibtex
@article{tian2024forget,
  title={To Forget or Not? Towards Practical Knowledge Unlearning for Large Language Models},
  author={Tian, Bozhong and Liang, Xiaozhuan and Cheng, Siyuan and Liu, Qingbin and Wang, Mengru and Sui, Dianbo and Chen, Xi and Chen, Huajun and Zhang, Ningyu},
  journal={arXiv preprint arXiv:2407.01920},
  year={2024}
}
```