https://github.com/zhliu0106/learning-to-refuse
Official Implementation of "Learning to Refuse: Towards Mitigating Privacy Risks in LLMs"
- Host: GitHub
- URL: https://github.com/zhliu0106/learning-to-refuse
- Owner: zhliu0106
- Created: 2024-07-14T02:45:39.000Z (10 months ago)
- Default Branch: main
- Last Pushed: 2024-12-13T09:17:55.000Z (5 months ago)
- Last Synced: 2024-12-13T10:19:37.734Z (5 months ago)
- Language: Python
- Size: 8.26 MB
- Stars: 8
- Watchers: 1
- Forks: 4
- Open Issues: 0
# learning-to-refuse
Official Implementation of [Learning to Refuse: Towards Mitigating Privacy Risks in LLMs](https://arxiv.org/abs/2407.10058)

## RETURN: Real-world pErsonal daTa UnleaRNing dataset
RETURN is available in `data/RETURN.jsonl`. You can also access RETURN directly on [Hugging Face](https://huggingface.co/datasets/zhliu/RETURN).
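To read the local copy without extra dependencies, a minimal sketch using only the standard library could look like the following. It assumes the usual JSON Lines layout (one JSON object per line) and that it is run from the repository root. The helper name `load_jsonl` is hypothetical, and no field names are assumed: the sketch prints the keys of the first record so the schema can be inspected.

```python
import json
from pathlib import Path


def load_jsonl(path):
    """Read a JSON Lines file into a list of Python objects (one per line)."""
    records = []
    with Path(path).open(encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines
                records.append(json.loads(line))
    return records


if __name__ == "__main__":
    # Path taken from this README; adjust if running from another directory.
    path = Path("data/RETURN.jsonl")
    if path.exists():
        records = load_jsonl(path)
        print(f"{len(records)} records loaded")
        if records and isinstance(records[0], dict):
            # Inspect the schema rather than assuming field names.
            print(sorted(records[0].keys()))
```

Loading from the Hugging Face Hub instead: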
```python
from datasets import load_dataset

dataset = load_dataset("zhliu/RETURN")
```

## Reproduction
### Environment Setup
```shell
# Clone the repository
git clone git@github.com:zhliu0106/learning-to-refuse.git
cd learning-to-refuse

# Create and activate conda environment
conda create -n refuse python==3.10
conda activate refuse

# Install dependencies
pip install -r requirements.txt
```

### Data Preprocessing
```shell
bash scripts/data_process.sh
```

### Training and Evaluation
```shell
bash scripts/run.sh
```

**Note:** Due to differences in hardware environments and random seed settings, there might be slight variations in the experimental results.
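How the provided scripts handle seeding is not shown here. As a generic illustration of what fixing random seeds usually involves, the sketch below seeds Python's `random` module and, only if they are installed, NumPy and PyTorch. The function name `set_seed` is hypothetical and is not taken from this repository. Even with fixed seeds, some GPU operations may remain nondeterministic, so small differences across machines are still possible.

```python
import random


def set_seed(seed: int) -> None:
    """Seed common RNG sources; NumPy and PyTorch are seeded only if installed."""
    random.seed(seed)
    try:
        import numpy as np

        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed; nothing to seed
    try:
        import torch

        torch.manual_seed(seed)
        if torch.cuda.is_available():
            torch.cuda.manual_seed_all(seed)
    except ImportError:
        pass  # PyTorch not installed; nothing to seed


# Re-seeding with the same value reproduces the same random draw.
set_seed(42)
first = random.random()
set_seed(42)
second = random.random()
print(first == second)
```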
## Citation
```bibtex
@article{liu2024learning,
title={Learning to refuse: Towards mitigating privacy risks in llms},
author={Liu, Zhenhua and Zhu, Tong and Tan, Chuanyuan and Chen, Wenliang},
journal={arXiv preprint arXiv:2407.10058},
year={2024}
}
```