{"id":13458879,"url":"https://github.com/UCSB-NLP-Chang/causal_unlearn","last_synced_at":"2025-03-24T16:31:06.326Z","repository":{"id":251008100,"uuid":"832274468","full_name":"UCSB-NLP-Chang/causal_unlearn","owner":"UCSB-NLP-Chang","description":"[EMNLP 2024] \"Revisiting Who's Harry Potter: Towards Targeted Unlearning from a Causal Intervention Perspective\"","archived":false,"fork":false,"pushed_at":"2024-07-22T17:19:52.000Z","size":1548,"stargazers_count":12,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2024-12-01T03:36:11.534Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/UCSB-NLP-Chang.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-07-22T17:17:47.000Z","updated_at":"2024-11-28T10:53:19.000Z","dependencies_parsed_at":"2024-07-31T09:21:44.528Z","dependency_job_id":null,"html_url":"https://github.com/UCSB-NLP-Chang/causal_unlearn","commit_stats":null,"previous_names":["ucsb-nlp-chang/causal_unlearn"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/UCSB-NLP-Chang%2Fcausal_unlearn","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/UCSB-NLP-Chang%2Fcausal_unlearn/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/UCSB-NLP-Chang%2Fcausal_unlearn/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/UCSB-NLP-Chang%2Fcausa
l_unlearn/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/UCSB-NLP-Chang","download_url":"https://codeload.github.com/UCSB-NLP-Chang/causal_unlearn/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":245308474,"owners_count":20594257,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-07-31T09:00:59.219Z","updated_at":"2025-03-24T16:31:05.740Z","avatar_url":"https://github.com/UCSB-NLP-Chang.png","language":"Python","funding_links":[],"categories":["Papers"],"sub_categories":["Methods"],"readme":"# Revisiting *Who's Harry Potter*: Towards Targeted Unlearning from a Causal Intervention Perspective\n\nThis is the official implementation for the paper **Revisiting *Who's Harry Potter*: Towards Targeted Unlearning from a Causal Intervention Perspective**.\n\nWe introduce the *targeted unlearning* task for LLMs, where given an unlearning target (*e.g.,* a person) and some unlearning documents (*e.g.,* corresponding Wikipedia article), we aim to unlearn **only** the information about the target, rather than everything in the unlearning documents.\n\nBelow is an example of the targeted unlearning task, where we aim to unlearn the place of birth of *Wilhelm Wattenbach* but do not want to forget the city of *Rantzau*.\n\n\u003cimg src=\"assets/example.png\" width=600px\u003e\n\n## Quick Links\n- [**Dataset on Hugging Face**](https://huggingface.co/datasets/Shiyu-Lab/Wikipedia_Person_Unlearn): Link to download our newly created **WPU** dataset for targeted unlearning.\n\n## 
Installation\n\n```\nconda create -n unlearn python=3.10\nconda activate unlearn\nconda install pytorch==2.2.0 pytorch-cuda=11.8 -c pytorch -c nvidia\nconda install -c \"nvidia/label/cuda-11.8.0\" cuda-toolkit\npip install -r requirements.txt\npip install flash-attn==2.5.3 --no-build-isolation\npython -m spacy download en_core_web_sm\n```\n\nSet your OpenAI API key for GPT evaluations:\n```\nexport OPENAI_API_KEY=${apikey}\n```\n\n## Usage\n\n### WPU Dataset\nTo load the dataset, use the following code:\n\n```python\nfrom datasets import load_dataset\nds = load_dataset(\"Shiyu-Lab/Wikipedia_Person_Unlearn\", \"forget_100\")\n```\nThe possible splits are:\n- `forget_${n}_${index}`: Contains the unlearning documents and QA pairs to evaluate unlearned models (should forget). `n` is the number of persons to unlearn (choose from {2, 20, 100}); `index` is the index of the split. We have multiple sets of persons to unlearn for `n=[2|20]`. Ignore `index` for `n=100`.\n- `forget_${n}_${index}_hard_retain`: Contains the QA pairs to evaluate unlearned models on entities that are closely related to the unlearning targets (should not forget).\n- `general_retain`: Contains the QA pairs to evaluate unlearned models on a set of popular persons (should not forget).\n- `retain`: Contains Wikipedia articles of 100 persons used for retain loss (preserve model utility).\n\n### WPU Experiments\nTo run experiments on WPU, use\n\n```\nbash scripts/wpu.sh -f intervention -s ${save_dir_root} -g 0,1 -n 20\n```\nThe four arguments are:\n- `-f forget_loss`: Which forget loss to use. Set to `intervention` for our method. Other choices are `{whp, npo, grad_diff}`.\n- `-s save_dir_root`: Directory to save trained models and evaluation results.\n- `-g gpu_ids`: GPU ids to use. Our experiments use 2 GPUs.\n- `-n num_distribution`: Number of distributions to aggregate for our method. 
Use 20 on WPU and 1 on TOFU.\n\nPlease refer to `scripts/wpu.sh` for details of each training and evaluation step.\n\nThe final results are summarized in `${save_dir_root}/meta-llama/Llama-2-7b-chat-hf/${forget_loss}/${num_distribution}_${seed}_${split}/checkpoint-${i}/aggregate_stat.csv`\n\n### TOFU Experiments\nTo run experiments on TOFU, use\n\n```\nbash scripts/tofu.sh -f intervention -s ${save_dir_root} -g 0,1 -n 1\n```\nPlease refer to `scripts/tofu.sh` for details of each training and evaluation step.\n\nThe final results are summarized in `${save_dir_root}/locuslab/tofu_ft_llama2-7b/${forget_loss}/${num_distribution}_${seed}_${split}/checkpoint-${i}/eval/aggregate_stat.csv`\n\n## Acknowledgement\nOur implementation is based on the following repos:\n* [https://github.com/locuslab/tofu](https://github.com/locuslab/tofu)\n* [https://github.com/licong-lin/negative-preference-optimization](https://github.com/licong-lin/negative-preference-optimization)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FUCSB-NLP-Chang%2Fcausal_unlearn","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FUCSB-NLP-Chang%2Fcausal_unlearn","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FUCSB-NLP-Chang%2Fcausal_unlearn/lists"}