https://github.com/jjbrophy47/machine_unlearning
Existing Literature about Machine Unlearning
data-deletion data-removal machine-unlearning
- Host: GitHub
- URL: https://github.com/jjbrophy47/machine_unlearning
- Owner: jjbrophy47
- Created: 2020-09-11T20:25:56.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2024-07-27T19:42:09.000Z (almost 2 years ago)
- Last Synced: 2024-07-28T01:42:00.044Z (almost 2 years ago)
- Topics: data-deletion, data-removal, machine-unlearning
- Homepage:
- Size: 735 KB
- Stars: 730
- Watchers: 25
- Forks: 90
- Open Issues: 0
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
- awesome-privacy-engineering - Machine Unlearning - A compilation of existing literature about machine unlearning, a process through which a machine learning model can be made to forget one of its training data points. (Awesome Privacy Engineering / Machine Learning and Algorithmic Bias)
README
# Machine Unlearning Papers and Benchmarks


## Frameworks
- [OpenUnlearning](https://github.com/locuslab/open-unlearning)
- [Machine Unlearning Comparator](https://github.com/gnueaj/Machine-Unlearning-Comparator)
## Papers
- [2025](#2025)
- [2024](#2024)
- [2023](#2023)
- [2022](#2022)
- [2021](#2021)
- [2020](#2020)
- [2019](#2019)
- [2018](#2018)
- [2017](#2017)
- [< 2017](#before-2017)
### 2025
| Author(s) | Title | Venue |
| :------------------ | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------------------------------------------------- |
| Jiang et al. | [Backdoor Token Unlearning: Exposing and Defending Backdoors in Pretrained Language Models](https://ojs.aaai.org/index.php/AAAI/article/view/34605) | AAAI |
| Han et al. | [DuMo: Dual Encoder Modulation Network for Precise Concept Erasure](https://ojs.aaai.org/index.php/AAAI/article/view/32343) | AAAI |
| Wu et al. | [Unlearning Concepts in Diffusion Model via Concept Domain Correction and Concept Preserving Gradient](https://ojs.aaai.org/index.php/AAAI/article/view/32917) | AAAI |
| Wang et al. | [Selective Forgetting: Advancing Machine Unlearning Techniques and Evaluation in Language Models](https://ojs.aaai.org/index.php/AAAI/article/view/32068) | AAAI |
| Yuan et al. | [Towards Robust Knowledge Unlearning: An Adversarial Framework for Assessing and Improving Unlearning Robustness in Large Language Models](https://arxiv.org/abs/2408.10682) | AAAI |
| Jin et al. | [Unlearning as multi-task optimization: A normalized gradient difference approach with an adaptive learning rate](https://aclanthology.org/2025.naacl-long.563/) | NAACL |
| Yang et al. | [CLIPErase: Efficient Unlearning of Visual-Textual Associations in CLIP](https://aclanthology.org/2025.acl-long.1469/) | ACL |
| Choi et al. | [Opt-Out: Investigating Entity-Level Unlearning for Large Language Models via Optimal Transport](https://arxiv.org/abs/2406.12329) | ACL |
| Bhaila et al. | [Soft Prompting for Unlearning in Large Language Models](https://aclanthology.org/2025.naacl-long.204/) | NAACL |
| Sun et al. | [Aligned but Blind: Alignment Increases Implicit Bias by Reducing Awareness of Race](https://aclanthology.org/2025.acl-long.1078/) | ACL |
| Xu et al. | [ReLearn: Unlearning via Learning for Large Language Models](https://aclanthology.org/2025.acl-long.297/) | ACL |
| Huo et al. | [MMUnlearner: Reformulating Multimodal Machine Unlearning in the Era of Multimodal Large Language Models](https://arxiv.org/abs/2502.11051) | ACL |
| Liu et al. | [Modality-Aware Neuron Pruning for Unlearning in Multimodal Large Language Models](https://aclanthology.org/2025.acl-long.295/) | ACL |
| Tran et al. | [Tokens for Learning, Tokens for Unlearning: Mitigating Membership Inference Attacks in Large Language Models via Dual-Purpose Training](https://aclanthology.org/2025.findings-acl.1174/) | ACL |
| Zhuang et al. | [SEUF: Is Unlearning One Expert Enough for Mixture-of-Experts LLMs?](https://aclanthology.org/2025.acl-long.424/) | ACL |
| Liu et al. | [Rethinking Machine Unlearning in Image Generation Models](https://arxiv.org/abs/2506.02761) | ACM CCS |
| Chowdhury et al. | [Fundamental Limits of Perfect Concept Erasure](https://proceedings.mlr.press/v258/chowdhury25a.html) | AISTATS |
| Xue et al. | [CRCE: Coreference-Retention Concept Erasure in Text-to-Image Diffusion Models](https://arxiv.org/abs/2503.14232) | BMVC |
| Mekala et al. | [Alternate Preference Optimization for Unlearning Factual Knowledge in Large Language Models](https://aclanthology.org/2025.coling-main.252/) | COLING |
| Ma et al. | [Unveiling Entity-Level Unlearning for Large Language Models: A Comprehensive Analysis](https://aclanthology.org/2025.coling-main.358/) | COLING |
| Sanyal et al. | [Agents Are All You Need for LLM Unlearning](https://arxiv.org/abs/2502.00406) | COLM |
| Zhou et al. | [Decoupled Distillation to Erase: A General Unlearning Method for Any Class-centric Tasks](https://openaccess.thecvf.com/content/CVPR2025/html/Zhou_Decoupled_Distillation_to_Erase_A_General_Unlearning_Method_for_Any_CVPR_2025_paper.html) | CVPR |
| Li et al. | [Detect-and-Guide: Self-regulation of Diffusion Models for Safe Text-to-Image Generation via Guideline Token Optimization](https://openaccess.thecvf.com/content/CVPR2025/html/Li_Detect-and-Guide_Self-regulation_of_Diffusion_Models_for_Safe_Text-to-Image_Generation_via_CVPR_2025_paper.html) | CVPR |
| Wang et al. | [Precise, Fast, and Low-cost Concept Erasure in Value Space: Orthogonal Complement Matters](https://openaccess.thecvf.com/content/CVPR2025/html/Wang_Precise_Fast_and_Low-cost_Concept_Erasure_in_Value_Space__CVPR_2025_paper.html) | CVPR |
| Wang et al. | [ACE: Anti-Editing Concept Erasure in Text-to-Image Models](https://openaccess.thecvf.com/content/CVPR2025/html/Wang_ACE_Anti-Editing_Concept_Erasure_in_Text-to-Image_Models_CVPR_2025_paper.html) | CVPR |
| Wu et al. | [Erasing Undesirable Influence in Diffusion Models](https://openaccess.thecvf.com/content/CVPR2025/html/Wu_Erasing_Undesirable_Influence_in_Diffusion_Models_CVPR_2025_paper.html) | CVPR |
| Lee et al. | [ESC: Erasing Space Concept for Knowledge Deletion](https://openaccess.thecvf.com/content/CVPR2025/html/Lee_ESC_Erasing_Space_Concept_for_Knowledge_Deletion_CVPR_2025_paper.html) | CVPR |
| Thakral et al. | [Fine-Grained Erasure in Text-to-Image Diffusion-based Foundation Models](https://arxiv.org/abs/2503.19783) | CVPR |
| Srivatsan et al. | [STEREO: A Two-Stage Framework for Adversarially Robust Concept Erasing from Text-to-Image Diffusion Models](https://openaccess.thecvf.com/content/CVPR2025/html/Srivatsan_STEREO_A_Two-Stage_Framework_for_Adversarially_Robust_Concept_Erasing_from_CVPR_2025_paper.html) | CVPR |
| Lee et al. | [Localized Concept Erasure for Text-to-Image Diffusion Models Using Training-Free Gated Low-Rank Adaptation](https://openaccess.thecvf.com/content/CVPR2025/html/Lee_Localized_Concept_Erasure_for_Text-to-Image_Diffusion_Models_Using_Training-Free_Gated_CVPR_2025_paper.html) | CVPR |
| Shirkavand et al. | [Efficient Fine-Tuning and Concept Suppression for Pruned Diffusion Models](https://openaccess.thecvf.com/content/CVPR2025/html/Shirkavand_Efficient_Fine-Tuning_and_Concept_Suppression_for_Pruned_Diffusion_Models_CVPR_2025_paper.html) | CVPR |
| Pan et al. | [Multi-Objective Large Language Model Unlearning](https://ieeexplore.ieee.org/abstract/document/10889776) | ICASSP |
| Wang et al. | [Large Scale Knowledge Washing](https://openreview.net/forum?id=dXCpPgjTtd) | ICLR |
| Koulischer et al. | [Dynamic Negative Guidance of Diffusion Models](https://openreview.net/forum?id=6p74UyAdLa) | ICLR |
| Feng et al. | [Controllable Unlearning for Image-to-Image Generative Models via epsilon-Constrained Optimization](https://openreview.net/forum?id=9OJflnNu6C) | ICLR |
| Ding et al. | [Unified Parameter-Efficient Unlearning for LLMs](https://openreview.net/forum?id=zONMuIVCAT) | ICLR |
| Jin et al. | [Unlearning as Multi-Task Optimization: a normalized gradient difference approach with adaptive learning rate](https://openreview.net/forum?id=OknsPawlUf) | ICLR |
| Farrell et al. | [Applying Sparse Autoencoders to Unlearn Knowledge in Language Models](https://openreview.net/forum?id=ZtvRqm6oBu) | ICLR |
| Cywinski et al. | [SAeUron: Interpretable Concept Unlearning in Diffusion Models with Sparse Autoencoders](https://openreview.net/forum?id=6N0GxaKdX9) | ICLR |
| Yoon et al. | [SAFREE: Training-Free and Adaptive Guard for Safe Text-to-Image And Video Generation](https://openreview.net/forum?id=hgTFotBRKl) | ICLR |
| Choi et al. | [Unlearning-based Neural Interpretations](https://openreview.net/forum?id=PBjCTeDL6o) | ICLR |
| Di et al. | [Adversarial Machine Unlearning](https://openreview.net/pdf?id=swWF948IiC) | ICLR |
| Sakarvadia et al. | [Mitigating Memorization in Language Models](https://openreview.net/forum?id=MGKDBuyv4p) | ICLR |
| Li et al. | [When is Task Vector Provably Effective for Model Editing? A Generalization Analysis of Nonlinear Transformers](https://openreview.net/forum?id=vRvVVb0NAz) | ICLR |
| Scholten et al. | [A Probabilistic Perspective on Unlearning and Alignment for Large Language Models](https://openreview.net/forum?id=51WraMid8K) | ICLR |
| Zhang et al. | [Catastrophic Failure of LLM Unlearning via Quantization](https://openreview.net/forum?id=lHSeDYamnz) | ICLR |
| Cha et al. | [Towards Robust and Parameter-Efficient Knowledge Unlearning for LLMs](https://arxiv.org/abs/2408.06621) | ICLR |
| Shi et al. | [MUSE: Machine Unlearning Six-Way Evaluation for Language Models](https://openreview.net/forum?id=TArmA033BU) | ICLR |
| Bui et al. | [Fantastic Targets for Concept Erasure in Diffusion Models and Where To Find Them](https://openreview.net/forum?id=tZdqL5FH7w) | ICLR |
| Yuan et al. | [A Closer Look at Machine Unlearning for Large Language Models](https://openreview.net/forum?id=Q1MHvGmhyT) | ICLR |
| Du et al. | [Textual Unlearning Gives a False Sense of Unlearning](https://openreview.net/forum?id=jyxwWQjU4J) | ICML |
| Li et al. | [One Image is Worth a Thousand Words: A Usability Preservable Text-Image Collaborative Erasing Framework](https://arxiv.org/abs/2505.11131) | ICML |
| Karvonen et al. | [SAEBench: A Comprehensive Benchmark for Sparse Autoencoders in Language Model Interpretability](https://arxiv.org/abs/2503.09532) | ICML |
| Zhang et al. | [Minimalist Concept Erasure in Generative Models](https://arxiv.org/abs/2507.13386) | ICML |
| Fan et al. | [Towards LLM Unlearning Resilient to Relearning Attacks: A Sharpness-Aware Minimization Perspective and Beyond](https://arxiv.org/abs/2502.05374) | ICML |
| Pathak et al. | [Quantum-Inspired Audio Unlearning: Towards Privacy-Preserving Voice Biometrics](https://www.arxiv.org/abs/2507.22208) | IJCB |
| Dou et al. | [Avoiding Copyright Infringement via Large Language Model Unlearning](https://aclanthology.org/2025.findings-naacl.288/) | NAACL |
| Liu et al. | [Protecting Privacy in Multimodal Large Language Models with MLLMU-Bench](https://aclanthology.org/2025.naacl-long.207.pdf) | NAACL |
| Dong et al. | [UNDIAL: Self-Distillation with Adjusted Logits for Robust Unlearning in Large Language Models](https://aclanthology.org/2025.naacl-long.444/) | NAACL |
| Ye et al. | [Reinforcement Unlearning](https://www.ndss-symposium.org/wp-content/uploads/2025-80-paper.pdf) | NDSS |
| Bother et al. | [Modyn: A Platform for Model Training on Dynamic Datasets With Sample-Level Data Selection](https://dl.acm.org/doi/abs/10.1145/3709705) | PACMMOD |
| Thaker et al. | [Position: LLM Unlearning Benchmarks are Weak Measures of Progress](https://ieeexplore.ieee.org/abstract/document/10992346) | SaTML |
| Xia et al. | [Edge Unlearning is Not "on Edge"! An Adaptive Exact Unlearning System on Resource-Constrained Devices](https://ieeexplore.ieee.org/document/11023432) | SP |
| Wang et al. | [Towards Lifecycle Unlearning Commitment Management: Measuring Sample-level Unlearning Completeness](https://arxiv.org/abs/2506.06112) | USENIX Security |
| Wang et al. | [TAPE: Tailored Posterior Difference for Auditing of Machine Unlearning](https://openreview.net/forum?id=LedrHK34jZ) | WWW |
| | | |
| Justicia et al. | [Digital forgetting in large language models: a survey of unlearning methods](https://link.springer.com/article/10.1007/s10462-024-11078-6) | Artificial Intelligence Review |
| Qu et al. | [The Frontier of Data Erasure: A Survey on Machine Unlearning for Large Language Models](https://ieeexplore.ieee.org/abstract/document/10834145) | Computer |
| Liu et al. | [Threats, Attacks, and Defenses in Machine Unlearning: A Survey](https://ieeexplore.ieee.org/abstract/document/10892039) | IEEE Open Journal of the Computer Society |
| Sun et al. | [Generative Adversarial Networks Unlearning](https://ieeexplore.ieee.org/abstract/document/10979463) | IEEE Transactions on Dependable and Secure Computing |
| Zuo et al. | [Machine unlearning through fine-grained model parameters perturbation](https://ieeexplore.ieee.org/abstract/document/10839062) | IEEE Transactions on Knowledge and Data Engineering |
| Li et al. | [Class-wise federated unlearning: Harnessing active forgetting with teacher–student memory generation](https://www.sciencedirect.com/science/article/abs/pii/S0950705125004009) | Knowledge-Based Systems |
| Liu et al. | [Rethinking Machine Unlearning for Large Language Models](https://www.nature.com/articles/s42256-025-00985-0) | Nature Machine Intelligence |
| Cooper et al. | [Machine Unlearning Doesn't Do What You Think: Lessons for Generative AI Policy, Research, and Practice](https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5060253) | SSRN |
| Tiwary et al. | [Adapt then Unlearn: Exploiting Parameter Space Semantics for Unlearning in Generative Adversarial Networks](https://openreview.net/forum?id=jAHEBivObO) | TMLR |
| Miranda et al. | [Preserving Privacy in Large Language Models: A Survey on Current Threats and Solutions](https://openreview.net/forum?id=Ss9MTTN7OL) | TMLR |
| Huang et al. | [Offset Unlearning for Large Language Models](https://openreview.net/forum?id=A4RLpHPXCu) | TMLR |
| Sinha et al. | [UnSTAR: Unlearning with Self-Taught Anti-Sample Reasoning for LLMs](https://openreview.net/forum?id=mNXCViKZbI) | TMLR |
| Che et al. | [Model Tampering Attacks Enable More Rigorous Evaluations of LLM Capabilities](https://arxiv.org/abs/2502.05209) | TMLR |
| | | |
| Vidal et al. | [Machine Unlearning in Hyperbolic vs. Euclidean Multimodal Contrastive Learning: Adapting Alignment Calibration to MERU](https://openaccess.thecvf.com/content/CVPR2025W/TMM-OpenWorld/html/Vidal_Machine_Unlearning_in_Hyperbolic_vs._Euclidean_Multimodal_Contrastive_Learning_Adapting_CVPRW_2025_paper.html) | CVPR Workshop |
| Cai et al. | [AegisLLM: Scaling Agentic Systems for Self-Reflective Defense in LLM Security](https://arxiv.org/abs/2504.20965) | ICLR Workshop |
| Kim et al. | [Training-Free Safe Denoisers For Safe Use of Diffusion Models](https://openreview.net/forum?id=R9lU8ZeJjS) | ICLR Workshop |
| Bui et al. | [Hiding and Recovering Knowledge in Text-to-Image Diffusion Models via Learnable Prompts](https://openreview.net/forum?id=KeJ6dGkiqb) | ICLR Workshop |
| Sanga et al. | [Train Once, Forget Precisely: Anchored Optimization for Efficient Post-Hoc Unlearning](https://arxiv.org/abs/2506.14515) | ICML Workshop |
| Wu et al. | [Evaluating Deep Unlearning in Large Language Models](https://openreview.net/forum?id=376xPmmHoV) | ICML Workshop |
| Spohn et al. | [Align-then-Unlearn: Embedding Alignment for LLM Unlearning](https://arxiv.org/abs/2506.13181) | ICML Workshop |
| Dosajh et al. | [Unlearning Factual Knowledge from LLMs Using Adaptive RMU](https://arxiv.org/abs/2506.16548) | SemEval |
| Xu et al. | [Unlearning via Model Merging](https://arxiv.org/abs/2503.21088) | SemEval |
| Bronec et al. | [Low-Rank Negative Preference Optimization](https://arxiv.org/abs/2503.13690) | SemEval |
| Srivasthav P et al. | [Forgotten but Not Lost: The Balancing Act of Selective Unlearning in Large Language Models](https://arxiv.org/abs/2503.04795) | SemEval |
| Premptis et al. | [Parameter-Efficient Unlearning for Large Language Models using Data Chunking](https://arxiv.org/abs/2503.02443) | SemEval |
| | | |
| Kim et al. | [Are We Truly Forgetting? A Critical Re-examination of Machine Unlearning Evaluation Protocols](https://arxiv.org/pdf/2503.06991) | arxiv |
| Kwak et al. | [NegMerge: Consensual Weight Negation for Strong Machine Unlearning](https://arxiv.org/pdf/2410.05583) | arxiv |
| Wang et al. | [GRU: Mitigating the Trade-off between Unlearning and Retention for Large Language Models](https://arxiv.org/pdf/2503.09117) | arxiv |
| Geng et al. | [A Comprehensive Survey of Machine Unlearning Techniques for Large Language Models](https://arxiv.org/abs/2503.01854) | arxiv |
| Barez et al. | [Open Problems in Machine Unlearning for AI Safety](https://arxiv.org/abs/2501.04952) | arxiv |
| Fan et al. | [Simplicity Prevails: Rethinking Negative Preference Optimization for LLM Unlearning](https://openreview.net/forum?id=Pd3jVGTacT) | arxiv |
| Staufer et al. | [What Should LLMs Forget? Quantifying Personal Data in LLMs for Right-to-Be-Forgotten Requests](https://arxiv.org/abs/2507.11128) | arxiv |
| Yeats et al. | [Automating Evaluation of Diffusion Model Unlearning with (Vision-) Language Model World Knowledge](https://arxiv.org/abs/2507.07137) | arxiv |
| Xiong et al. | [The Landscape of Memorization in LLMs: Mechanisms, Measurement, and Mitigation](https://openreview.net/forum?id=Pd3jVGTacT) | arxiv |
| Scholten et al. | [Model Collapse Is Not a Bug but a Feature in Machine Unlearning for LLMs](https://arxiv.org/abs/2507.04219) | arxiv |
| Han et al. | [Unlearning the Noisy Correspondence Makes CLIP More Robust](https://arxiv.org/abs/2507.03434) | arxiv |
| Kawakami et al. | [PULSE: Practical Evaluation Scenarios for Large Multimodal Model Unlearning](https://arxiv.org/abs/2507.01271) | arxiv |
| Ma et al. | [SoK: Semantic Privacy in Large Language Models](https://arxiv.org/abs/2506.23603) | arxiv |
| Rezaei et al. | [Model State Arithmetic for Machine Unlearning](https://arxiv.org/abs/2506.20941) | arxiv |
| Sinha et al. | [Step-by-Step Reasoning Attack: Revealing 'Erased' Knowledge in Large Language Models](https://arxiv.org/abs/2506.17279) | arxiv |
| Zhang et al. | [Does Multimodal Large Language Model Truly Unlearn? Stealthy MLLM Unlearning Attack](https://arxiv.org/abs/2506.17265) | arxiv |
| Jiang et al. | [Large Language Model Unlearning for Source Code](https://arxiv.org/abs/2506.17125) | arxiv |
| Hu et al. | [BLUR: A Benchmark for LLM Unlearning Robust to Forget-Retain Overlap](https://arxiv.org/abs/2506.15699) | arxiv |
| Wu et al. | [Learning-Time Encoding Shapes Unlearning in LLMs](https://arxiv.org/abs/2506.15076) | arxiv |
| Chen et al. | [Unlearning Isn't Invisible: Detecting Unlearning Traces in LLMs from Model Outputs](https://arxiv.org/abs/2506.14003) | arxiv |
| Wang et al. | [Reasoning Model Unlearning: Forgetting Traces, Not Just Answers, While Preserving Reasoning Skills](https://arxiv.org/abs/2506.12963) | arxiv |
| Songdej et al. | [Robust LLM Unlearning with MUDMAN: Meta-Unlearning with Disruption Masking And Normalization](https://arxiv.org/abs/2506.12484) | arxiv |
| Suriyakumar et al. | [UCD: Unlearning in LLMs via Contrastive Decoding](https://arxiv.org/abs/2506.12097) | arxiv |
| Ma et al. | [GUARD: Guided Unlearning and Retention via Data Attribution for Large Language Models](https://arxiv.org/abs/2506.10946) | arxiv |
| Ren et al. | [SoK: Machine Unlearning for Large Language Models](https://arxiv.org/abs/2506.09227) | arxiv |
| Reisizadeh et al. | [BLUR: A Bi-Level Optimization Approach for LLM Unlearning](https://arxiv.org/abs/2506.08164) | arxiv |
| Ye et al. | [LLM Unlearning Should Be Form-Independent](https://arxiv.org/abs/2506.07795) | arxiv |
| Zhang et al. | [RULE: Reinforcement UnLEarning Achieves Forget-Retain Pareto Optimality](https://arxiv.org/abs/2506.07171) | arxiv |
| Lee et al. | [Distillation Robustifies Unlearning](https://arxiv.org/abs/2506.06278) | arxiv |
| Wei et al. | [Do LLMs Really Forget? Evaluating Unlearning with Knowledge Correlation and Confidence Awareness](https://arxiv.org/abs/2506.05735) | arxiv |
| Entesari et al. | [Constrained Entropic Unlearning: A Primal-Dual Framework for Large Language Models](https://arxiv.org/abs/2506.05314) | arxiv |
| Wen et al. | [Quantifying Cross-Modality Memorization in Vision-Language Models](https://arxiv.org/abs/2506.05198) | arxiv |
| Chen et al. | [Vulnerability-Aware Alignment: Mitigating Uneven Forgetting in Harmful Fine-Tuning](https://arxiv.org/abs/2506.03850) | arxiv |
| Zhou et al. | [Not All Tokens Are Meant to Be Forgotten](https://arxiv.org/abs/2506.03142) | arxiv |
| Kim et al. | [Rethinking Post-Unlearning Behavior of Large Vision-Language Models](https://arxiv.org/abs/2506.02541) | arxiv |
| Wang et al. | [Invariance Makes LLM Unlearning Resilient Even to Unanticipated Downstream Fine-Tuning](https://arxiv.org/abs/2506.01339) | arxiv |
| Wan et al. | [Not Every Token Needs Forgetting: Selective Unlearning to Limit Change in Utility in Large Language Model Unlearning](https://arxiv.org/abs/2506.00876) | arxiv |
| Feng et al. | [Existing Large Language Model Unlearning Evaluations Are Inconclusive](https://arxiv.org/abs/2506.00688) | arxiv |
| Wang et al. | [Model Unlearning via Sparse Autoencoder Subspace Guided Projections](https://arxiv.org/abs/2505.24428) | arxiv |
| Wu et al. | [Breaking the Gold Standard: Extracting Forgotten Data under Exact Unlearning in Large Language Models](https://arxiv.org/abs/2505.24379) | arxiv |
| Chen et al. | [Does Machine Unlearning Truly Remove Model Knowledge? A Framework for Auditing Unlearning in LLMs](https://arxiv.org/abs/2505.23270) | arxiv |
| Siddiqui et al. | [From Dormant to Deleted: Tamper-Resistant Unlearning Through Weight-Space Regularization](https://arxiv.org/abs/2505.22310) | arxiv |
| Li et al. | [Editing as Unlearning: Are Knowledge Editing Methods Strong Baselines for Large Language Model Unlearning?](https://arxiv.org/abs/2505.19855) | arxiv |
| Jiang et al. | [Graceful Forgetting in Generative Language Models](https://arxiv.org/abs/2505.19715) | arxiv |
| Shi et al. | [Safety Alignment via Constrained Knowledge Unlearning](https://arxiv.org/abs/2505.18588) | arxiv |
| Ye et al. | [T2VUnlearning: A Concept Erasing Method for Text-to-Video Diffusion Models](https://arxiv.org/abs/2505.17550) | arxiv |
| To et al. | [Harry Potter is Still Here! Probing Knowledge Leakage in Targeted Unlearned Large Language Models via Automated Adversarial Prompting](https://arxiv.org/abs/2505.17160) | arxiv |
| Xu et al. | [Unlearning Isn't Deletion: Investigating Reversibility of Machine Unlearning in LLMs](https://arxiv.org/abs/2505.16831) | arxiv |
| Lee et al. | [Does Localization Inform Unlearning? A Rigorous Examination of Local Parameter Attribution for Knowledge Unlearning in Language Models](https://arxiv.org/abs/2505.16252) | arxiv |
| Ma et al. | [Losing is for Cherishing: Data Valuation Based on Machine Unlearning and Shapley Value](https://arxiv.org/abs/2505.16147) | arxiv |
| Yu et al. | [UniErase: Unlearning Token as a Universal Erasure Primitive for Language Models](https://arxiv.org/abs/2505.15674) | arxiv |
| Yoon et al. | [R-TOFU: Unlearning in Large Reasoning Models](https://arxiv.org/abs/2505.15214) | arxiv |
| Jeung et al. | [DUSK: Do Not Unlearn Shared Knowledge](https://arxiv.org/abs/2505.15209) | arxiv |
| Jeung et al. | [SEPS: A Separability Measure for Robust Unlearning in LLMs](https://arxiv.org/abs/2505.14832) | arxiv |
| Deng et al. | [GUARD: Generation-time LLM Unlearning via Adaptive Restriction and Detection](https://arxiv.org/abs/2505.13312) | arxiv |
| Yang et al. | [Exploring Criteria of Loss Reweighting to Enhance LLM Unlearning](https://arxiv.org/abs/2505.11953) | arxiv |
| Qian et al. | [Layered Unlearning for Adversarial Relearning](https://arxiv.org/abs/2505.09500) | arxiv |
| Vasilev et al. | [Unilogit: Robust Machine Unlearning for LLMs Using Uniform-Target Self-Distillation](https://arxiv.org/abs/2505.06027) | arxiv |
| Lu et al. | [WaterDrum: Watermarking for Data-centric Unlearning Metric](https://arxiv.org/abs/2505.05064) | arxiv |
| Xu et al. | [OBLIVIATE: Robust and Practical Machine Unlearning for Large Language Models](https://arxiv.org/abs/2505.04416) | arxiv |
| Sun et al. | [Unlearning vs. Obfuscation: Are We Truly Removing Knowledge?](https://arxiv.org/abs/2505.02884) | arxiv |
| Patil et al. | [Unlearning Sensitive Information in Multimodal LLMs: Benchmark and Attack-Defense Evaluation](https://arxiv.org/abs/2505.01456) | arxiv |
| Zhong et al. | [DualOptim: Enhancing Efficacy and Stability in Machine Unlearning with Dual Optimizers](https://arxiv.org/abs/2504.15827) | arxiv |
| Chen et al. | [ParaPO: Aligning Language Models to Reduce Verbatim Reproduction of Pre-training Data](https://arxiv.org/abs/2504.14452) | arxiv |
| Mahmud et al. | [DP2Unlearning: An Efficient and Guaranteed Unlearning Framework for LLMs](https://arxiv.org/abs/2504.13774) | arxiv |
| Klochkov et al. | [A mean teacher algorithm for unlearning of language models](https://arxiv.org/abs/2504.13388) | arxiv |
| Kim et al. | [GRAIL: Gradient-Based Adaptive Unlearning for Privacy and Copyright in LLMs](https://arxiv.org/abs/2504.12681) | arxiv |
| Pal et al. | [LLM Unlearning Reveals a Stronger-Than-Expected Coreset Effect in Current Benchmarks](https://arxiv.org/abs/2504.10185) | arxiv |
| Muhamed et al. | [SAEs Can Improve Unlearning: Dynamic Sparse Autoencoder Guardrails for Precision Unlearning in LLMs](https://arxiv.org/abs/2504.08192) | arxiv |
| Feng et al. | [Bridging the Gap Between Preference Alignment and Machine Unlearning](https://arxiv.org/abs/2504.06659) | arxiv |
| Feng et al. | [A Neuro-inspired Interpretation of Unlearning in Large Language Models through Sample-level Unlearning Difficulty](https://arxiv.org/abs/2504.06658) | arxiv |
| Krishnan et al. | [Not All Data Are Unlearned Equally](https://arxiv.org/abs/2504.05058) | arxiv |
| Kuo et al. | [Exact Unlearning of Finetuning Data via Model Merging at Scale](https://arxiv.org/abs/2504.04626) | arxiv |
| Xu et al. | [SUV: Scalable Large Language Model Copyright Compliance with Regularized Selective Unlearning](https://arxiv.org/abs/2503.22948) | arxiv |
| Li et al. | [Effective Skill Unlearning through Intervention and Abstention](https://arxiv.org/abs/2503.21730) | arxiv |
| Xu et al. | [PEBench: A Fictitious Dataset to Benchmark Machine Unlearning for Multimodal Large Language Models](https://arxiv.org/abs/2503.12545) | arxiv |
| Poppi et al. | [Hyperbolic Safety-Aware Vision-Language Models](https://arxiv.org/abs/2503.12127) | arxiv |
| Chen et al. | [Safety Mirage: How Spurious Correlations Undermine VLM Safety Fine-tuning](https://arxiv.org/abs/2503.11832) | arxiv |
| Wang et al. | [UIPE: Enhancing LLM Unlearning by Removing Knowledge Related to Forgetting Targets](https://arxiv.org/abs/2503.04693) | arxiv |
| Zhao et al. | [Improving LLM Safety Alignment with Dual-Objective Optimization](https://arxiv.org/abs/2503.03710) | arxiv |
| Yang et al. | [CE-U: Cross Entropy Unlearning](https://arxiv.org/abs/2503.01224) | arxiv |
| Wang et al. | [Erasing Without Remembering: Implicit Knowledge Forgetting in Large Language Models](https://arxiv.org/abs/2502.19982) | arxiv |
| Wang et al. | [Rethinking LLM Unlearning Objectives: A Gradient Perspective and Go Beyond](https://arxiv.org/abs/2502.19301) | arxiv |
| Yang et al. | [FaithUn: Toward Faithful Forgetting in Language Models by Investigating the Interconnectedness of Knowledge](https://arxiv.org/abs/2502.19207) | arxiv |
| Jiang et al. | [Holistic Audit Dataset Generation for LLM Unlearning via Knowledge Graph Traversal and Redundancy Removal](https://arxiv.org/abs/2502.18810) | arxiv |
| Chen et al. | [Soft Token Attacks Cannot Reliably Audit Unlearning in Large Language Models](https://arxiv.org/abs/2502.15836) | arxiv |
| Jung et al. | [CoME: An Unlearning-based Approach to Conflict-free Model Editing](https://arxiv.org/abs/2502.15826) | arxiv |
| Ramakrishna et al. | [LUME: LLM Unlearning with Multitask Evaluations](https://arxiv.org/abs/2502.15097) | arxiv |
| Patil et al. | [UPCORE: Utility-Preserving Coreset Selection for Balanced Unlearning](https://arxiv.org/abs/2502.15082) | arxiv |
| Russinovich et al. | [Obliviate: Efficient Unmemorization for Protecting Intellectual Property in Large Language Models](https://arxiv.org/abs/2502.15010) | arxiv |
| Chen et al. | [SafeEraser: Enhancing Safety in Multimodal Large Language Models through Multimodal Machine Unlearning](https://arxiv.org/abs/2502.12520) | arxiv |
| Chang et al. | [Which Retain Set Matters for LLM Unlearning? A Case Study on Entity Unlearning](https://arxiv.org/abs/2502.11441) | arxiv |
| Shen et al. | [LUNAR: LLM Unlearning via Neural Activation Redirection](https://arxiv.org/abs/2502.07218) | arxiv |
| Geng et al. | [Mitigating Sensitive Information Leakage in LLMs4Code through Machine Unlearning](https://arxiv.org/abs/2502.05739) | arxiv |
| Hu et al. | [FALCON: Fine-grained Activation Manipulation by Contrastive Orthogonal Unalignment for Large Language Model](https://arxiv.org/abs/2502.01472) | arxiv |
| Cheng et al. | [Tool Unlearning for Tool-Augmented LLMs](https://arxiv.org/abs/2502.01083) | arxiv |
| Zhang et al. | [Resolving Editing-Unlearning Conflicts: A Knowledge Codebook Framework for Large Language Model Updating](https://arxiv.org/abs/2502.00158) | arxiv |
| Huu-Tien et al. | [Improving LLM Unlearning Robustness via Random Perturbations](https://arxiv.org/abs/2501.19202) | arxiv |
| He et al. | [Deep Contrastive Unlearning for Language Models](https://arxiv.org/abs/2503.14900) | arxiv |
| Khoriaty et al. | [Don't Forget It! Conditional Sparse Autoencoder Clamping Works for Unlearning](https://arxiv.org/abs/2503.11127) | arxiv |
| Ren et al. | [A General Framework to Enhance Fine-tuning-based LLM Unlearning](https://arxiv.org/abs/2502.17823) | arxiv |
| Lang et al. | [Beyond Single-Value Metrics: Evaluating and Enhancing LLM Unlearning with Cognitive Diagnosis](https://arxiv.org/abs/2502.13996) | arxiv |
| Amara et al. | [EraseBench: Understanding The Ripple Effects of Concept Erasure Techniques](https://arxiv.org/abs/2501.09833) | arxiv |
| Brannvall et al. | [Technical Report for the Forgotten-by-Design Project: Targeted Obfuscation for Machine Learning](https://arxiv.org/abs/2501.11525) | arxiv |
| Chen et al. | [Comprehensive Assessment and Analysis for NSFW Content Erasure in Text-to-Image Diffusion Models](https://arxiv.org/abs/2502.12527) | arxiv |
| Fuchi et al. | [Erasing with Precision: Evaluating Specific Concept Erasure from Text-to-Image Generative Models](https://arxiv.org/abs/2502.13989) | arxiv |
| Kim et al. | [A Comprehensive Survey on Concept Erasure in Text-to-Image Diffusion Models](https://arxiv.org/abs/2502.14896) | arxiv |
| Meng et al. | [Concept Corrector: Erase concepts on the fly for text-to-image diffusion models](https://arxiv.org/abs/2502.16368) | arxiv |
| Beerens et al. | [On the Vulnerability of Concept Erasure in Diffusion Models](https://arxiv.org/abs/2502.17537) | arxiv |
| Chen et al. | [TRCE: Towards Reliable Malicious Concept Erasure in Text-to-Image Diffusion Models](https://arxiv.org/abs/2503.07389) | arxiv |
| Li et al. | [SPEED: Scalable, Precise, and Efficient Concept Erasure for Diffusion Models](https://arxiv.org/abs/2503.07392) | arxiv |
| Tian et al. | [Sparse Autoencoder as a Zero-Shot Classifier for Concept Erasing in Text-to-Image Diffusion Models](https://arxiv.org/abs/2503.09446) | arxiv |
| Carter et al. | [ACE: Attentional Concept Erasure in Diffusion Models](https://arxiv.org/abs/2504.11850) | arxiv |
| Li et al. | [Set You Straight: Auto-Steering Denoising Trajectories to Sidestep Unwanted Concepts](https://arxiv.org/abs/2504.12782) | arxiv |
| Grebe et al. | [Erased but Not Forgotten: How Backdoors Compromise Concept Erasure](https://arxiv.org/abs/2504.21072) | arxiv |
| Gao et al. | [Towards Dataset Copyright Evasion Attack against Personalized Text-to-Image Diffusion Models](https://arxiv.org/abs/2505.02824) | arxiv |
| Biswas et al. | [CURE: Concept Unlearning via Orthogonal Representation Editing in Diffusion Models](https://arxiv.org/abs/2505.12677) | arxiv |
| Chen et al. | [Comprehensive Evaluation and Analysis for NSFW Concept Erasure in Text-to-Image Diffusion Models](https://arxiv.org/abs/2505.15450) | arxiv |
| Liu et al. | [Erased or Dormant? Rethinking Concept Erasure Through Reversibility](https://arxiv.org/abs/2505.16174) | arxiv |
| Lu et al. | [When Are Concepts Erased From Diffusion Models?](https://arxiv.org/abs/2505.17013) | arxiv |
| Xie et al. | [Erasing Concepts, Steering Generations: A Comprehensive Survey of Concept Suppression](https://arxiv.org/abs/2505.19398) | arxiv |
| Gur-Arieh et al. | [Precise In-Parameter Concept Erasure in Large Language Models](https://arxiv.org/abs/2505.22586) | arxiv |
| Carter et al. | [TRACE: Trajectory-Constrained Concept Erasure in Diffusion Models](https://arxiv.org/abs/2505.23312) | arxiv |
| Zhu et al. | [SAGE: Exploring the Boundaries of Unsafe Concept Domain with Semantic-Augment Erasing](https://arxiv.org/abs/2506.09363) | arxiv |
| Fan et al. | [EAR: Erasing Concepts from Unified Autoregressive Models](https://arxiv.org/abs/2506.20151) | arxiv |
| Lee et al. | [Concept Pinpoint Eraser for Text-to-image Diffusion Models via Residual Attention Gate](https://arxiv.org/abs/2506.22806) | arxiv |
| Fu et al. | [FADE: Adversarial Concept Erasure in Flow Models](https://arxiv.org/abs/2507.12283) | arxiv |
| Wu et al. | [MUNBa: Machine Unlearning via Nash Bargaining](https://arxiv.org/abs/2411.15537) | arxiv |
### 2024
| Author(s) | Title | Venue |
| :----------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------- |
| Tian et al. | [DeRDaVa: Deletion-Robust Data Valuation for Machine Learning](https://ojs.aaai.org/index.php/AAAI/article/view/29462) | AAAI |
| Ni et al. | [ORES: Open-vocabulary Responsible Visual Synthesis](https://dl.acm.org/doi/10.1609/aaai.v38i19.30144) | AAAI |
| Moon et al. | [Feature Unlearning for Pre-trained GANs and VAEs](https://ojs.aaai.org/index.php/AAAI/article/view/30138) | AAAI |
| Rashid et al. | [Forget to Flourish: Leveraging Machine-Unlearning on Pretrained Language Models for Privacy Leakage](https://ojs.aaai.org/index.php/AAAI/article/view/34218) | AAAI |
| Cha et al. | [Learning to Unlearn: Instance-wise Unlearning for Pre-trained Classifiers](https://ojs.aaai.org/index.php/AAAI/article/view/28996) | AAAI |
| Hong et al. | [All but One: Surgical Concept Erasing with Model Preservation in Text-to-Image Diffusion Models](https://ojs.aaai.org/index.php/AAAI/article/view/30107) | AAAI |
| Kim et al. | [Layer Attack Unlearning: Fast and Accurate Machine Unlearning via Layer Level Attack and Knowledge Distillation](https://ojs.aaai.org/index.php/AAAI/article/view/30118) | AAAI |
| Foster et al. | [Fast Machine Unlearning Without Retraining Through Selective Synaptic Dampening](https://ojs.aaai.org/index.php/AAAI/article/view/29092) | AAAI |
| Hu et al. | [Separate the Wheat from the Chaff: Model Deficiency Unlearning via Parameter-Efficient Module Operation](https://ojs.aaai.org/index.php/AAAI/article/view/29784) | AAAI |
| Li et al. | [Towards Effective and General Graph Unlearning via Mutual Evolution](https://ojs.aaai.org/index.php/AAAI/article/view/29273) | AAAI |
| Liu et al. | [Backdoor Attacks via Machine Unlearning](https://ojs.aaai.org/index.php/AAAI/article/view/29321) | AAAI |
| You et al. | [RRL: Recommendation Reverse Learning](https://ojs.aaai.org/index.php/AAAI/article/view/28782) | AAAI |
| Moon et al. | [Feature Unlearning for Generative Models via Implicit Feedback](https://ojs.aaai.org/index.php/AAAI/article/view/30138) | AAAI |
| Li et al. | [SafeGen: Mitigating Sexually Explicit Content Generation in Text-to-Image Models](https://dl.acm.org/doi/10.1145/3658644.3670295) | ACM CCS |
| Lin et al. | [GDR-GMA: Machine Unlearning via Direction-Rectified and Magnitude-Adjusted Gradients](https://dl.acm.org/doi/abs/10.1145/3664647.3680775) | ACM MM |
| Huang et al. | [Your Code Secret Belongs to Me: Neural Code Completion Tools Can Memorize Hard-Coded Credentials](https://dl.acm.org/doi/abs/10.1145/3660818) | ACM SE |
| Feng et al. | [Fine-grained Pluggable Gradient Ascent for Knowledge Unlearning in Language Models](https://aclanthology.org/2024.emnlp-main.566/) | ACL |
| Arad et al. | [ReFACT: Updating Text-to-Image Models by Editing the Text Encoder](https://aclanthology.org/2024.naacl-long.140/) | ACL |
| Wu et al. | [Universal Prompt Optimizer for Safe Text-to-Image Generation](https://aclanthology.org/2024.naacl-long.351/) | ACL |
| Liu et al. | [Towards Safer Large Language Models through Machine Unlearning](https://aclanthology.org/2024.findings-acl.107/) | ACL |
| Kim et al. | [Towards Robust and Generalized Parameter-Efficient Fine-Tuning for Noisy Label Learning](https://aclanthology.org/2024.acl-long.322/) | ACL |
| Lee et al. | [Protecting Privacy Through Approximating Optimal Parameters for Sequence Unlearning in Language Models](https://aclanthology.org/2024.findings-acl.936/) | ACL |
| Choi et al. | [Cross-Lingual Unlearning of Selective Knowledge in Multilingual Language Models](https://aclanthology.org/2024.findings-emnlp.630/) | ACL |
| Isonuma et al. | [Unlearning Traces the Influential Training Data of Language Models](https://aclanthology.org/2024.acl-long.343.pdf) | ACL |
| Zhou et al. | [Visual In-Context Learning for Large Vision-Language Models](https://aclanthology.org/2024.findings-acl.940/) | ACL |
| Xing et al. | [EFUF: Efficient Fine-Grained Unlearning Framework for Mitigating Hallucinations in Multimodal Large Language Models](https://aclanthology.org/2024.emnlp-main.67/) | ACL |
| Yao et al. | [Machine Unlearning of Pre-trained Large Language Models](https://aclanthology.org/2024.acl-long.457/) | ACL |
| Zhao et al. | [Deciphering the Impact of Pretraining Data on Large Language Models through Machine Unlearning](https://aclanthology.org/2024.findings-acl.559/) | ACL |
| Ni et al. | [Forgetting before Learning: Utilizing Parametric Arithmetic for Knowledge Updating in Large Language Models](https://aclanthology.org/2024.acl-long.310/) | ACL |
| Zhou et al. | [Making Harmful Behaviors Unlearnable for Large Language Models](https://openreview.net/forum?id=a8cMY6s88u) | ACL |
| Yamashita et al. | [One-Shot Machine Unlearning with Mnemonic Code](https://openreview.net/forum?id=JQ7Ri3ccx6) | ACML |
| Fraboni et al. | [SIFU: Sequential Informed Federated Unlearning for Efficient and Provable Client Unlearning in Federated Optimization](https://proceedings.mlr.press/v238/fraboni24a.html) | AISTATS |
| Alshehri and Zhang | [Forgetting User Preference in Recommendation Systems with Label-Flipping](https://ieeexplore.ieee.org/abstract/document/10386603/authors#authors) | BigData |
| Qiu et al. | [FedCIO: Efficient Exact Federated Unlearning with Clustering, Isolation, and One-shot Aggregation](https://ieeexplore.ieee.org/document/10386788) | BigData |
| Yang and Li | [When Contrastive Learning Meets Graph Unlearning: Graph Contrastive Unlearning for Link Prediction](https://ieeexplore.ieee.org/abstract/document/10386624) | BigData |
| Hu et al. | [ERASER: Machine Unlearning in MLaaS via an Inference Serving-Aware Approach](https://dl.acm.org/doi/abs/10.1145/3658644.3670398) | CCS |
| Zhang et al. | [Negative Preference Optimization: From Catastrophic Collapse to Effective Unlearning](https://openreview.net/forum?id=MXLBXjQkmb) | COLM |
| Maini et al. | [TOFU: A Task of Fictitious Unlearning for LLMs](https://openreview.net/forum?id=B41hNBoWLo) | COLM |
| Abbasi et al. | [BrainWash: A Poisoning Attack to Forget in Continual Learning](https://openaccess.thecvf.com/content/CVPR2024/html/Abbasi_BrainWash_A_Poisoning_Attack_to_Forget_in_Continual_Learning_CVPR_2024_paper.html) | CVPR |
| Chen et al. | [Towards Memorization-Free Diffusion Models](https://openaccess.thecvf.com/content/CVPR2024/html/Chen_Towards_Memorization-Free_Diffusion_Models_CVPR_2024_paper.html) | CVPR |
| Lyu et al. | [One-Dimensional Adapter to Rule Them All: Concepts, Diffusion Models and Erasing Applications](https://openaccess.thecvf.com/content/CVPR2024/html/Lyu_One-dimensional_Adapter_to_Rule_Them_All_Concepts_Diffusion_Models_and_CVPR_2024_paper.html) | CVPR |
| Wallace et al. | [Diffusion Model Alignment Using Direct Preference Optimization](https://openaccess.thecvf.com/content/CVPR2024/html/Wallace_Diffusion_Model_Alignment_Using_Direct_Preference_Optimization_CVPR_2024_paper.html) | CVPR |
| Lu et al. | [MACE: Mass Concept Erasure in Diffusion Models](https://openaccess.thecvf.com/content/CVPR2024/html/Lu_MACE_Mass_Concept_Erasure_in_Diffusion_Models_CVPR_2024_paper.html) | CVPR |
| Chen et al. | [WPN: An Unlearning Method Based on N-pair Contrastive Learning in Language Models](https://ebooks.iospress.nl/doi/10.3233/FAIA240662) | ECAI |
| Fan et al. | [Challenging Forgets: Unveiling the Worst-Case Forget Sets in Machine Unlearning](https://link.springer.com/chapter/10.1007/978-3-031-72664-4_16) | ECCV |
| Gong et al. | [Reliable and Efficient Concept Erasure of Text-to-Image Diffusion Models](https://dl.acm.org/doi/10.1007/978-3-031-73668-1_5) | ECCV |
|Kim et al. | [R.A.C.E. : Robust Adversarial Concept Erasure for Secure Text-to-Image Diffusion M