# Awesome Parametric Knowledge in LLMs

[![LICENSE](https://img.shields.io/github/license/Trae1ounG/Awesome-parametric-Knowledge-in-LLMs)](https://github.com/Trae1ounG/Awesome-parametric-Knowledge-in-LLMs/blob/main/LICENSE)
![Awesome](https://cdn.rawgit.com/sindresorhus/awesome/d7305f38d29fed78fa85652e3a63e154dd8e8829/media/badge.svg)
[![commit](https://img.shields.io/github/last-commit/Trae1ounG/Awesome-parametric-Knowledge-in-LLMs?color=blue)](https://github.com/Trae1ounG/Awesome-parametric-Knowledge-in-LLMs/commits/main)
[![PR](https://img.shields.io/badge/PRs-Welcome-red)](https://github.com/Trae1ounG/Awesome-parametric-Knowledge-in-LLMs/pulls)
[![GitHub Repo stars](https://img.shields.io/github/stars/Trae1ounG/Awesome-parametric-Knowledge-in-LLMs)](https://github.com/Trae1ounG/Awesome-parametric-Knowledge-in-LLMs)


This repo collects papers about parametric knowledge in LLMs, currently organized into two main categories: parametric knowledge detection and parametric knowledge application!👻

We believe that parametric knowledge in LLMs remains a largely unexplored area, and we hope this repository provides you with some valuable insights!😶‍🌫️

# Parametric Knowledge Detection
## Knowledge in Transformer-based Models🧠
### 2024
1. **[What does the Knowledge Neuron Thesis Have to do with Knowledge?](https://arxiv.org/abs/2405.02421)**

*Jingcheng Niu, Andrew Liu, Zining Zhu, Gerald Penn.* ICLR'24 (Spotlight)
2. **[Identifying query-relevant neurons in large language models for long-form texts](https://arxiv.org/abs/2406.10868)**

*Lihu Chen, Adam Dejl, Francesca Toni.* Preprint'24
### 2022
1. **[Knowledge Neurons in Pretrained Transformers](https://arxiv.org/abs/2104.08696)**[[code](https://github.com/Hunter-DDM/knowledge-neurons)]

*Damai Dai, Li Dong, Yaru Hao, Zhifang Sui, Baobao Chang, Furu Wei.* ACL'22
### 2021
1. **[Transformer Feed-Forward Layers Are Key-Value Memories](https://arxiv.org/abs/2012.14913)**

*Mor Geva, Roei Schuster, Jonathan Berant, Omer Levy.* EMNLP'21
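
Geva et al.'s key-value view is easy to state concretely: each FFN neuron's input weight row acts as a key that matches patterns in the hidden state, and its output weight row is the value it writes into the residual stream. A minimal PyTorch sketch of this view (our own toy illustration with made-up dimensions, not code from the paper):

```python
import torch

# Toy sketch (not the paper's code) of "FFN as key-value memory":
# FFN(x) = f(x @ K.T) @ V, where row i of K is a key pattern and
# row i of V is the value written out when that key fires.
d_model, d_ff = 16, 64
K = torch.randn(d_ff, d_model)  # keys: one pattern detector per FFN neuron
V = torch.randn(d_ff, d_model)  # values: what each neuron contributes

def ffn(x: torch.Tensor) -> torch.Tensor:
    m = torch.relu(x @ K.T)     # memory coefficients: key-match strengths
    return m @ V                # output = coefficient-weighted sum of values

x = torch.randn(d_model)
print(ffn(x).shape)             # torch.Size([16])
```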
## Different Types of Neurons in LLMs👀
### 2024
1. **[Language-specific neurons: The key to multilingual capabilities in large language models.](https://arxiv.org/abs/2402.16438)**

*Tianyi Tang, Wenyang Luo, Haoyang Huang, Dongdong Zhang, Xiaolei Wang, Xin Zhao, Furu Wei, Ji-Rong Wen.* ACL'24
2. **Does Large Language Model Contain Task-Specific Neurons?**

*Authors and link to be added.* EMNLP'24
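
To make the idea of type-specific neurons concrete: a simple way to find candidates is to compare how often each neuron fires on inputs of each language and keep neurons that are active for one language but quiet for the rest. The toy sketch below is a simplification of Tang et al.'s approach (their LAPE method ranks neurons by the entropy of per-language activation probabilities); all shapes and thresholds are invented for illustration:

```python
import torch

# Toy selection of language-specific neurons (simplified from Tang et al.;
# shapes and thresholds are illustrative, not from the paper).
n_neurons, n_langs, n_tokens = 1000, 3, 5000
# acts[l][t, i]: activation of neuron i on token t of language l
acts = [torch.randn(n_tokens, n_neurons) for _ in range(n_langs)]

# P(neuron fires | language), with "fires" = positive activation
p = torch.stack([(a > 0).float().mean(dim=0) for a in acts])  # (n_langs, n_neurons)

# Keep neurons that fire often in exactly one language and rarely elsewhere
specific = (p.max(dim=0).values > 0.9) & (p.min(dim=0).values < 0.1)
print(f"{int(specific.sum())} candidate language-specific neurons")
```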
# Parametric Knowledge Application
## Knowledge Editing 🧑‍⚕️
### 2024
1. **[A Comprehensive Study of Knowledge Editing for Large Language Models](https://arxiv.org/abs/2401.01286)**

*Ningyu Zhang, Yunzhi Yao, Bozhong Tian, Peng Wang, Shumin Deng, Mengru Wang, Zekun Xi, Shengyu Mao, Jintian Zhang, Yuansheng Ni, Siyuan Cheng, Ziwen Xu, Xin Xu, Jia-Chen Gu, Yong Jiang, Pengjun Xie, Fei Huang, Lei Liang, Zhiqiang Zhang, Xiaowei Zhu, Jun Zhou, Huajun Chen.* Preprint'24

2. **[FAME: Towards Factual Multi-Task Model Editing](https://arxiv.org/abs/2410.10859)**[[code](https://github.com/BITHLP/FAME)]

*Li Zeng, Yingyu Shan, Zeming Liu, Jiashu Yao, Yuhang Guo.* EMNLP'24

3. **[To Forget or Not? Towards Practical Knowledge Unlearning for Large Language Models](https://arxiv.org/abs/2407.01920)**[[code](https://github.com/zjunlp/KnowUnDo)]

*Bozhong Tian, Xiaozhuan Liang, Siyuan Cheng, Qingbin Liu, Mengru Wang, Dianbo Sui, Xi Chen, Huajun Chen, Ningyu Zhang.* EMNLP'24 findings

4. **[Understanding the Collapse of LLMs in Model Editing](https://arxiv.org/abs/2406.11263)**

*Wanli Yang, Fei Sun, Jiajun Tan, Xinyu Ma, Du Su, Dawei Yin, Huawei Shen.* EMNLP'24 findings

5. **[Is it possible to edit large language models robustly?](https://arxiv.org/pdf/2402.05827)**[[code](https://github.com/xbmxb/edit_analysis)]

*Xinbei Ma, Tianjie Ju, Jiyang Qiu, Zhuosheng Zhang, Hai Zhao, Lifeng Liu, Yulong Wang.* Preprint'24

6. **[Retrieval-enhanced Knowledge Editing in Language Models for Multi-Hop Question Answering](https://arxiv.org/pdf/2403.19631)**[[code](https://github.com/sycny/RAE)]

*Yucheng Shi, Qiaoyu Tan, Xuansheng Wu, Shaochen Zhong, Kaixiong Zhou, Ninghao Liu.* CIKM'24
### 2023
1. **[Editing Large Language Models: Problems, Methods, and Opportunities](https://arxiv.org/abs/2305.13172)**

*Yunzhi Yao, Peng Wang, Bozhong Tian, Siyuan Cheng, Zhoubo Li, Shumin Deng, Huajun Chen, Ningyu Zhang.* EMNLP'23
### 2022
1. **[Locating and Editing Factual Associations in GPT](https://arxiv.org/abs/2202.05262)**

*Kevin Meng, David Bau, Alex Andonian, Yonatan Belinkov.* NeurIPS'22 *(a toy sketch of the rank-one edit appears at the end of this section)*
2. **[Memory-Based Model Editing at Scale](https://arxiv.org/abs/2206.06520)**

*Eric Mitchell, Charles Lin, Antoine Bosselut, Christopher D. Manning, Chelsea Finn.* ICML'22
### 2021
1. **[Editing Factual Knowledge in Language Models](https://arxiv.org/abs/2104.08164)**

*Nicola De Cao, Wilker Aziz, Ivan Titov.* EMNLP'21
### 2020
1. **[Editable neural networks.](https://arxiv.org/abs/2004.00345)**

*Anton Sinitsin, Vsevolod Plokhotnyuk, Dmitriy Pyrkin, Sergei Popov, Artem Babenko.* ICLR'20
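
Several of the locate-then-edit methods above, ROME in particular, boil down to a closed-form rank-one update to one MLP weight matrix so that a chosen key vector maps to a new value vector. The sketch below shows just that core step with arbitrary dimensions; the actual method also whitens the key using a covariance estimate of typical activations, which we omit:

```python
import torch

# Stripped-down ROME-style rank-one edit (Meng et al., 2022), omitting the
# covariance whitening the paper uses. Delta is chosen in closed form so that
# the edited matrix maps key k_star exactly to the new value v_star.
d = 32
W = torch.randn(d, d)       # an MLP down-projection to edit
k_star = torch.randn(d)     # key: representation of the edited subject
v_star = torch.randn(d)     # value: desired output encoding the new fact

delta = torch.outer(v_star - W @ k_star, k_star) / (k_star @ k_star)
W_edited = W + delta

print(torch.allclose(W_edited @ k_star, v_star, atol=1e-4))  # True
```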
## Knowledge Transfer🧚‍♀️
### 2024
1. **[Seeking Neural Nuggets: Knowledge Transfer in Large Language Models from a Parametric Perspective](https://arxiv.org/abs/2310.11451)**

*Ming Zhong, Chenxin An, Weizhu Chen, Jiawei Han, Pengcheng He.* ICLR'24

2. **[Initializing models with larger ones](https://arxiv.org/abs/2311.18823)**[[code](https://github.com/OscarXZQ/weight-selection)]

*Zhiqiu Xu, Yanjie Chen, Kirill Vishniakov, Yida Yin, Zhiqiang Shen, Trevor Darrell, Lingjie Liu, Zhuang Liu.* ICLR'24 **Spotlight** *(a toy sketch of weight selection appears at the end of this section)*

3. **[Cross-model Control: Improving Multiple Large Language Models in One-time Training](https://www.arxiv.org/abs/2410.17599)**[[code](https://github.com/wujwyi/CMC)]

*Jiayi Wu, Hao Sun, Hengyi Cai, Lixin Su, Shuaiqiang Wang, Dawei Yin, Xiang Li, Ming Gao.* NeurIPS'24
### 2023
1. **[Mutual enhancement of large and small language models with cross-silo knowledge transfer](https://arxiv.org/abs/2312.05842)**

*Yongheng Deng, Ziqing Qiao, Ju Ren, Yang Liu, Yaoxue Zhang.* Preprint'23

2. **[Learning to grow pretrained models for efficient transformer training](https://arxiv.org/abs/2303.00980)**[[code](https://github.com/VITA-Group/LiGO)]

*Peihao Wang, Rameswar Panda, Lucas Torroba Hennigen, Philip Greengard, Leonid Karlinsky, Rogerio Feris, David D. Cox, Zhangyang Wang, Yoon Kim.* ICLR'23

3. **[Retrieval-based knowledge transfer: An effective approach for extreme large language model compression](https://arxiv.org/abs/2310.15594)**

*Jiduan Liu, Jiahao Liu, Qifan Wang, Jingang Wang, Xunliang Cai, Dongyan Zhao, Ran Lucien Wang, Rui Yan.* EMNLP'23 Findings
### 2021
1. **[Weight distillation: Transferring the knowledge in neural network parameters](https://arxiv.org/abs/2009.09152)**[[code](https://github.com/Lollipop321/weight-distillation)]

*Ye Lin, Yanyang Li, Ziyang Wang, Bei Li, Quan Du, Tong Xiao, Jingbo Zhu.* ACL'21
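
One concrete flavor of parametric knowledge transfer is the weight-selection initialization of Xu et al. above: a smaller model is initialized by uniformly selecting rows and columns from the corresponding larger pretrained matrices. A toy sketch with invented dimensions:

```python
import torch

# Toy weight-selection init (after Xu et al., 2024; dimensions are invented):
# initialize a small layer by uniformly sampling rows/columns of the
# corresponding larger pretrained weight matrix.
W_large = torch.randn(1024, 1024)  # pretrained teacher weight

d_small = 256
idx = torch.linspace(0, W_large.shape[0] - 1, d_small).long()  # uniform selection
W_small = W_large[idx][:, idx]     # keep selected rows, then selected columns

print(W_small.shape)               # torch.Size([256, 256])
```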
## Knowledge Distillation
### 2024
1. **[PromptKD: Distilling Student-Friendly Knowledge for Generative Language Models via Prompt Tuning](https://arxiv.org/abs/2402.12842)**[[code](https://promptkd.github.io/)] (Note: not parametric)

*Gyeongman Kim, Doohyuk Jang, Eunho Yang.* EMNLP'24 findings

2. **[From Instance Training to Instruction Learning: Task Adapters Generation from Instructions](https://arxiv.org/abs/2406.12382)**[[code](https://github.com/Xnhyacinth/TAGI/tree/master)]

*Huanxuan Liao, Yao Xu, Shizhu He, Yuanzhe Zhang, Yanchao Hao, Shengping Liu, Kang Liu, Jun Zhao.* NeurIPS'24
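
As background for this section: the classic objective that knowledge-distillation work builds on is soft-label distillation (Hinton et al., 2015), a temperature-softened KL divergence between teacher and student token distributions. A generic sketch, not the specific loss of PromptKD or TAGI:

```python
import torch
import torch.nn.functional as F

# Generic soft-label distillation loss (Hinton et al., 2015); a background
# sketch, not the specific objective of either paper above.
def kd_loss(student_logits, teacher_logits, T: float = 2.0):
    log_p_student = F.log_softmax(student_logits / T, dim=-1)
    p_teacher = F.softmax(teacher_logits / T, dim=-1)
    # T^2 keeps gradient magnitudes comparable across temperatures
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * T * T

student_logits = torch.randn(4, 32000)  # (batch, vocab)
teacher_logits = torch.randn(4, 32000)
print(kd_loss(student_logits, teacher_logits).item())
```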