https://github.com/zjunlp/knowledgecircuits
Knowledge Circuits in Pretrained Transformers
artificial-intelligence circuit hallucination interpretability knowledge-circuit knowledge-editing knowledge-edting large-language-models model-editing natural-language-processing transformer
- Host: GitHub
- URL: https://github.com/zjunlp/knowledgecircuits
- Owner: zjunlp
- License: mit
- Created: 2024-01-16T07:23:21.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2024-09-18T06:35:24.000Z (over 1 year ago)
- Last Synced: 2024-09-18T09:04:36.399Z (over 1 year ago)
- Topics: artificial-intelligence, circuit, hallucination, interpretability, knowledge-circuit, knowledge-editing, knowledge-edting, large-language-models, model-editing, natural-language-processing, transformer
- Language: Python
- Homepage: http://knowledgecircuits.zjukg.cn/
- Size: 5.93 MB
- Stars: 46
- Watchers: 6
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
Knowledge Circuits
Knowledge Circuits in Pretrained Transformers
📄arXiv • 🌐Demo • Youtube • 𝕏 Blog

## 🔔News
- [2025-02-16] We release our new paper [How Do LLMs Acquire New Knowledge? A Knowledge Circuits Perspective on Continual Pre-Training](https://arxiv.org/abs/2502.11196), analyzing the evolution of knowledge circuits throughout continual pre-training. Check it out:)
- [2024-09-26] Our paper [Knowledge Circuits in Pretrained Transformers](https://arxiv.org/abs/2405.17969) has been accepted by NeurIPS 2024!
- [2024-05-28] We release our paper [Knowledge Circuits in Pretrained Transformers](https://arxiv.org/abs/2405.17969).
## Table of Contents
- 🌟[Overview](#overview)
- 🔧[Installation](#installation)
- 📚[Get the circuit](#get-the-circuit)
- 🧐[Analyze Component](#analyze-component)
- 🌻[Acknowledgement](#acknowledgement)
- 🚩[Citation](#citation)
---
## 🌟Overview
This work aims to identify the circuits in pretrained language models that are responsible for specific knowledge and to analyze the behavior of the components in those circuits.
We provide a [demo](http://knowledgecircuits.zjukg.cn/) for viewing the discovered circuits.
* A new method, [EAP-IG](https://arxiv.org/abs/2403.17806), is integrated in the `eap` folder. It takes less time than the ACDC method, and you can use it in `knowledge_eap.ipynb`. With the LLaMA2-7B-Chat model, running this notebook on a single GPU requires approximately 57,116M of GPU memory and takes 3-4 minutes.
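Before launching the EAP-IG notebook with LLaMA2-7B-Chat, it can help to check whether the GPU is large enough. The sketch below is only an illustration: it assumes the "57,116M" figure above is in MiB (as `nvidia-smi` reports it), and the helper names `has_enough_memory` and `total_gpu_memory_bytes` are hypothetical, not part of this repository. It compares against total device memory, so it is a necessary check rather than a guarantee.

```python
MIB = 1024 * 1024
# Figure quoted above for LLaMA2-7B-Chat; the unit is ASSUMED to be MiB.
REQUIRED_MIB = 57_116


def has_enough_memory(total_bytes: int, required_mib: int = REQUIRED_MIB) -> bool:
    """Return True if a device with `total_bytes` of memory meets the requirement."""
    return total_bytes >= required_mib * MIB


def total_gpu_memory_bytes(device_index: int = 0):
    """Return the total memory of a CUDA device in bytes, or None if unavailable."""
    try:
        import torch  # imported lazily so the helper also works without torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_properties(device_index).total_memory


if __name__ == "__main__":
    total = total_gpu_memory_bytes()
    if total is None:
        print("No CUDA device (or torch) found; cannot check GPU memory.")
    elif has_enough_memory(total):
        print("GPU 0 appears large enough for the quoted requirement.")
    else:
        print("GPU 0 has less total memory than the quoted requirement.")
```

Under this assumption, a 24 GiB or 48 GiB card falls short of the quoted requirement, while an 80 GiB card meets it.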
## 🔧Installation
The filtered data for each model is available [here](https://pan.zju.edu.cn/share/7c613d16095c504605f83eba72). Please download it and put it in the `data` folder.
Build the environment:
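```
conda create -n knowledgecircuit python=3.10
# activate the new environment so the packages are installed into it
conda activate knowledgecircuit
pip install -r requirements.txt
```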
❗️The code may fail under torch 2.x.x; we recommend torch 1.x.x.
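A quick way to confirm which torch version ended up in the environment is sketched below. It uses only the standard library; the function names are hypothetical helpers for illustration, and the check simply mirrors the recommendation above (major version 1).

```python
from importlib import metadata


def torch_major(version: str) -> int:
    """Return the major component of a version string such as '1.13.1+cu117'."""
    return int(version.split(".")[0])


def is_recommended_torch(version: str) -> bool:
    """True for torch 1.x.x, the series recommended in this README."""
    return torch_major(version) == 1


def installed_torch_version():
    """Return the installed torch version string, or None if torch is missing."""
    try:
        return metadata.version("torch")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_torch_version()
    if version is None:
        print("torch is not installed in this environment.")
    elif is_recommended_torch(version):
        print(f"torch {version} matches the recommended 1.x.x series.")
    else:
        print(f"torch {version} found; the code may fail outside the 1.x.x series.")
```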
## 📚Get the circuit
Just run the following command:
```
cd acdc
sh run.sh
```
Here is an example that computes the circuit for `country_capital_city` in `GPT2-Medium`:
```
MODEL_PATH=/path/to/the/model
KT=factual
KNOWLEDGE=country_capital_city
NUM_EXAMPLES=20
MODEL_NAME=gpt2-medium
python main.py --task=knowledge \
--zero-ablation \
--threshold=0.01 \
--device=cuda:0 \
--metric=match_nll \
--indices-mode=reverse \
--first-cache-cpu=False \
--second-cache-cpu=False \
--max-num-epochs=10000 \
--specific-knowledge=$KNOWLEDGE \
--num-examples=$NUM_EXAMPLES \
--relation-reverse=False \
--knowledge-type=$KT \
--model-name=$MODEL_NAME \
--model-path=$MODEL_PATH
```
The results are written to `acdc/factual_results/gpt2-medium`, and `final_graph.pdf` is the computed circuit.
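If you want to script several runs, the invocation above can be assembled in Python. This is a minimal sketch, not part of the repository: `build_acdc_command` and `expected_graph_path` are hypothetical helpers, the flags are copied from the example above, and the `<knowledge-type>_results/<model-name>` layout is an assumption generalized from the single `factual` / `gpt2-medium` path mentioned here.

```python
import shlex
from pathlib import Path


def build_acdc_command(model_path, model_name="gpt2-medium",
                       knowledge="country_capital_city",
                       knowledge_type="factual", num_examples=20,
                       threshold=0.01, device="cuda:0"):
    """Assemble the argument list for main.py using the flags shown above."""
    return [
        "python", "main.py", "--task=knowledge",
        "--zero-ablation",
        f"--threshold={threshold}",
        f"--device={device}",
        "--metric=match_nll",
        "--indices-mode=reverse",
        "--first-cache-cpu=False",
        "--second-cache-cpu=False",
        "--max-num-epochs=10000",
        f"--specific-knowledge={knowledge}",
        f"--num-examples={num_examples}",
        "--relation-reverse=False",
        f"--knowledge-type={knowledge_type}",
        f"--model-name={model_name}",
        f"--model-path={model_path}",
    ]


def expected_graph_path(knowledge_type, model_name, repo_root="."):
    """Guess where final_graph.pdf lands (ASSUMED layout, see the note above)."""
    return (Path(repo_root) / "acdc" / f"{knowledge_type}_results"
            / model_name / "final_graph.pdf")


if __name__ == "__main__":
    cmd = build_acdc_command("/path/to/the/model")
    # Print the command instead of running it; run it from the acdc folder.
    print(shlex.join(cmd))
    graph = expected_graph_path("factual", "gpt2-medium")
    print(f"{graph} exists: {graph.exists()}")
```

The script only prints the command and checks for the output file; pass the list to `subprocess.run(cmd, cwd="acdc")` if you want to launch the run from Python.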
## 🧐Analyze component
Run the `component.ipynb` notebook.
## 🌻Acknowledgement
We thank the [transformer_lens](https://github.com/TransformerLensOrg/TransformerLens), [ACDC](https://github.com/ArthurConmy/Automatic-Circuit-Discovery) and [LRE](https://lre.baulab.info/) projects.
The code in this work is built on top of these three projects.
## 🚩Citation
Please cite our repository if you use Knowledge Circuits in your work. Thanks!
```bibtex
@article{DBLP:journals/corr/abs-2405-17969,
author = {Yunzhi Yao and
Ningyu Zhang and
Zekun Xi and
Mengru Wang and
Ziwen Xu and
Shumin Deng and
Huajun Chen},
title = {Knowledge Circuits in Pretrained Transformers},
journal = {CoRR},
volume = {abs/2405.17969},
year = {2024},
url = {https://doi.org/10.48550/arXiv.2405.17969},
doi = {10.48550/ARXIV.2405.17969},
eprinttype = {arXiv},
eprint = {2405.17969},
timestamp = {Fri, 21 Jun 2024 22:39:09 +0200},
biburl = {https://dblp.org/rec/journals/corr/abs-2405-17969.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
```