https://github.com/zjunlp/knowledgecircuits

Knowledge Circuits in Pretrained Transformers

artificial-intelligence circuit hallucination interpretability knowledge-circuit knowledge-editing knowledge-edting large-language-models model-editing natural-language-processing transformer

Last synced: 11 months ago

Host: GitHub
URL: https://github.com/zjunlp/knowledgecircuits
Owner: zjunlp
License: mit
Created: 2024-01-16T07:23:21.000Z (over 2 years ago)
Default Branch: main
Last Pushed: 2024-09-18T06:35:24.000Z (over 1 year ago)
Last Synced: 2024-09-18T09:04:36.399Z (over 1 year ago)
Topics: artificial-intelligence, circuit, hallucination, interpretability, knowledge-circuit, knowledge-editing, knowledge-edting, large-language-models, model-editing, natural-language-processing, transformer
Language: Python
Homepage: http://knowledgecircuits.zjukg.cn/
Size: 5.93 MB
Stars: 46
Watchers: 6
Forks: 1
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

README

          
# Knowledge Circuits

Knowledge Circuits in Pretrained Transformers

[📄arXiv](https://arxiv.org/abs/2405.17969) • [🌐Demo](http://knowledgecircuits.zjukg.cn/) • Youtube • 𝕏 Blog



[![Awesome](https://awesome.re/badge.svg)](https://github.com/zjunlp/KnowledgeCircuits) 

[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](https://opensource.org/licenses/MIT)

![](https://img.shields.io/github/last-commit/zjunlp/KnowledgeCircuits?color=green) 

## 🔔News

- [2025-02-16] We release our new paper [How Do LLMs Acquire New Knowledge? A Knowledge Circuits Perspective on Continual Pre-Training](https://arxiv.org/abs/2502.11196), analyzing the evolution of knowledge circuits throughout continual pre-training. Check it out:)

- [2024-09-26] Our paper [Knowledge Circuits in Pretrained Transformers](https://arxiv.org/abs/2405.17969) has been accepted by NeurIPS 2024!

- [2024-05-28] We release our paper [Knowledge Circuits in Pretrained Transformers](https://arxiv.org/abs/2405.17969).

## Table of Contents

- 🌟[Overview](#overview)

- 🔧[Installation](#installation)

- 📚[Get the circuit](#get-the-circuit)

- 🧐[Analyze component](#analyze-component)

- 🌻[Acknowledgement](#acknowledgement)

- 🚩[Citation](#citation)

---

## 🌟Overview

This work identifies the circuits in pretrained language models that are responsible for specific pieces of knowledge and analyzes the behavior of the components in those circuits.

We provide a [demo](http://knowledgecircuits.zjukg.cn/) for exploring the discovered circuits.

* A new method, [EAP-IG](https://arxiv.org/abs/2403.17806), is integrated in the `eap` folder. It takes less time than the ACDC method, and you can try it in `knowledge_eap.ipynb`. With the LLaMA2-7B-Chat model, running this notebook on a single GPU requires approximately 57,116M of GPU memory and takes 3-4 minutes.
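As a rough pre-flight check before running `knowledge_eap.ipynb` with LLaMA2-7B-Chat, the sketch below compares a GPU's total memory against the figure quoted above. The helper `fits_on_gpu` is hypothetical and not part of this repository, and reading "57,116M" as MiB is an assumption.

```python
REQUIRED_MIB = 57_116  # figure quoted above; treating "M" as MiB is an assumption


def fits_on_gpu(total_bytes: int, required_mib: int = REQUIRED_MIB) -> bool:
    """Return True if a device with `total_bytes` of memory meets the requirement."""
    return total_bytes >= required_mib * 1024 * 1024


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        torch = None

    if torch is not None and torch.cuda.is_available():
        # Total (not free) memory of the first visible CUDA device, in bytes.
        total = torch.cuda.get_device_properties(0).total_memory
        print(f"GPU 0: {total / 1024**2:.0f} MiB total, enough: {fits_on_gpu(total)}")
    else:
        print("No CUDA device visible; cannot check GPU memory.")
```

This only looks at total device memory, so other processes already using the GPU can still cause an out-of-memory error.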

## 🔧Installation

The filtered data for each model is available [here](https://pan.zju.edu.cn/share/7c613d16095c504605f83eba72). Download it and place it in the `data` folder.

Build the environment:

```
conda create -n knowledgecircuit python=3.10
conda activate knowledgecircuit
pip install -r requirements.txt
```

❗️The code may fail under torch 2.x.x. We recommend torch 1.x.x.
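To catch a mismatched install early, a small check along these lines could be run after setting up the environment. This is only a sketch: the helper names `torch_major_version` and `is_recommended_torch` are hypothetical and do not come from this repository.

```python
def torch_major_version(version: str) -> int:
    """Extract the major version from a string such as '1.13.1+cu117'."""
    return int(version.split(".")[0])


def is_recommended_torch(version: str) -> bool:
    """True for the torch 1.x.x series recommended above."""
    return torch_major_version(version) == 1


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("torch is not installed in this environment.")
    else:
        if is_recommended_torch(torch.__version__):
            print(f"torch {torch.__version__} is in the recommended 1.x.x series.")
        else:
            print(f"Warning: torch {torch.__version__} detected; torch 1.x.x is recommended.")
```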

## 📚Get the circuit

Run the following commands:

```
cd acdc
sh run.sh
```

Here is an example that computes the circuit for the `country_capital_city` knowledge in `GPT2-Medium`.

```
MODEL_PATH=/path/to/the/model
KT=factual
KNOWLEDGE=country_capital_city
NUM_EXAMPLES=20
MODEL_NAME=gpt2-medium

python main.py --task=knowledge \
--zero-ablation \
--threshold=0.01 \
--device=cuda:0 \
--metric=match_nll \
--indices-mode=reverse \
--first-cache-cpu=False \
--second-cache-cpu=False \
--max-num-epochs=10000 \
--specific-knowledge=$KNOWLEDGE \
--num-examples=$NUM_EXAMPLES \
--relation-reverse=False \
--knowledge-type=$KT \
--model-name=$MODEL_NAME \
--model-path=$MODEL_PATH
```

The results are written to `acdc/factual_results/gpt2-medium`, and `final_graph.pdf` contains the computed circuit.
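If you prefer to drive the example from Python, the sketch below assembles the same argument list and then searches the results folder for `final_graph.pdf`. The helpers `build_acdc_command` and `find_final_graphs` are hypothetical and not part of this repository; the flags are copied from the example above, and the recursive search is there because the exact location of the PDF inside the results folder is an assumption.

```python
import shlex
from pathlib import Path


def build_acdc_command(
    model_path: str,
    model_name: str = "gpt2-medium",
    knowledge: str = "country_capital_city",
    knowledge_type: str = "factual",
    num_examples: int = 20,
    device: str = "cuda:0",
) -> list[str]:
    """Assemble the argument list for main.py, mirroring the shell example."""
    return [
        "python", "main.py", "--task=knowledge",
        "--zero-ablation",
        "--threshold=0.01",
        f"--device={device}",
        "--metric=match_nll",
        "--indices-mode=reverse",
        "--first-cache-cpu=False",
        "--second-cache-cpu=False",
        "--max-num-epochs=10000",
        f"--specific-knowledge={knowledge}",
        f"--num-examples={num_examples}",
        "--relation-reverse=False",
        f"--knowledge-type={knowledge_type}",
        f"--model-name={model_name}",
        f"--model-path={model_path}",
    ]


def find_final_graphs(results_dir: str) -> list[Path]:
    """Return every final_graph.pdf found under results_dir (empty if the folder is missing)."""
    root = Path(results_dir)
    if not root.is_dir():
        return []
    return sorted(root.rglob("final_graph.pdf"))


if __name__ == "__main__":
    cmd = build_acdc_command("/path/to/the/model")
    # Print the command; to execute it, use subprocess.run(cmd, cwd="acdc", check=True).
    print(shlex.join(cmd))
    for pdf in find_final_graphs("acdc/factual_results/gpt2-medium"):
        print(pdf)
```

Building the command as a list rather than a single string avoids shell quoting problems when the model path contains spaces.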

## 🧐Analyze component

Run `component.ipynb` in the `notebook` folder.

## 🌻Acknowledgement

We thank the [transformer_lens](https://github.com/TransformerLensOrg/TransformerLens), [ACDC](https://github.com/ArthurConmy/Automatic-Circuit-Discovery), and [LRE](https://lre.baulab.info/) projects.

The code in this work builds on these three projects.

## 🚩Citation

Please cite our work if you use Knowledge Circuits in your research. Thanks!

```bibtex
@article{DBLP:journals/corr/abs-2405-17969,
  author       = {Yunzhi Yao and
                  Ningyu Zhang and
                  Zekun Xi and
                  Mengru Wang and
                  Ziwen Xu and
                  Shumin Deng and
                  Huajun Chen},
  title        = {Knowledge Circuits in Pretrained Transformers},
  journal      = {CoRR},
  volume       = {abs/2405.17969},
  year         = {2024},
  url          = {https://doi.org/10.48550/arXiv.2405.17969},
  doi          = {10.48550/ARXIV.2405.17969},
  eprinttype   = {arXiv},
  eprint       = {2405.17969},
  timestamp    = {Fri, 21 Jun 2024 22:39:09 +0200},
  biburl       = {https://dblp.org/rec/journals/corr/abs-2405-17969.bib},
  bibsource    = {dblp computer science bibliography, https://dblp.org}
}
```
