# Knowledge Circuits

Knowledge Circuits in Pretrained Transformers

[📄arXiv](https://arxiv.org/abs/2405.17969) • [🌐Demo](http://knowledgecircuits.zjukg.cn/) • Youtube • 𝕏 Blog

[![Awesome](https://awesome.re/badge.svg)](https://github.com/zjunlp/KnowledgeCircuits)
[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](https://opensource.org/licenses/MIT)
![](https://img.shields.io/github/last-commit/zjunlp/KnowledgeCircuits?color=green)

## 🔔News

- [2025-02-16] We release our new paper [How Do LLMs Acquire New Knowledge? A Knowledge Circuits Perspective on Continual Pre-Training](https://arxiv.org/abs/2502.11196), analyzing the evolution of knowledge circuits throughout continual pre-training. Check it out:)
- [2024-09-26] Our paper [Knowledge Circuits in Pretrained Transformers](https://arxiv.org/abs/2405.17969) has been accepted by NeurIPS 2024!
- [2024-05-28] We release our paper [Knowledge Circuits in Pretrained Transformers](https://arxiv.org/abs/2405.17969).

## Table of Contents
- 🌟[Overview](#overview)
- 🔧[Installation](#installation)
- 📚[Get the circuit](#get-the-circuit)
- 🧐[Analyze Component](#analyze-component)
- 🌻[Acknowledgement](#acknowledgement)
- 🚩[Citation](#citation)

---

## 🌟Overview

This work aims to construct the circuits in pretrained language models that are responsible for specific knowledge and to analyze the behavior of their components.
We provide a [demo](http://knowledgecircuits.zjukg.cn/) for viewing the discovered circuits.
* A newer method, [EAP-IG](https://arxiv.org/abs/2403.17806), is integrated in the `eap` folder. It takes less time than the ACDC method, and you can use it through `knowledge_eap.ipynb`. With the LLaMA2-7B-Chat model, running this notebook on a single GPU requires approximately 57,116M of GPU memory and 3-4 minutes.

## 🔧Installation

The filtered data for each kind of model is available [here](https://pan.zju.edu.cn/share/7c613d16095c504605f83eba72). Please download it and put it in the `data` folder.

Build the environment:
```
conda create -n knowledgecircuit python=3.10
conda activate knowledgecircuit
pip install -r requirements.txt
```
❗️The code may fail under torch 2.x.x. We recommend torch 1.x.x.
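
Because of the torch version caveat above, a small guard at the top of a script or notebook can catch a mismatched install early. The following is a minimal sketch and not part of this repository: the helper names `torch_major_version` and `check_torch_version` are hypothetical, and the check relies only on the version string reported by `torch.__version__`.

```python
import warnings
from typing import Optional


def torch_major_version(version: str) -> int:
    """Return the major version from a string such as '1.13.1+cu117'."""
    return int(version.split(".")[0])


def check_torch_version(version: Optional[str] = None) -> bool:
    """Return True for torch 1.x.x; otherwise emit a warning and return False.

    If `version` is None, the installed torch is queried (assumes torch is importable).
    """
    if version is None:
        import torch  # imported lazily so the helper works without torch when a string is given
        version = torch.__version__
    ok = torch_major_version(version) == 1
    if not ok:
        warnings.warn(f"torch {version} detected; torch 1.x.x is recommended.")
    return ok
```

Calling `check_torch_version()` with no argument inspects the installed torch; passing a string makes it easy to try the helper without torch installed.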

## 📚Get the circuit

Run the following command:
```
cd acdc
sh run.sh
```
Here is an example that computes the circuit for the `country_capital_city` knowledge in `GPT2-Medium`.
```
MODEL_PATH=/path/to/the/model
KT=factual
KNOWLEDGE=country_capital_city
NUM_EXAMPLES=20
MODEL_NAME=gpt2-medium

python main.py --task=knowledge \
--zero-ablation \
--threshold=0.01 \
--device=cuda:0 \
--metric=match_nll \
--indices-mode=reverse \
--first-cache-cpu=False \
--second-cache-cpu=False \
--max-num-epochs=10000 \
--specific-knowledge=$KNOWLEDGE \
--num-examples=$NUM_EXAMPLES \
--relation-reverse=False \
--knowledge-type=$KT \
--model-name=$MODEL_NAME \
--model-path=$MODEL_PATH
```

The results are written to `acdc/factual_results/gpt2-medium`, and `final_graph.pdf` shows the computed circuit.
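
To compute circuits for several kinds of knowledge, the same invocation can be assembled programmatically. The sketch below is a convenience wrapper under stated assumptions and is not part of this repository: the function `build_acdc_command` is hypothetical, and its flags and default values are copied from the example command in this section.

```python
import sys
from typing import List


def build_acdc_command(
    knowledge: str,
    model_name: str,
    model_path: str,
    knowledge_type: str = "factual",
    num_examples: int = 20,
    threshold: float = 0.01,
    device: str = "cuda:0",
) -> List[str]:
    """Assemble the `main.py` invocation from the README example as an argv list."""
    return [
        sys.executable, "main.py",  # run main.py with the current interpreter
        "--task=knowledge",
        "--zero-ablation",
        f"--threshold={threshold}",
        f"--device={device}",
        "--metric=match_nll",
        "--indices-mode=reverse",
        "--first-cache-cpu=False",
        "--second-cache-cpu=False",
        "--max-num-epochs=10000",
        f"--specific-knowledge={knowledge}",
        f"--num-examples={num_examples}",
        "--relation-reverse=False",
        f"--knowledge-type={knowledge_type}",
        f"--model-name={model_name}",
        f"--model-path={model_path}",
    ]
```

Each returned list can be passed to `subprocess.run(cmd, cwd="acdc", check=True)` to launch a run from the `acdc` directory, mirroring the `cd acdc` step.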

## 🧐Analyze component

Run the `component.ipynb` notebook.

## 🌻Acknowledgement

We thank the [transformer_lens](https://github.com/TransformerLensOrg/TransformerLens), [ACDC](https://github.com/ArthurConmy/Automatic-Circuit-Discovery), and [LRE](https://lre.baulab.info/) projects.
The code in this work is built on top of these three projects.

## 🚩Citation

Please cite our repository if you use Knowledge Circuits in your work. Thanks!

```bibtex
@article{DBLP:journals/corr/abs-2405-17969,
author = {Yunzhi Yao and
Ningyu Zhang and
Zekun Xi and
Mengru Wang and
Ziwen Xu and
Shumin Deng and
Huajun Chen},
title = {Knowledge Circuits in Pretrained Transformers},
journal = {CoRR},
volume = {abs/2405.17969},
year = {2024},
url = {https://doi.org/10.48550/arXiv.2405.17969},
doi = {10.48550/ARXIV.2405.17969},
eprinttype = {arXiv},
eprint = {2405.17969},
timestamp = {Fri, 21 Jun 2024 22:39:09 +0200},
biburl = {https://dblp.org/rec/journals/corr/abs-2405-17969.bib},
bibsource = {dblp computer science bibliography, https://dblp.org}
}
```