# Rank-One Model Editing (ROME)

This repository provides an implementation of Rank-One Model Editing (ROME) on auto-regressive transformers (GPU-only).
We currently support OpenAI's GPT-2 XL (1.5B) and EleutherAI's GPT-J (6B). The release of a 20B GPT-like model from EleutherAI is expected soon; we hope to support it ASAP.

Feel free to open an issue if you find any problems; we are actively developing this repository and will monitor tickets closely.

[![Colab ROME Demo](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/kmeng01/rome/blob/main/notebooks/rome.ipynb)

<p align="center">
    <img src="https://rome.baulab.info/images/eiftower-crop.svg" alt="ROME editing example" width="425px" />
</p>

## Table of Contents
1. [Installation](#installation)
2. [Causal Tracing](#causal-tracing)
3. [Rank-One Model Editing (ROME)](#rank-one-model-editing-rome-1)
4. [CounterFact](#counterfact)
5. [Evaluation](#evaluation)
    * [Running the Full Evaluation Suite](#running-the-full-evaluation-suite)
    * [Integrating New Editing Methods](#integrating-new-editing-methods)
6. [How to Cite](#how-to-cite)

## Installation

We recommend `conda` for managing Python, CUDA, and PyTorch-related dependencies, and `pip` for everything else. To get started, simply install `conda` and run:
```bash
./scripts/setup_conda.sh
```

## Causal Tracing

[`notebooks/causal_trace.ipynb`](notebooks/causal_trace.ipynb) demonstrates Causal Tracing, which can be modified to apply tracing to the processing of any statement.

<p align="center">
    <img src="https://thevisible.net/u/davidbau/romeweb/small-fast-ct-animation.gif" alt="causal tracing GIF" width="550px" />
</p>

## Rank-One Model Editing (ROME)

<!-- We provide a simple interactive notebook demonstrating ROME. -->

<!-- ### Second-Moment Key Statistics

**warning this is probably wrong; fixing later.**

First, key statistics must be collected. The `rome` package contains a `layer_stats` module for computing and caching key statistics. See [rome/layer_stats.py](rome/layer_stats.py) for additional flags, but the basic logic can be executed with the following commands:

GPT-2 XL:
```bash
python -m rome.layer_stats --layer_num=17 --model_name=gpt2-xl
```

GPT-J:
```bash
python -m rome.layer_stats --layer_num=10 --model_name=EleutherAI/gpt-j-6B
```

### ROME Model Rewriting -->

[`notebooks/rome.ipynb`](notebooks/rome.ipynb) demonstrates ROME.
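Each edit is driven by a *rewrite request* (described next); the `{}` placeholder in its `prompt` field marks where the `subject` string is substituted before the edit is computed. A quick, model-free illustration of that substitution (the request keys follow the example in this README; `full_prompt` is purely illustrative):

```python
# A ROME rewrite request (same shape as the README example below).
# "prompt" contains a "{}" placeholder that is filled with "subject";
# "target_new" holds the fact the edited model should express.
request = {
    "prompt": "{} plays the sport of",
    "subject": "LeBron James",
    "target_new": {"str": "football"},
}

# The text that the new fact is attached to:
full_prompt = request["prompt"].format(request["subject"])
print(full_prompt)  # LeBron James plays the sport of
```

Keeping the subject separate from the prompt template is what lets ROME locate and edit the association for that specific entity rather than for the surface string as a whole.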
The API is simple: specify a *requested rewrite* of the following form:

```python
request = {
    "prompt": "{} plays the sport of",
    "subject": "LeBron James",
    "target_new": {
        "str": "football"
    }
}
```

Several similar examples are included in the notebook.

## CounterFact

Details coming soon!

## Evaluation

See [`baselines/`](baselines/) for a description of the available baselines.

### Running the Full Evaluation Suite

[`experiments/evaluate.py`](experiments/evaluate.py) can be used to evaluate any method in [`baselines/`](baselines/).
To get started (e.g. using ROME on GPT-2 XL), run:
```bash
python3 -m experiments.evaluate \
    --alg_name=ROME \
    --model_name=gpt2-xl \
    --hparams_fname=gpt2-xl.json
```

Results from each run are stored at `results/<method_name>/run_<run_id>` in a specific format:
```bash
results/
|__ ROME/
    |__ run_<run_id>/
        |__ params.json
        |__ case_0.json
        |__ case_1.json
        |__ ...
        |__ case_10000.json
```

To summarize the results, you can use [`experiments/summarize.py`](experiments/summarize.py):
```bash
python3 -m experiments.summarize --dir_name=ROME --runs=run_<run_id>
```

Running `python3 -m experiments.evaluate -h` or `python3 -m experiments.summarize -h` provides details about command-line flags.

### Integrating New Editing Methods

<!-- Say you have a new method `X` and want to benchmark it on CounterFact. Here's a checklist for evaluating `X`:
- The public method that evaluates a model on each CounterFact record is [`compute_rewrite_quality`](experiments/py/eval_utils.py); see [the source code](experiments/py/eval_utils.py) for details.
- In your evaluation script, you should call `compute_rewrite_quality` once with an unedited model and once with a model that has been edited with `X`. Each time, the function returns a dictionary. -->

Say you have a new method `X` and want to benchmark it on CounterFact. To integrate `X` with our runner:
- Subclass [`HyperParams`](util/hparams.py) into `XHyperParams` and specify all hyperparameter fields. See [`ROMEHyperParameters`](rome/rome_hparams.py) for an example implementation.
- Create a hyperparameters file at `hparams/X/gpt2-xl.json` and specify some default values. See [`hparams/ROME/gpt2-xl.json`](hparams/ROME/gpt2-xl.json) for an example.
- Define a function `apply_X_to_model` which accepts several parameters and returns (i) the rewritten model and (ii) the original weight values for parameters that were edited (in the dictionary format `{weight_name: original_weight_value}`). See [`rome/rome_main.py`](rome/rome_main.py) for an example.
- Add `X` to `ALG_DICT` in [`experiments/evaluate.py`](experiments/evaluate.py) by inserting the line `"X": (XHyperParams, apply_X_to_model)`.

Finally, run the main scripts:
```bash
python3 -m experiments.evaluate \
    --alg_name=X \
    --model_name=gpt2-xl \
    --hparams_fname=gpt2-xl.json

python3 -m experiments.summarize --dir_name=X --runs=run_<run_id>
```

### Note on Cross-Platform Compatibility

We currently support only methods that edit autoregressive HuggingFace models using the PyTorch backend. We are working on a set of general-purpose methods (usable on e.g. TensorFlow and without HuggingFace) that will be released soon.

<!-- 
Each method is customizable through a set of hyperparameters. For ROME, they are defined in `rome/hparams.py`. At runtime, you must specify a configuration of hyperparams through a `.json` file located in `hparams/<method_name>`. Check out [`hparams/ROME/default.json`](hparams/ROME/default.json) for an example.

At runtime, you must specify two command-line arguments: the method name, and the filename of the hyperparameters `.json` file.
```bash
python3 -m experiments.evaluate --alg_name=ROME --hparams_fname=default.json
```

Running the following command will yield `dict` run summaries:
```bash
python3 -m experiments/summarize --alg_name=ROME --run_name=run_001
``` -->

## How to Cite

```bibtex
@article{meng2022locating,
  title={Locating and Editing Factual Associations in {GPT}},
  author={Kevin Meng and David Bau and Alex Andonian and Yonatan Belinkov},
  journal={Advances in Neural Information Processing Systems},
  volume={35},
  year={2022}
}
```