{"id":15116112,"url":"https://github.com/voidism/Lookback-Lens","last_synced_at":"2025-09-27T21:31:56.509Z","repository":{"id":247696896,"uuid":"825512096","full_name":"voidism/Lookback-Lens","owner":"voidism","description":"Code for the EMNLP 2024 paper \"Detecting and Mitigating Contextual Hallucinations in Large Language Models Using Only Attention Maps\"","archived":false,"fork":false,"pushed_at":"2024-08-13T21:19:53.000Z","size":23437,"stargazers_count":118,"open_issues_count":6,"forks_count":6,"subscribers_count":3,"default_branch":"main","last_synced_at":"2025-01-18T03:34:28.931Z","etag":null,"topics":["factuality","hallucination-detection","hallucinations","large-language-models","text-generation"],"latest_commit_sha":null,"homepage":"https://arxiv.org/abs/2407.07071","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/voidism.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-07-08T01:24:17.000Z","updated_at":"2025-01-16T08:21:35.000Z","dependencies_parsed_at":"2025-01-18T03:41:53.864Z","dependency_job_id":null,"html_url":"https://github.com/voidism/Lookback-Lens","commit_stats":null,"previous_names":["voidism/lookback-lens"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/voidism/Lookback-Lens","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/voidism%2FLookback-Lens","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/voidism%2FLookback-Lens/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/voidism%2FLook
back-Lens/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/voidism%2FLookback-Lens/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/voidism","download_url":"https://codeload.github.com/voidism/Lookback-Lens/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/voidism%2FLookback-Lens/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":277295918,"owners_count":25794402,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-09-27T02:00:08.978Z","response_time":73,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["factuality","hallucination-detection","hallucinations","large-language-models","text-generation"],"created_at":"2024-09-26T01:44:09.977Z","updated_at":"2025-09-27T21:31:52.093Z","avatar_url":"https://github.com/voidism.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"# Lookback Lens 🔎 🦙\n\n[![License: MIT](https://img.shields.io/badge/License-MIT-g.svg)](https://opensource.org/licenses/MIT)\n[![Arxiv](https://img.shields.io/badge/arXiv-2407.07071-B21A1B)](https://arxiv.org/abs/2407.07071)\n[![Hugging Face Transformers](https://img.shields.io/badge/%F0%9F%A4%97-Transformers-blue)](https://github.com/huggingface/transformers)\n\n[![Open In 
Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/voidism/Lookback-Lens/blob/master/lookback_lens_demo.ipynb)\n\nCode for the paper **\"Detecting and Mitigating Contextual Hallucinations in Large Language Models Using Only Attention Maps\"**\n\nPaper: https://arxiv.org/abs/2407.07071  \nAuthors: [Yung-Sung Chuang](https://people.csail.mit.edu/yungsung/)$^\\dagger$, [Linlu Qiu](https://linlu-qiu.github.io/)$^\\dagger$, [Cheng-Yu Hsieh](https://chengyuhsieh.github.io/)$^\\ddagger$, [Ranjay Krishna](https://ranjaykrishna.com/index.html)$^\\ddagger$, [Yoon Kim](https://people.csail.mit.edu/yoonkim/)$^\\dagger$, [James Glass](https://people.csail.mit.edu/jrg/)$^\\dagger$  \n$^\\dagger$ Massachusetts Institute of Technology, $^\\ddagger$ University of Washington\n\n## Introduction\n\nWhen asked to summarize articles or answer questions given a passage, large language models (LLMs) hallucinate details and respond with unsubstantiated answers that are inaccurate with respect to the input context.\n\nThis paper describes a simple approach for detecting such **contextual hallucinations**. We hypothesize that contextual hallucinations are related to the extent to which an LLM attends to information in the provided context versus its own generations. Based on this intuition, we propose a simple hallucination detection model whose input features are given by the ratio of attention weights on the context versus newly generated tokens (for each attention head).  We find that a linear classifier based on these **lookback ratio** features is as effective as a richer detector that utilizes the entire hidden states of an LLM or a text-based entailment model. \n\nThe lookback ratio-based detector—**Lookback Lens**—is found to transfer across tasks and even models, allowing a detector that is trained on a 7B model to be applied (without retraining) to a larger 13B model.   
\n\nWe further apply this detector to mitigate hallucinations during generation, and find that a simple classifier-guided sampling approach reduces the number of hallucinations. For example, guided decoding with the detector reduces hallucinations by 9.6% on the XSum summarization task.\n\n![lookback-lens](lookback-lens.png)\n\n\n## Installation\n\nPython version: 3.9.5  \nCUDA toolkit version: 11.7  \n\n```bash\npip install -r requirements.txt\npip install -e ./transformers-4.32.0\n```\n\n```bash\ngzip -d data/nq-open-10_total_documents_gold_at_4.jsonl.gz\n```\n\n## Preparation 📚\n\n**\\*\\*Hint: Skip step 01 \u0026 02 by downloading the precomputed lookback ratios \u0026 annotations [here](https://www.dropbox.com/scl/fi/a87iv6xw9xma6ppc5pw2h/step1and2.tar.bz?rlkey=j382rsrwu2wnfwj7sn14ai3qw\u0026dl=0).\\*\\***\n\n### Step 01: Extracting Lookback Ratios from Attention Weights (NQ and CNN/DM) (Optional)\n\n\u003e To load LLaMA2 models/tokenizers, please login with `huggingface-cli login`, or add the argument `--auth_token \u003chf_auth_token\u003e` where `\u003chf_auth_token\u003e` is your huggingface auth token with LLaMA2 access.\n\n```bash\npython step01_extract_attns.py --model-name meta-llama/Llama-2-7b-chat-hf --data-path data/nq-open-10_total_documents_gold_at_4.jsonl --output-path lookback-ratio-nq-7b.pt\npython step01_extract_attns.py --model-name meta-llama/Llama-2-7b-chat-hf --data-path data/cnndm-1000.jsonl --output-path lookback-ratio-cnndm-7b.pt\n```\n\n### Step 02: Run GPT-4o Annotation (NQ and CNN/DM) (Optional)\n```bash\nOPENAI_API_KEY={your_key} python step02_eval_gpt4o.py --hyp lookback-ratio-nq-7b.pt --ref data/nq-open-10_total_documents_gold_at_4.jsonl --out anno-nq-7b.jsonl\nOPENAI_API_KEY={your_key} python step02_eval_gpt4o.py --hyp lookback-ratio-cnndm-7b.pt --ref data/cnndm-1000.jsonl --out anno-cnndm-7b.jsonl\n```\n\n## Logistic Regression Classifiers (Lookback Lens) 📈\n\n\n### Step 03: Fitting Lookback Lens Classifiers (NQ and CNN/DM)\n\n\u003e To load 
LLaMA2 models/tokenizers, please login with `huggingface-cli login`, or add the argument `--auth_token \u003chf_auth_token\u003e` where `\u003chf_auth_token\u003e` is your huggingface auth token with LLaMA2 access.\n\n```bash\n# Predefined Span\npython step03_lookback_lens.py --anno_1 anno-nq-7b.jsonl --anno_2 anno-cnndm-7b.jsonl --lookback_ratio_1 lookback-ratio-nq-7b.pt --lookback_ratio_2 lookback-ratio-cnndm-7b.pt\n# Sliding Window (=8)\npython step03_lookback_lens.py --anno_1 anno-nq-7b.jsonl --anno_2 anno-cnndm-7b.jsonl --lookback_ratio_1 lookback-ratio-nq-7b.pt --lookback_ratio_2 lookback-ratio-cnndm-7b.pt --sliding_window 8\n```\n\nThe output will be similar to:\n```\n# Predefined Span\n\n======== Results:\n                  , Train AUROC (on A), Test AUROC (on A), Transfer AUROC (on B)\nA=nq-7b;B=cnndm-7b, 0.9867235784623354, 0.9140908050233869, 0.8526936562673579\nA=cnndm-7b;B=nq-7b, 0.9844307377081996, 0.8720309189629751, 0.8203155443540785\n\n# Sliding Window (=8)\n\n======== Results:\n                  , Train AUROC (on A), Test AUROC (on A), Transfer AUROC (on B)\nA=nq-7b;B=cnndm-7b, 0.8858071459740011, 0.8663781955546325, 0.6624004639123215\nA=cnndm-7b;B=nq-7b, 0.8650978795284527, 0.8474340844981891, 0.6608756591251488\n```\n\n\n## Inference 🏃\n### Step 04: Run Greedy vs Classifier Guided Decoding (NQ and XSum)\n\nWe perform decoding with `classifiers/classifier_anno-cnndm-7b_sliding_window_8.pkl` for both tasks to test the in-domain (XSum) and out-of-domain (NQ) performance of the Lookback Lens Guided Decoding.\n\n\u003e To load LLaMA2 models/tokenizers, please login with `huggingface-cli login`, or add the argument `--auth_token \u003chf_auth_token\u003e` where `\u003chf_auth_token\u003e` is your huggingface auth token with LLaMA2 access.\n\n\n```bash\n# Greedy (NQ)\npython step04_run_decoding.py --model_name meta-llama/Llama-2-7b-chat-hf/ --data_path data/nq-open-10_total_documents_gold_at_4.jsonl --output_path output-nq-open-greedy-decoding.jsonl 
--num_gpus 1\n# Lookback Lens Guided Decoding (NQ)\npython step04_run_decoding.py --model_name meta-llama/Llama-2-7b-chat-hf/ --data_path data/nq-open-10_total_documents_gold_at_4.jsonl --output_path output-nq-open-lookback-decoding.jsonl --num_gpus 1 --do_sample --guiding_classifier classifiers/classifier_anno-cnndm-7b_sliding_window_8.pkl --chunk_size 8 --num_candidates 8 \n```\n\n\n```bash\n# Greedy (XSum)\npython step04_run_decoding.py --model_name meta-llama/Llama-2-7b-chat-hf/ --data_path data/xsum-1000.jsonl --output_path output-xsum-greedy-decoding.jsonl --num_gpus 1\n# Lookback Lens Guided Decoding (XSum)\npython step04_run_decoding.py --model_name meta-llama/Llama-2-7b-chat-hf/ --data_path data/xsum-1000.jsonl --output_path output-xsum-lookback-decoding.jsonl --num_gpus 1 --do_sample --guiding_classifier classifiers/classifier_anno-cnndm-7b_sliding_window_8.pkl --chunk_size 8 --num_candidates 8 \n```\n\n#### If too slow: Parallel (Sharded) Inference\n\nRunning inference in sharded mode can be done by setting `--parallel --total_shard 4 --shard_id 0` for the first shard, `--parallel --total_shard 4 --shard_id 1` for the second shard, and so on. 
The dataset will be split into 4 shards and the inference of each shard can be run in parallel.\n\n## Evaluation 📊\n\n### Run Exact Match Evaluation (NQ)\n```bash\npython eval_exact_match.py --hyp output-nq-open-greedy-decoding.jsonl --ref data/nq-open-10_total_documents_gold_at_4.jsonl\npython eval_exact_match.py --hyp output-nq-open-lookback-decoding.jsonl --ref data/nq-open-10_total_documents_gold_at_4.jsonl\n```\n\nThe output will be similar to:\n```\n# Greedy\nBest span EM: 0.711864406779661\n# Lookback Lens Guided Decoding\nBest span EM: 0.7419962335216572 (by random sampling so the result may vary)\n```\n\n### Run GPT-4o Evaluation (XSum)\n```bash\nOPENAI_API_KEY={your_key} python step02_eval_gpt4o.py --hyp output-xsum-greedy-decoding.jsonl --ref data/xsum-1000.jsonl --out record-gpt4o-eval-xsum-greedy-decoding.jsonl \nOPENAI_API_KEY={your_key} python step02_eval_gpt4o.py --hyp output-xsum-lookback-decoding.jsonl --ref data/xsum-1000.jsonl --out record-gpt4o-eval-xsum-lookback-decoding.jsonl \n```\n\nThe output will be similar to:\n```\n# Greedy\nAccuracy: 0.490\n# Lookback Lens Guided Decoding\nAccuracy: 0.586\n(the result may vary due to the randomness of GPT-4o API and the randomness of sampling)\n```\n\n## Citation\n\nPlease cite our paper if it's helpful to your work!\n\n```\n@article{chuang2024lookback,\n  title={Lookback Lens: Detecting and Mitigating Contextual Hallucinations in Large Language Models Using Only Attention Maps},\n  author={Chuang, Yung-Sung and Qiu, Linlu and Hsieh, Cheng-Yu and Krishna, Ranjay and Kim, Yoon and Glass, James},\n  journal={arXiv preprint arXiv:2407.07071},\n  year={2024},\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvoidism%2FLookback-Lens","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fvoidism%2FLookback-Lens","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fvoidism%2FLookback-Lens/lists"}