https://github.com/idiap/hallucination-detection

Host: GitHub
URL: https://github.com/idiap/hallucination-detection
Owner: idiap
License: mit
Created: 2022-10-28T15:19:15.000Z (over 3 years ago)
Default Branch: main
Last Pushed: 2022-12-05T10:44:27.000Z (over 3 years ago)
Last Synced: 2025-04-15T03:46:32.024Z (about 1 year ago)
Language: Python
Size: 680 KB
Stars: 9
Watchers: 5
Forks: 2
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

# hallucination-detection
This repository contains the code and data for the paper [Unsupervised Token-level Hallucination Detection from Summary Generation By-products](#) by Andreas Marfurt and James Henderson, presented at the GEM workshop at EMNLP 2022.

## Contents
1. [Short Description](#short-description)
2. [Data](#data)
3. [Installation](#installation)
4. [Usage](#usage)
5. [Results](#results)
6. [Contact](#contact)
7. [Acknowledgments](#acknowledgments)
8. [Citation](#citation)

## Short Description
Our method *BART-GBP* gives token-level hallucination probabilities for summaries generated by BART. We use the `facebook/bart-large-cnn` model made available by Hugging Face [on their model hub](https://huggingface.co/facebook/bart-large-cnn). We first align the summary and source document with the help of BART's cross-attention, then classify aligned tokens for intrinsic hallucination and unaligned tokens for extrinsic hallucination. Our method was evaluated on CNN/DailyMail, but we expect it to perform similarly on other equally extractive summarization datasets.
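
To make the alignment step more concrete, here is a minimal sketch of one way to align summary tokens to source tokens from a cross-attention matrix. It is not the repository's implementation: the function name, the argmax rule and the `min_weight` threshold are assumptions chosen for illustration only.

```python
from typing import List, Optional


def align_by_cross_attention(
    attention: List[List[float]],
    min_weight: float = 0.5,  # hypothetical threshold, not taken from the paper
) -> List[Optional[int]]:
    """Map each summary token to its most-attended source token, or None."""
    alignment: List[Optional[int]] = []
    for row in attention:  # one row of source-token weights per summary token
        best = max(range(len(row)), key=row.__getitem__)
        # Tokens without a dominant source position are left unaligned.
        alignment.append(best if row[best] >= min_weight else None)
    return alignment


# Toy example: 3 summary tokens attending over 4 source tokens.
toy_attention = [
    [0.7, 0.1, 0.1, 0.1],      # clearly aligned to source token 0
    [0.05, 0.05, 0.1, 0.8],    # clearly aligned to source token 3
    [0.25, 0.25, 0.25, 0.25],  # diffuse attention: treated as unaligned
]
print(align_by_cross_attention(toy_attention))  # [0, 3, None]
```

In this toy setup, the first two tokens would go to the intrinsic classifier and the third to the extrinsic one.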

## Data
We provide the following data:
- [data/frank_annotations.jsonl](data/frank_annotations.jsonl): Token-level hallucination annotations of 250 CNN/DM summaries with 15700 words, of which 57 (0.4%) are hallucinations (31 intrinsic, 26 extrinsic).
- [data/tlhd-cnndm_annotations.jsonl](data/tlhd-cnndm_annotations.jsonl): Token-level hallucination annotations of 150 CNN/DM summaries, one selected sentence per summary, 2100 words with 299 (14.2%) hallucinations (51 intrinsic, 248 extrinsic).
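
The field names inside the annotation files are not documented here, so the sketch below only reads a JSONL file and counts its top-level keys. It assumes one JSON object per line; `read_jsonl` and `field_counts` are illustrative helper names, not functions from this repository.

```python
import json
from collections import Counter
from pathlib import Path
from typing import Dict, List


def read_jsonl(path: str) -> List[Dict]:
    """Parse one JSON object per non-empty line."""
    with Path(path).open(encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]


def field_counts(records: List[Dict]) -> Counter:
    """Count how often each top-level key occurs, to discover the schema."""
    return Counter(key for record in records for key in record)
```

From the repository root, `field_counts(read_jsonl("data/frank_annotations.jsonl"))` lists the available fields before you write any code that depends on them.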

## Installation
First, install conda, e.g. from [Miniconda](https://docs.conda.io/en/latest/miniconda.html). Then create and activate the environment:
```
conda env create -f environment.yml
conda activate hallucination-detection
```

## Usage
To reproduce the results of BART-GBP on the FRANK dataset, run the following steps:
1. Get BART's outputs (attentions and decoding entropies): [Get BART Outputs](#get-bart-outputs)
2. Compute the scores (association strength, fraction unaligned, inverse decoding entropy): [Compute BART-GBP Scores](#compute-bart-gbp-scores)
3. Evaluate the scores by computing the points on the precision/recall curve: [Evaluate Scores](#evaluate-scores)
4. Plot the results: [Plot Results](#plot-results)

For the TLHD-CNNDM dataset, please adjust the paths of outputs, scores, predictions and results.

### Get BART Outputs
First we need to save BART's outputs from summary generation.
```
python save_bart_outputs_for_alignment.py --cross_attention_layers 9 10 --encoder_layers 9 10
```

Since we use beam search decoding, we have to select the cross-attentions, encoder self-attentions and decoding probabilities of the beam that is ultimately selected. This is taken care of in the `BartSummarizer` model by the `GenerationMixinEncoderDecoder` mixin.
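
The idea of recovering the selected beam's per-step outputs can be sketched as a backtrace over parent pointers. This is a generic illustration, not the code in `GenerationMixinEncoderDecoder`: the data layout (`step_values`, `parents`) and the function name are assumptions.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def trace_selected_beam(
    step_values: Sequence[Sequence[T]],
    parents: Sequence[Sequence[int]],
    final_beam: int,
) -> List[T]:
    """Collect per-step values along the path that ends in `final_beam`.

    step_values[t][b] is whatever was recorded at step t for beam slot b
    (for example a cross-attention row). parents[t][b] is the slot at
    step t - 1 that beam b was extended from; parents[0] is never read.
    """
    path: List[T] = []
    beam = final_beam
    for t in range(len(step_values) - 1, -1, -1):
        path.append(step_values[t][beam])
        if t > 0:
            beam = parents[t][beam]  # follow the pointer one step back
    path.reverse()
    return path


# Toy example with 2 beams and 3 decoding steps.
values = [["a0", "b0"], ["a1", "b1"], ["a2", "b2"]]
parents = [[0, 0], [0, 0], [1, 0]]
print(trace_selected_beam(values, parents, final_beam=0))  # ['a0', 'b1', 'a2']
```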

As this is a research project, we store these outputs so that we can run multiple experiments on them. In production, one would integrate the subsequent steps into summary generation to compute hallucination probabilities in an online fashion.

### Compute BART-GBP Scores
The following script computes the scores for each token in the summary (sentence):
```
python bart_gbp_scores.py
```
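
One of the scores named in the usage steps is the inverse decoding entropy. The sketch below computes the Shannon entropy of a single next-token distribution. How the repository inverts or normalizes it is not stated here, so `inverse_entropy` and its `eps` parameter are a hypothetical choice for illustration.

```python
import math
from typing import Sequence


def decoding_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of one next-token distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def inverse_entropy(probs: Sequence[float], eps: float = 1e-6) -> float:
    """Hypothetical inversion 1 / (H + eps); the repository may differ."""
    return 1.0 / (decoding_entropy(probs) + eps)


peaked = [0.97, 0.01, 0.01, 0.01]   # decoder is confident
uniform = [0.25, 0.25, 0.25, 0.25]  # decoder is maximally uncertain
print(decoding_entropy(peaked) < decoding_entropy(uniform))  # True
```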

### Evaluate Scores
To run evaluation, we first convert the token scores into word probabilities of hallucination:
```
python convert_token_scores_to_word_probs.py
```
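
As a rough illustration of going from tokens to words, the sketch below groups BART's byte-level BPE tokens into words using the `Ġ` marker, which prefixes tokens that follow a space. Taking the maximum token score per word is an assumption made for this example; the aggregation used by `convert_token_scores_to_word_probs.py` may be different.

```python
from typing import List, Sequence, Tuple


def token_scores_to_word_scores(
    tokens: Sequence[str],
    scores: Sequence[float],
    word_start: str = "Ġ",
) -> List[Tuple[str, float]]:
    """Group subword tokens into words and keep the max score per word."""
    words: List[Tuple[str, float]] = []
    for i, (token, score) in enumerate(zip(tokens, scores)):
        has_marker = token.startswith(word_start)
        piece = token[len(word_start):] if has_marker else token
        if i == 0 or has_marker:
            words.append((piece, score))  # start a new word
        else:
            text, best = words[-1]
            words[-1] = (text + piece, max(best, score))  # extend the word
    return words


toy_tokens = ["The", "Ġmayor", "Ġres", "igned"]
toy_scores = [0.1, 0.2, 0.3, 0.9]
print(token_scores_to_word_scores(toy_tokens, toy_scores))
# [('The', 0.1), ('mayor', 0.2), ('resigned', 0.9)]
```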

Then we compute the points on the precision/recall curve by varying the hallucination probability threshold above which a data point is classified as a hallucination. The script also computes the ROC curve:
```
python evaluate_bart_gbp.py
```
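
The threshold sweep behind a precision/recall curve can be sketched as follows. This is a generic illustration rather than the logic of `evaluate_bart_gbp.py`; the function name and the convention of reporting precision 1.0 when nothing is predicted positive are assumptions.

```python
from typing import List, Sequence, Tuple


def precision_recall_points(
    probs: Sequence[float],
    labels: Sequence[int],
    thresholds: Sequence[float],
) -> List[Tuple[float, float, float]]:
    """Return (threshold, precision, recall) for each threshold."""
    points = []
    for threshold in thresholds:
        predicted = [p >= threshold for p in probs]
        tp = sum(1 for pred, y in zip(predicted, labels) if pred and y)
        fp = sum(1 for pred, y in zip(predicted, labels) if pred and not y)
        fn = sum(1 for pred, y in zip(predicted, labels) if not pred and y)
        precision = tp / (tp + fp) if tp + fp else 1.0  # nothing predicted
        recall = tp / (tp + fn) if tp + fn else 0.0     # no positives at all
        points.append((threshold, precision, recall))
    return points


toy_probs = [0.9, 0.8, 0.3, 0.1]
toy_labels = [1, 0, 1, 0]  # 1 = hallucination
for point in precision_recall_points(toy_probs, toy_labels, [0.5, 0.2]):
    print(point)
```

Lowering the threshold from 0.5 to 0.2 in this toy example raises recall from 0.5 to 1.0, because the second hallucinated item is now caught.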

### Plot Results
Once all results are computed, we plot the PR and ROC curves for intrinsic/extrinsic/all hallucinations with:
```
python plot_results.py
```

## Results
We've uploaded our model predictions and results for BART-GBP and the baselines [here](https://drive.google.com/file/d/1FN8GqXAInAsJhJH5tbdi1ygOHN5GBBMv/view?usp=sharing).

## Contact
In case of problems or questions, open a GitHub issue or write an email to andreas.marfurt [at] idiap.ch.

## Acknowledgments
This work was supported as part of the grant "Automated interpretation of political and economic policy documents: Machine learning using semantic and syntactic information", funded by the Swiss National Science Foundation (grant number CRSII5_180320).

## Citation
If you use our code, data or models, please cite us.
```
@inproceedings{marfurt-etal-2022-corpus,
    title = "Unsupervised Token-level Hallucination Detection from Summary Generation By-products",
    author = "Marfurt, Andreas and
      Henderson, James",
    booktitle = "Proceedings of the Second Workshop on Generation, Evaluation and Metrics",
    month = dec,
    year = "2022",
    publisher = "Association for Computational Linguistics",
}
```
