# Awesome Hallucination Papers in MLLMs
A curated list of papers about hallucination in multi-modal large language models (MLLMs).

## Survey Papers
This section collects survey papers on hallucination in MLLMs.
- **A Survey on Hallucination in Large Vision-Language Models** [[paper]](https://arxiv.org/pdf/2402.00253v1.pdf)
`Arxiv 2024/02`
- **Hallucination of Multimodal Large Language Models: A Survey** [[paper]](https://arxiv.org/abs/2404.18930)
`Arxiv 2024/04`
## Hallucination Evaluation
This section collects benchmark papers on evaluating hallucination in MLLMs; a minimal sketch of the polling-style protocol many of them use follows the list.
- **Evaluating Object Hallucination in Large Vision-Language Models** [[paper]](https://arxiv.org/pdf/2305.10355.pdf) [[code]](https://github.com/RUCAIBox/POPE)
`EMNLP 2023`
- **HallusionBench: You See What You Think? Or You Think What You See? An Image-Context Reasoning Benchmark Challenging for GPT-4V(ision), LLaVA-1.5, and Other Multi-modality Models** [[paper]](https://arxiv.org/pdf/2310.14566.pdf) [[code]](https://github.com/tianyi-lab/HallusionBench)
`CVPR 2024`
- *Aligning Large Multimodal Models with Factually Augmented RLHF* [[paper]](https://arxiv.org/pdf/2309.14525.pdf) [[code]](https://github.com/llava-rlhf/LLaVA-RLHF)
`Arxiv 2023/09`
- *An LLM-free Multi-dimensional Benchmark for MLLMs Hallucination Evaluation* [[paper]](https://arxiv.org/pdf/2311.07397.pdf) [[code]](https://github.com/junyangwang0410/AMBER)
`Arxiv 2023/11`
- *Holistic Analysis of Hallucination in GPT-4V(ision): Bias and Interference Challenges* [[paper]](https://arxiv.org/pdf/2311.03287.pdf) [[code]](https://github.com/gzcch/Bingo)
`Arxiv 2023/11`
- *Hallucination Benchmark in Medical Visual Question Answering* [[paper]](https://arxiv.org/pdf/2401.05827v1.pdf)
`Arxiv 2024/01`
- *The Instinctive Bias: Spurious Images lead to Hallucination in MLLMs* [[paper]](https://arxiv.org/pdf/2402.03757v1.pdf) [[code]](https://github.com/MasaiahHan/CorrelationQA)
`Arxiv 2024/02`
- *Unified Hallucination Detection for Multimodal Large Language Models* [[paper]](https://arxiv.org/pdf/2402.03190v1.pdf) [[code]](https://github.com/OpenKG-ORG/EasyDetect)
`Arxiv 2024/02`
- *Visual Hallucinations of Multi-modal Large Language Models* [[paper]](https://arxiv.org/pdf/2402.14683v1.pdf) [[code]](https://github.com/wenhuang2000/VHTest)
`Arxiv 2024/02`
- *Hal-Eval: A Universal and Fine-grained Hallucination Evaluation Framework for Large Vision Language Models* [[paper]](https://arxiv.org/pdf/2402.15721v1.pdf)
`Arxiv 2024/02`
- *PhD: A Prompted Visual Hallucination Evaluation Dataset* [[paper]](https://arxiv.org/pdf/2403.11116v1.pdf) [[code]](https://github.com/jiazhen-code/IntrinsicHallu)
`Arxiv 2024/03`
- *Unsolvable Problem Detection: Evaluating Trustworthiness of Vision Language Models* [[paper]](https://arxiv.org/pdf/2403.20331v1.pdf) [[code]](https://github.com/AtsuMiyai/UPD/)
`Arxiv 2024/04`
- *THRONE: An Object-based Hallucination Benchmark for the Free-form Generations of Large Vision-Language Models* [[paper]](https://arxiv.org/pdf/2405.05256v1)
`Arxiv 2024/05`
- *Evaluating the Quality of Hallucination Benchmarks for Large Vision-Language Models* [[code]](https://github.com/HQHBench/HQHBench)
`Arxiv 2024/06`
- *HaloQuest: A Visual Hallucination Dataset for Advancing Multimodal Reasoning* [[paper]](https://arxiv.org/pdf/2407.15680v1) [[code]](https://github.com/google/haloquest)
`Arxiv 2024/07`
- *Hallu-PI: Evaluating Hallucination in Multi-modal Large Language Models within Perturbed Inputs* [[paper]](https://arxiv.org/pdf/2408.01355v2) [[code]](https://github.com/NJUNLP/Hallu-PI)
`Arxiv 2024/08`
- *VideoHallucer: Evaluating Intrinsic and Extrinsic Hallucinations in Large Video-Language Models* [[paper]](https://arxiv.org/pdf/2406.16338v1) [[code]](https://github.com/patrick-tssn/VideoHallucer)
`Arxiv 2024/06`
- *Reefknot: A Comprehensive Benchmark for Relation Hallucination Evaluation, Analysis and Mitigation in Multimodal Large Language Models* [[paper]](https://arxiv.org/pdf/2408.09429v1)
`Arxiv 2024/08`
- *Understanding Multimodal Hallucination with Parameter-Free Representation Alignment (Pfram)* [[paper]](http://arxiv.org/pdf/2409.01151v1) [[code]](https://github.com/yellow-binary-tree/Pfram)
`Arxiv 2024/09`
- *Pre-Training Multimodal Hallucination Detectors with Corrupted Grounding Data* [[paper]](http://arxiv.org/pdf/2409.00238v1)
`Arxiv 2024/09`
- *Explore the Hallucination on Low-level Perception for MLLMs* [[paper]](https://arxiv.org/pdf/2409.09748v1)
`Arxiv 2024/09`
- *ODE: Open-Set Evaluation of Hallucinations in Multimodal Large Language Models* [[paper]](https://arxiv.org/pdf/2409.09318v1)
`Arxiv 2024/09`
- *FIHA: Autonomous Hallucination Evaluation in Vision-Language Models with Davidson Scene Graphs* [[paper]](https://arxiv.org/pdf/2409.13612v1) [[code]](https://anonymous.4open.science/r/FIHA-45BB)
`Arxiv 2024/09`
- *EventHallusion: Diagnosing Event Hallucinations in Video LLMs* [[paper]](https://arxiv.org/pdf/2409.16597v1) [[code]](https://github.com/Stevetich/EventHallusion)
`Arxiv 2024/09`
- *AUTOHALLUSION: Automatic Generation of Hallucination Benchmarks for Vision-Language Models* [[paper]](https://arxiv.org/pdf/2406.10900v2) [[code]](https://github.com/wuxiyang1996/AutoHallusion)
`Arxiv 2024/10`
- *Automatically Generating Visual Hallucination Test Cases for Multimodal Large Language Models* [[paper]](https://arxiv.org/pdf/2410.11242v1) [[code]](https://github.com/lycheeefish/VHExpansion)
`Arxiv 2024/10`
- *LongHalQA: Long-Context Hallucination Evaluation for MultiModal Large Language Models* [[paper]](https://arxiv.org/pdf/2410.09962v2) [[code]](https://github.com/hanqiu-hq/LongHalQA)
`Arxiv 2024/10`
- *The Curse of Multi-Modalities: Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio* [[paper]](https://arxiv.org/pdf/2410.12787v1) [[code]](https://github.com/DAMO-NLP-SG/CMM)
`Arxiv 2024/10`
- *AVHBench: A Cross-Modal Hallucination Benchmark for Audio-Visual Large Language Models* [[paper]](https://arxiv.org/pdf/2410.18325v1)
`Arxiv 2024/10`
- *Unified Triplet-Level Hallucination Evaluation for Large Vision-Language Models* [[paper]](https://arxiv.org/pdf/2410.23114v2) [[code]](https://github.com/wujunjie1998/Tri-HE)
`Arxiv 2024/11`
- *H-POPE: Hierarchical Polling-based Probing Evaluation of Hallucinations in Large Vision-Language Models* [[paper]](https://arxiv.org/pdf/2411.04077v1)
`Arxiv 2024/11`
- *VL-Uncertainty: Detecting Hallucination in Large Vision-Language Model via Uncertainty Estimation* [[paper]](https://arxiv.org/pdf/2411.11919v1) [[code]](https://github.com/Ruiyang-061X/VL-Uncertainty)
`Arxiv 2024/11`
- *ViBe: A Text-to-Video Benchmark for Evaluating Hallucination in Large Multimodal Models* [[paper]](https://arxiv.org/abs/2411.10867)
`Arxiv 2024/11`
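Many of the benchmarks above, beginning with POPE, reduce hallucination evaluation to balanced yes/no polling: the model is asked whether each of a set of present and absent objects appears in the image, and answering "yes" for an absent object counts as a hallucination. A minimal sketch of this protocol, assuming a hypothetical `model.ask(image, question)` interface:

```python
def pope_eval(model, image, present_objects, absent_objects):
    """Score one image with balanced yes/no object-existence polling.
    `model.ask(image, question) -> str` is a hypothetical interface."""
    queries = [(o, 1) for o in present_objects] + [(o, 0) for o in absent_objects]
    tp = fp = tn = fn = yes = 0
    for obj, label in queries:
        answer = model.ask(image, f"Is there a {obj} in the image? Please answer yes or no.")
        pred = int(answer.strip().lower().startswith("yes"))
        yes += pred
        if pred and label:
            tp += 1
        elif pred and not label:
            fp += 1  # "yes" on an absent object: an object hallucination
        elif label:
            fn += 1
        else:
            tn += 1
    precision = tp / max(tp + fp, 1)
    recall = tp / max(tp + fn, 1)
    return {
        "accuracy": (tp + tn) / len(queries),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / max(precision + recall, 1e-9),
        "yes_ratio": yes / len(queries),  # bias toward agreeing
    }
```

The yes-ratio is reported alongside accuracy and F1 because a model biased toward answering "yes" can score high recall while hallucinating heavily.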
## Hallucination Mitigation
This section collects papers on mitigating hallucination in MLLMs; a sketch of the contrastive-decoding idea shared by several entries below follows the list.
- **Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning** [[paper]](https://arxiv.org/pdf/2306.14565.pdf) [[code]](https://github.com/FuxiaoLiu/LRV-Instruction)
`ICLR 2024`
- **Analyzing and Mitigating Object Hallucination in Large Vision-Language Models** [[paper]](https://arxiv.org/pdf/2310.00754.pdf) [[code]](https://github.com/YiyangZhou/LURE)
`ICLR 2024`
- **VIGC: Visual Instruction Generation and Correction** [[paper]](https://arxiv.org/pdf/2308.12714.pdf) [[code]](https://github.com/opendatalab/VIGC)
`AAAI 2024`
- **OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allocation** [[paper]](https://arxiv.org/pdf/2311.17911.pdf) [[code]](https://github.com/shikiw/OPERA)
`CVPR 2024`
- **Mitigating Object Hallucinations in Large Vision-Language Models through Visual Contrastive Decoding** [[paper]](https://arxiv.org/pdf/2311.16922.pdf) [[code]](https://github.com/DAMO-NLP-SG/VCD)
`CVPR 2024`
- **Hallucination Augmented Contrastive Learning for Multimodal Large Language Model** [[paper]](https://arxiv.org/pdf/2312.06968.pdf)
`CVPR 2024`
- **RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback** [[paper]](https://arxiv.org/pdf/2312.00849.pdf) [[code]](https://github.com/RLHF-V/RLHF-V)
`CVPR 2024`
- *Detecting and Preventing Hallucinations in Large Vision Language Models* [[paper]](https://arxiv.org/pdf/2308.06394.pdf)
`Arxiv 2023/08`
- *Evaluation and Analysis of Hallucination in Large Vision-Language Models* [[paper]](https://arxiv.org/pdf/2308.15126.pdf) [[code]](https://github.com/junyangwang0410/HaELM)
`Arxiv 2023/08`
- *CIEM: Contrastive Instruction Evaluation Method for Better Instruction Tuning* [[paper]](https://arxiv.org/pdf/2308.15126.pdf)
`Arxiv 2023/09`
- *Evaluation and Mitigation of Agnosia in Multimodal Large Language Models* [[paper]](https://arxiv.org/pdf/2309.04041.pdf)
`Arxiv 2023/09`
- *Aligning Large Multimodal Models with Factually Augmented RLHF* [[paper]](https://arxiv.org/pdf/2309.14525.pdf) [[code]](https://github.com/llava-rlhf/LLaVA-RLHF)
`Arxiv 2023/09`
- *HallE-Switch: Rethinking and Controlling Object Existence Hallucinations in Large Vision Language Models for Detailed Caption* [[paper]](https://arxiv.org/pdf/2310.01779.pdf)
`Arxiv 2023/10`
- *Woodpecker: Hallucination Correction for Multimodal Large Language Models* [[paper]](https://arxiv.org/pdf/2310.16045.pdf) [[code]](https://github.com/BradyFU/Woodpecker)
`Arxiv 2023/10`
- *HalluciDoctor: Mitigating Hallucinatory Toxicity in Visual Instruction Data* [[paper]](https://arxiv.org/pdf/2311.13614.pdf) [[code]](https://github.com/Yuqifan1117/HalluciDoctor)
`Arxiv 2023/11`
- *VOLCANO: Mitigating Multimodal Hallucination through Self-Feedback Guided Revision* [[paper]](https://arxiv.org/pdf/2311.07362.pdf) [[code]](https://github.com/kaistAI/Volcano)
`Arxiv 2023/11`
- *Beyond Hallucinations: Enhancing LVLMs through Hallucination-Aware Direct Preference Optimization* [[paper]](https://arxiv.org/pdf/2311.16839.pdf)
`Arxiv 2023/11`
- *Mitigating Hallucination in Visual Language Models with Visual Supervision* [[paper]](https://arxiv.org/pdf/2311.16479.pdf)
`Arxiv 2023/11`
- *Mitigating Fine-Grained Hallucination by Fine-Tuning Large Vision-Language Models with Caption Rewrites* [[paper]](https://arxiv.org/pdf/2312.01701.pdf) [[code]](https://github.com/Anonymousanoy/FOHE)
`Arxiv 2023/12`
- *MOCHa: Multi-Objective Reinforcement Mitigating Caption Hallucinations* [[paper]](https://arxiv.org/pdf/2312.03631.pdf) [[code]](https://github.com/assafbk/mocha_code)
`Arxiv 2023/12`
- *Temporal Insight Enhancement: Mitigating Temporal Hallucination in Multimodal Large Language Models* [[paper]](https://arxiv.org/pdf/2401.09861v1.pdf)
`Arxiv 2024/01`
- *On the Audio Hallucinations in Large Audio-Video Language Models* [[paper]](https://arxiv.org/pdf/2401.09774v1.pdf)
`Arxiv 2024/01`
- *Skip \n: A simple method to reduce hallucination in Large Vision-Language Models* [[paper]](https://arxiv.org/pdf/2402.01345v1.pdf)
`Arxiv 2024/02`
- *Unified Hallucination Detection for Multimodal Large Language Models* [[paper]](https://arxiv.org/pdf/2402.03190v1.pdf) [[code]](https://github.com/OpenKG-ORG/EasyDetect)
`Arxiv 2024/02`
- *Mitigating Object Hallucination in Large Vision-Language Models via Classifier-Free Guidance* [[paper]](https://arxiv.org/pdf/2402.08680v1.pdf)
`Arxiv 2024/02`
- *EFUF: Efficient Fine-grained Unlearning Framework for Mitigating Hallucinations in Multimodal Large Language Models* [[paper]](https://arxiv.org/pdf/2402.09801v1.pdf)
`Arxiv 2024/02`
- *Logical Closed Loop: Uncovering Object Hallucinations in Large Vision-Language Models* [[paper]](https://arxiv.org/pdf/2402.11622v1.pdf) [[code]](https://github.com/Hyperwjf/LogicCheckGPT)
`Arxiv 2024/02`
- *Less is More: Mitigating Multimodal Hallucination from an EOS Decision Perspective* [[paper]](https://arxiv.org/pdf/2402.14545v1.pdf) [[code]](https://github.com/yuezih/less-is-more)
`Arxiv 2024/02`
- *Seeing is Believing: Mitigating Hallucination in Large Vision-Language Models via CLIP-Guided Decoding* [[paper]](https://arxiv.org/pdf/2402.15300v1.pdf)
`Arxiv 2024/02`
- *IBD: Alleviating Hallucinations in Large Vision-Language Models via Image-Biased Decoding* [[paper]](https://arxiv.org/pdf/2402.18476v1.pdf)
`Arxiv 2024/02`
- *HALC: Object Hallucination Reduction via Adaptive Focal-Contrast Decoding* [[paper]](https://arxiv.org/pdf/2403.00425v1.pdf) [[code]](https://github.com/BillChan226/HALC)
`Arxiv 2024/03`
- *Evaluating and Mitigating Number Hallucinations in Large Vision-Language Models: A Consistency Perspective* [[paper]](https://arxiv.org/pdf/2403.01373v1.pdf)
`Arxiv 2024/03`
- *Debiasing Large Visual Language Models* [[paper]](https://arxiv.org/pdf/2403.05262.pdf)
`Arxiv 2024/03`
- *AIGCs Confuse AI Too: Investigating and Explaining Synthetic Image-induced Hallucinations in Large Vision-Language Models* [[paper]](https://arxiv.org/pdf/2403.08542v1.pdf)
`Arxiv 2024/03`
- *What if...?: Counterfactual Inception to Mitigate Hallucination Effects in Large Multimodal Models* [[paper]](https://arxiv.org/pdf/2403.13513v1.pdf)
`Arxiv 2024/03`
- *Multi-Modal Hallucination Control by Visual Information Grounding* [[paper]](https://arxiv.org/pdf/2403.14003v1.pdf)
`Arxiv 2024/03`
- *Pensieve: Retrospect-then-Compare Mitigates Visual Hallucination* [[paper]](https://arxiv.org/pdf/2403.14401v1.pdf) [[code]](https://github.com/DingchenYang99/Pensieve)
`Arxiv 2024/03`
- *Hallucination Detection in Foundation Models for Decision-Making: A Flexible Definition and Review of the State of the Art* [[paper]](https://arxiv.org/pdf/2403.16527v1.pdf)
`Arxiv 2024/03`
- *Cartoon Hallucinations Detection: Pose-aware In Context Visual Learning* [[paper]](https://arxiv.org/pdf/2403.15048v2.pdf)
`Arxiv 2024/03`
- *Visual Hallucination: Definition, Quantification, and Prescriptive Remediations* [[paper]](https://arxiv.org/pdf/2403.17306v1.pdf)
`Arxiv 2024/03`
- *Exploiting Semantic Reconstruction to Mitigate Hallucinations in Vision-Language Models* [[paper]](https://arxiv.org/pdf/2403.16167v2.pdf)
`Arxiv 2024/03`
- *Mitigating Hallucinations in Large Vision-Language Models with Instruction Contrastive Decoding* [[paper]](https://arxiv.org/pdf/2403.18715v1.pdf)
`Arxiv 2024/03`
- *Automated Multi-level Preference for MLLMs* [[paper]](https://www.arxiv.org/pdf/2405.11165)
`Arxiv 2024/05`
- *CrossCheckGPT: Universal Hallucination Ranking for Multimodal Foundation Models* [[paper]](https://arxiv.org/pdf/2405.13684v1)
`Arxiv 2024/05`
- *VDGD: Mitigating LVLM Hallucinations in Cognitive Prompts by Bridging the Visual Perception Gap* [[paper]](http://arxiv.org/pdf/2405.15683v1)
`Arxiv 2024/05`
- *Alleviating Hallucinations in Large Vision-Language Models through Hallucination-Induced Optimization* [[paper]](https://arxiv.org/pdf/2405.15356v1)
`Arxiv 2024/05`
- *Mitigating Dialogue Hallucination for Large Vision Language Models via Adversarial Instruction Tuning* [[paper]](https://arxiv.org/pdf/2403.10492v2)
`Arxiv 2024/05`
- *RITUAL: Random Image Transformations as a Universal Anti-hallucination Lever in LVLMs* [[paper]](https://arxiv.org/pdf/2405.17821v1)
`Arxiv 2024/05`
- *MetaToken: Detecting Hallucination in Image Descriptions by Meta Classification* [[paper]](https://arxiv.org/pdf/2405.19186v1)
`Arxiv 2024/05`
- *Mitigating Object Hallucination via Data Augmented Contrastive Tuning* [[paper]](https://arxiv.org/pdf/2405.18654v1)
`Arxiv 2024/05`
- *NoiseBoost: Alleviating Hallucination with Noise Perturbation for Multimodal Large Language Models* [[paper]](https://arxiv.org/pdf/2405.20081v2) [[code]](https://kaiwu5.github.io/noiseboost)
`Arxiv 2024/06`
- *CODE: Contrasting Self-generated Description to Combat Hallucination in Large Multi-modal Models* [[paper]](https://arxiv.org/pdf/2406.01920v1) [[code]](https://ivy-lvlm.github.io/CODE/)
`Arxiv 2024/06`
- *Understanding Sounds, Missing the Questions: The Challenge of Object Hallucination in Large Audio-Language Models* [[paper]](http://arxiv.org/pdf/2406.08402v1)
`Arxiv 2024/06`
- *Detecting and Evaluating Medical Hallucinations in Large Vision Language Models* [[paper]](https://arxiv.org/pdf/2406.10185v1)
`Arxiv 2024/06`
- *AUTOHALLUSION: Automatic Generation of Hallucination Benchmarks for Vision-Language Models* [[paper]](https://arxiv.org/pdf/2406.10900v1)
`Arxiv 2024/06`
- *Hallucination Mitigation Prompts Long-term Video Understanding* [[paper]](https://arxiv.org/pdf/2406.11333v1) [[code]](https://github.com/lntzm/CVPR24Track-LongVideo)
`Arxiv 2024/06`
- *Do More Details Always Introduce More Hallucinations in LVLM-based Image Captioning?* [[paper]](https://arxiv.org/pdf/2406.12663v1)
`Arxiv 2024/06`
- *Does Object Grounding Really Reduce Hallucination of Large Vision-Language Models?* [[paper]](https://arxiv.org/pdf/2406.14492v1)
`Arxiv 2024/06`
- *VGA: Vision GUI Assistant - Minimizing Hallucinations through Image-Centric Fine-Tuning* [[paper]](https://arxiv.org/pdf/2406.14056v2)
`Arxiv 2024/06`
- *AGLA: Mitigating Object Hallucinations in Large Vision-Language Models with Assembly of Global and Local Attention* [[paper]](https://arxiv.org/pdf/2406.12718v2) [[code]](https://github.com/Lackel/AGLA)
`Arxiv 2024/06`
- *Evaluating and Analyzing Relationship Hallucinations in Large Vision-Language Models* [[paper]](https://arxiv.org/pdf/2406.16449v1) [[code]](https://github.com/mrwu-mac/R-Bench)
`Arxiv 2024/06`
- *Pelican: Correcting Hallucination in Vision-LLMs via Claim Decomposition and Program of Thought Verification* [[paper]](https://arxiv.org/pdf/2407.02352v1)
`Arxiv 2024/06`
- *Multi-Object Hallucination in Vision-Language Models* [[paper]](https://arxiv.org/pdf/2407.06192v1) [[code]](https://multi-object-hallucination.github.io/)
`Arxiv 2024/07`
- *Reflective Instruction Tuning: Mitigating Hallucinations in Large Vision-Language Models* [[paper]](https://arxiv.org/pdf/2407.11422v1) [[code]](https://zjr2000.github.io/projects/reverie)
`Arxiv 2024/07`
- *BEAF: Observing BEfore-AFter Changes to Evaluate Hallucination in Vision-language Models* [[paper]](https://arxiv.org/pdf/2407.13442v1) [[code]](https://beafbench.github.io/)
`Arxiv 2024/07`
- *Interpreting and Mitigating Hallucination in MLLMs through Multi-agent Debate* [[paper]](https://arxiv.org/pdf/2407.20505v1) [[code]](https://github.com/LZzz2000/MAD)
`Arxiv 2024/07`
- *Paying More Attention to Image: A Training-Free Method for Alleviating Hallucination in LVLMs* [[paper]](https://arxiv.org/pdf/2407.21771v1) [[code]](https://lalbj.github.io/projects/PAI/)
`Arxiv 2024/08`
- *Mitigating Multilingual Hallucination in Large Vision-Language Models* [[paper]](https://arxiv.org/pdf/2408.00550v1) [[code]](https://github.com/ssmisya/MHR)
`Arxiv 2024/08`
- *Alleviating Hallucination in Large Vision-Language Models with Active Retrieval Augmentation* [[paper]](https://arxiv.org/pdf/2408.00555v1)
`Arxiv 2024/08`
- *Self-Introspective Decoding: Alleviating Hallucinations for Large Vision-Language Models* [[paper]](https://arxiv.org/pdf/2408.02032v1) [[code]](https://github.com/huofushuo/SID)
`Arxiv 2024/08`
- *Mitigating Hallucinations in Large Vision-Language Models (LVLMs) via Language-Contrastive Decoding (LCD)* [[paper]](https://arxiv.org/pdf/2408.04664v1)
`Arxiv 2024/08`
- *Reference-free Hallucination Detection for Large Vision-Language Models* [[paper]](https://arxiv.org/pdf/2408.05767v1)
`Arxiv 2024/08`
- *Negative Object Presence Evaluation (NOPE) to Measure Object Hallucination in Vision-Language Models* [[paper]](https://arxiv.org/pdf/2310.05338v2)
`Arxiv 2024/08`
- *CLIP-DPO: Vision-Language Models as a Source of Preference for Fixing Hallucinations in LVLMs* [[paper]](https://arxiv.org/pdf/2408.10433v1)
`Arxiv 2024/08`
- *ConVis: Contrastive Decoding with Hallucination Visualization for Mitigating Hallucinations in Multimodal Large Language Models* [[paper]](https://arxiv.org/pdf/2408.13906v1) [[code]](https://github.com/yejipark-m/ConVis)
`Arxiv 2024/08`
- *Look, Compare, Decide: Alleviating Hallucination in Large Vision-Language Models via Multi-View Multi-Path Reasoning* [[code]](https://github.com/GasolSun36/MVP)
`Arxiv 2024/08`
- *Mitigating Hallucination in Visual-Language Models via Re-Balancing Contrastive Decoding* [[paper]](https://arxiv.org/pdf/2409.06485v1)
`Arxiv 2024/09`
- *EventHallusion: Diagnosing Event Hallucinations in Video LLMs* [[paper]](https://arxiv.org/pdf/2409.16597v1) [[code]](https://github.com/Stevetich/EventHallusion)
`Arxiv 2024/09`
- *A Unified Hallucination Mitigation Framework for Large Vision-Language Models* [[paper]](https://arxiv.org/pdf/2409.16494v1)
`Arxiv 2024/09`
- *HELPD: Mitigating Hallucination of LVLMs by Hierarchical Feedback Learning with Vision-enhanced Penalty Decoding* [[paper]](https://arxiv.org/pdf/2409.20429v1) [[code]](https://github.com/F-Yuan303/HELPD)
`Arxiv 2024/09`
- *Interpreting and Editing Vision-Language Representations to Mitigate Hallucinations* [[paper]](https://arxiv.org/pdf/2410.02762v1) [[code]](https://github.com/nickjiang2378/vl-interp)
`Arxiv 2024/10`
- *Look Twice Before You Answer: Memory-Space Visual Retracing for Hallucination Mitigation in Multimodal Large Language Models* [[paper]](https://arxiv.org/pdf/2410.03577v1) [[code]](https://github.com/1zhou-Wang/MemVR)
`Arxiv 2024/10`
- *Investigating and Mitigating Object Hallucinations in Pretrained Vision-Language (CLIP) Models* [[paper]](https://arxiv.org/pdf/2410.03176v1) [[code]](https://github.com/Yufang-Liu/clip_hallucination)
`Arxiv 2024/10`
- *Mitigating Modality Prior-Induced Hallucinations in Multimodal Large Language Models via Deciphering Attention Causality* [[paper]](https://arxiv.org/pdf/2410.04780v1) [[code]](https://github.com/The-Martyr/CausalMM)
`Arxiv 2024/10`
- *DAMRO: Dive into the Attention Mechanism of LVLM to Reduce Object Hallucination* [[paper]](https://arxiv.org/pdf/2410.04514v1)
`Arxiv 2024/10`
- *From Pixels to Tokens: Revisiting Object Hallucinations in Large Vision-Language Models* [[paper]](https://arxiv.org/pdf/2410.06795v1)
`Arxiv 2024/10`
- *Data-augmented phrase-level alignment for mitigating object hallucination* [[paper]](https://arxiv.org/pdf/2405.18654v2)
`Arxiv 2024/10`
- *Visual Description Grounding Reduces Hallucinations and Boosts Reasoning in LVLMs* [[paper]](https://arxiv.org/pdf/2405.15683v2) [[code]](https://github.com/Sreyan88/VDGD)
`Arxiv 2024/10`
- *Magnifier Prompt: Tackling Multimodal Hallucination via Extremely Simple Instructions* [[paper]](https://arxiv.org/pdf/2410.11701v1)
`Arxiv 2024/10`
- *MLLM can see? Dynamic Correction Decoding for Hallucination Mitigation* [[paper]](https://arxiv.org/pdf/2410.11779v1) [[code]](https://github.com/zjunlp/DeCo)
`Arxiv 2024/10`
- *Mitigating Hallucinations in Large Vision-Language Models via Summary-Guided Decoding* [[paper]](https://arxiv.org/pdf/2410.13321v1)
`Arxiv 2024/10`
- *Mitigating Object Hallucination via Concentric Causal Attention* [[paper]](https://arxiv.org/pdf/2410.15926v1) [[code]](https://github.com/xing0047/cca-llava.git)
`Arxiv 2024/10`
- *Can Large Audio-Language Models Truly Hear? Tackling Hallucinations with Multi-Task Assessment and Stepwise Audio Reasoning* [[paper]](https://arxiv.org/pdf/2410.16130v1)
`Arxiv 2024/10`
- *V-DPO: Mitigating Hallucination in Large Vision Language Models via Vision-Guided Direct Preference Optimization* [[paper]](https://arxiv.org/pdf/2411.02712v1) [[code]](https://github.com/YuxiXie/V-DPO)
`Arxiv 2024/11`
- *Mitigating Hallucination in Multimodal Large Language Model via Hallucination-targeted Direct Preference Optimization* [[paper]](https://arxiv.org/pdf/2411.10436v1)
`Arxiv 2024/11`
- *Seeing Clearly by Layer Two: Enhancing Attention Heads to Alleviate Hallucination in LVLMs* [[paper]](https://arxiv.org/pdf/2411.09968v1)
`Arxiv 2024/11`
- *Thinking Before Looking: Improving Multimodal LLM Reasoning via Mitigating Visual Hallucination* [[paper]](https://arxiv.org/pdf/2411.12591v1) [[code]](https://github.com/Terry-Xu-666/visual_inference_chain)
`Arxiv 2024/11`
- *CATCH: Complementary Adaptive Token-level Contrastive Decoding to Mitigate Hallucinations in LVLMs* [[paper]](https://arxiv.org/pdf/2411.12713v1)
`Arxiv 2024/11`
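A recurring recipe among the training-free methods above (e.g., VCD, IBD, HALC, and instruction contrastive decoding) is contrastive decoding: next-token logits conditioned on the clean image are contrasted against logits conditioned on a distorted input, suppressing tokens driven by language priors rather than visual evidence. Below is a minimal single-step sketch in the spirit of VCD; the `model(input_ids, image=...)` call returning HuggingFace-style `.logits` is an assumption, not any specific repo's API.

```python
import torch

@torch.no_grad()
def vcd_step(model, input_ids, image, noisy_image, alpha=1.0, beta=0.1):
    """One VCD-style decoding step (sketch; the model API is assumed)."""
    # Next-token logits conditioned on the clean vs. distorted image.
    clean = model(input_ids, image=image).logits[:, -1, :]
    noisy = model(input_ids, image=noisy_image).logits[:, -1, :]
    # Amplify what the clean image supports relative to the distorted one.
    contrast = (1 + alpha) * clean - alpha * noisy
    # Adaptive plausibility constraint: only tokens the clean distribution
    # itself deems plausible remain candidates.
    probs = torch.softmax(clean, dim=-1)
    keep = probs >= beta * probs.max(dim=-1, keepdim=True).values
    return contrast.masked_fill(~keep, float("-inf"))
```

Sampling then proceeds from the softmax of the returned logits; `alpha` controls contrast strength and `beta` the plausibility cutoff, following the formulation in the VCD paper.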