https://github.com/Fanziyang-v/Awesome-MLLMs-Hallucination-Mitigation
# Awesome Multimodal Large Language Models Hallucination Mitigation
A curated list of awesome works on mitigating hallucination in Multimodal Large Language Models (MLLMs).

## :books:Survey

1. [Hallucination of Multimodal Large Language Models: A Survey](https://arxiv.org/abs/2404.18930) (Apr. 30, 2024)[![arxiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2404.18930)[![github](https://img.shields.io/github/stars/showlab/Awesome-MLLM-Hallucination)](https://github.com/showlab/Awesome-MLLM-Hallucination/)
2. [A Survey on Hallucination in Large Vision-Language Models](https://arxiv.org/abs/2402.00253) (Feb. 1, 2024)[![arxiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2402.00253)[![github](https://img.shields.io/github/stars/lhanchao777/LVLM-Hallucinations-Survey)](https://github.com/lhanchao777/LVLM-Hallucinations-Survey)

## :bar_chart:Benchmarks

1. [Object Hallucination in Image Captioning](https://arxiv.org/abs/1809.02156) (Sep. 6, 2018, **EMNLP 2018**) [![arxiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/1809.02156)![alias](https://img.shields.io/badge/CHAIR-black)
2. [Evaluating Object Hallucination in Large Vision-Language Models](https://arxiv.org/abs/2305.10355) (May. 17, 2023, **EMNLP 2023**) [![arxiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2305.10355)[![github](https://img.shields.io/github/stars/AoiDragon/POPE)](https://github.com/AoiDragon/POPE)![alias](https://img.shields.io/badge/POPE-black)
3. [Eyes Wide Shut? Exploring the Visual Shortcomings of Multimodal LLMs](https://arxiv.org/abs/2401.06209) (Jan. 11, 2024, **CVPR 2024**) [![arxiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2401.06209)[![github](https://img.shields.io/github/stars/tsb0601/MMVP)](https://github.com/tsb0601/MMVP)![alias](https://img.shields.io/badge/MMVP-black)
4. [MME: A Comprehensive Evaluation Benchmark for Multimodal Large Language Models](https://arxiv.org/abs/2306.13394) (Jun. 23, 2023) [![arxiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2306.13394)[![github](https://img.shields.io/github/stars/BradyFU/Awesome-Multimodal-Large-Language-Models)](https://github.com/BradyFU/Awesome-Multimodal-Large-Language-Models)![alias](https://img.shields.io/badge/MME-black)
5. [MASH-VLM: Mitigating Action-Scene Hallucination in Video-LLMs through Disentangled Spatial-Temporal Representations](http://arxiv.org/abs/2503.15871) (Mar. 20, 2025, **CVPR 2025**)[![arxiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](http://arxiv.org/abs/2503.15871)![alias](https://img.shields.io/badge/UNSCENE-black)
6. [AMBER: An LLM-free Multi-dimensional Benchmark for MLLMs Hallucination Evaluation](https://arxiv.org/abs/2311.07397) (Nov. 13, 2023) [![arxiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2311.07397)[![github](https://img.shields.io/github/stars/junyangwang0410/AMBER)](https://github.com/junyangwang0410/AMBER)![alias](https://img.shields.io/badge/AMBER-black)
7. [Aligning Large Multimodal Models with Factually Augmented RLHF](https://arxiv.org/abs/2309.14525) (Sep. 25, 2023, **ACL 2024**) [![arxiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2309.14525)[![github](https://img.shields.io/github/stars/llava-rlhf/LLaVA-RLHF)](https://github.com/llava-rlhf/LLaVA-RLHF)![alias](https://img.shields.io/badge/MMHal_Bench-black)

## :clap:Hallucination Mitigation

1. [MASH-VLM: Mitigating Action-Scene Hallucination in Video-LLMs through Disentangled Spatial-Temporal Representations](http://arxiv.org/abs/2503.15871) (Mar. 20, 2025, **CVPR 2025**) [![arxiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](http://arxiv.org/abs/2503.15871)![alias](https://img.shields.io/badge/MASH_VLM-black)
2. [ClearSight: Visual Signal Enhancement for Object Hallucination Mitigation in Multimodal Large Language Models](https://arxiv.org/abs/2503.13107) (Mar. 17, 2025) [![arxiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2503.13107)[![github](https://img.shields.io/github/stars/ustc-hyin/ClearSight)](https://github.com/ustc-hyin/ClearSight)![alias](https://img.shields.io/badge/ClearSight-black)
3. [Through the Magnifying Glass: Adaptive Perception Magnification for Hallucination-Free VLM Decoding](https://arxiv.org/abs/2503.10183) (Mar. 13, 2025)[![arxiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2503.10183)[![github](https://img.shields.io/github/stars/ShunqiM/PM)](https://github.com/ShunqiM/PM)![alias](https://img.shields.io/badge/PM-black)
4. [EAZY: Eliminating Hallucinations in LVLMs by Zeroing out Hallucinatory Image Tokens](https://arxiv.org/abs/2503.07772) (Mar. 10, 2025)[![arxiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2503.07772)![alias](https://img.shields.io/badge/EAZY-black)
5. [Mask-DPO: Generalizable Fine-grained Factuality Alignment of LLMs](https://arxiv.org/abs/2503.02846) (Mar. 4, 2025, **ICLR 2025**)[![arxiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](http://arxiv.org/abs/2503.02846)![alias](https://img.shields.io/badge/Mask_DPO-black)
6. [PerturboLLaVA: Reducing Multimodal Hallucinations with Perturbative Visual Training](https://arxiv.org/abs/2503.06486) (Mar. 9, 2025, **ICLR 2025**)[![arxiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2503.06486)![tag](https://img.shields.io/badge/Spotlight-FF4D00)![alias](https://img.shields.io/badge/PerturboLLaVA-black)
7. [Octopus: Alleviating Hallucination via Dynamic Contrastive Decoding](https://arxiv.org/abs/2503.00361) (Mar. 1, 2025, **CVPR 2025**)[![arxiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2503.00361)[![github](https://img.shields.io/github/stars/LijunZhang01/Octopus)](https://github.com/LijunZhang01/Octopus)![alias](https://img.shields.io/badge/Octopus-black)
8. [Refine Knowledge of Large Language Models via Adaptive Contrastive Learning](http://arxiv.org/abs/2502.07184) (Feb. 11, 2025, **ICLR 2025**)[![arxiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](http://arxiv.org/abs/2502.07184)
9. [Mitigating Hallucinations in Large Vision-Language Models via DPO: On-Policy Data Hold the Key](https://arxiv.org/abs/2501.09695) (Jan. 16, 2025, **CVPR 2025**)[![arxiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2501.09695)[![github](https://img.shields.io/github/stars/zhyang2226/OPA-DPO)](https://github.com/zhyang2226/OPA-DPO)![alias](https://img.shields.io/badge/OPA_DPO-black)
10. [VASparse: Towards Efficient Visual Hallucination Mitigation for Large Vision-Language Model via Visual-Aware Sparsification](https://arxiv.org/abs/2501.06553) (Jan. 11, 2025, **CVPR 2025**)[![arxiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2501.06553)[![github](https://img.shields.io/github/stars/mengchuang123/VASparse-github)](https://github.com/mengchuang123/VASparse-github)![alias](https://img.shields.io/badge/VASparse-black)
11. [Nullu: Mitigating Object Hallucinations in Large Vision-Language Models via HalluSpace Projection](https://arxiv.org/abs/2412.13817) (Dec. 18, 2024, **CVPR 2025**) [![arxiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2412.13817)[![github](https://img.shields.io/github/stars/Ziwei-Zheng/Nullu)](https://github.com/Ziwei-Zheng/Nullu)![alias](https://img.shields.io/badge/Nullu-black)
12. [Interpreting and Editing Vision-Language Representations to Mitigate Hallucinations](https://arxiv.org/abs/2410.02762) (Oct. 3, 2024, **ICLR 2025**) [![arxiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2410.02762)[![github](https://img.shields.io/github/stars/nickjiang2378/vl-interp)](https://github.com/nickjiang2378/vl-interp/)
13. [Devils in Middle Layers of Large Vision-Language Models: Interpreting, Detecting and Mitigating Object Hallucinations via Attention Lens](https://arxiv.org/abs/2411.16724) (Nov. 23, 2024, **CVPR 2025**) [![arxiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2411.16724)[![github](https://img.shields.io/github/stars/ZhangqiJiang07/middle_layers_indicating_hallucinations)](https://github.com/ZhangqiJiang07/middle_layers_indicating_hallucinations)
14. [ICT: Image-Object Cross-Level Trusted Intervention for Mitigating Object Hallucination in Large Vision-Language Models](https://arxiv.org/abs/2411.15268) (Nov. 22, 2024, **CVPR 2025**) [![arxiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2411.15268)![alias](https://img.shields.io/badge/ICT-black)
15. [Eagle: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders](https://arxiv.org/abs/2408.15998) (Aug. 28, 2024, **ICLR 2025**) [![arxiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2408.15998)[![github](https://img.shields.io/github/stars/NVlabs/Eagle)](https://github.com/NVlabs/Eagle)![alias](https://img.shields.io/badge/Eagle-black)
16. [Self-Introspective Decoding: Alleviating Hallucinations for Large Vision-Language Models](http://arxiv.org/abs/2408.02032) (Aug. 4, 2024, **ICLR 2025**)[![arxiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2408.02032)[![github](https://img.shields.io/github/stars/huofushuo/SID)](https://github.com/huofushuo/SID)![alias](https://img.shields.io/badge/SID-black)
17. [Paying More Attention to Image: A Training-Free Method for Alleviating Hallucination in LVLMs](https://arxiv.org/abs/2407.21771v1) (Jul. 31, 2024, **ECCV 2024**)[![arxiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2407.21771v1)[![github](https://img.shields.io/github/stars/LALBJ/PAI)](https://github.com/LALBJ/PAI)![alias](https://img.shields.io/badge/PAI-black)
18. [Mitigating Object Hallucinations in Large Vision-Language Models with Assembly of Global and Local Attention](https://arxiv.org/abs/2406.12718) (Jun. 18, 2024, **CVPR 2025**)[![arxiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2406.12718)[![github](https://img.shields.io/github/stars/Lackel/AGLA)](https://github.com/Lackel/AGLA)![alias](https://img.shields.io/badge/AGLA-black)
19. [Reducing Hallucinations in Vision-Language Models via Latent Space Steering](https://arxiv.org/abs/2410.15778) (Oct. 21, 2024, **ICLR 2025**)[![arxiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2410.15778)[![github](https://img.shields.io/github/stars/shengliu66/VTI)](https://github.com/shengliu66/VTI)![tag](https://img.shields.io/badge/Spotlight-FF4D00)![alias](https://img.shields.io/badge/VTI-black)
20. [Woodpecker: Hallucination Correction for Multimodal Large Language Models](https://arxiv.org/abs/2310.16045) (Oct. 24, 2023) [![arxiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2310.16045)![alias](https://img.shields.io/badge/Woodpecker-black)
21. [Alleviating Hallucinations in Large Vision-Language Models through Hallucination-Induced Optimization](http://arxiv.org/abs/2405.15356) (May. 24, 2024, **NeurIPS 2024**)[![arxiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](http://arxiv.org/abs/2405.15356)[![github](https://img.shields.io/github/stars/BT-C/HIO)](https://github.com/BT-C/HIO)![alias](https://img.shields.io/badge/HIO-black)
22. [Mitigating Hallucinations in Large Vision-Language Models with Instruction Contrastive Decoding](https://arxiv.org/abs/2403.18715) (Mar. 27, 2024, **ACL 2024**)[![arxiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2403.18715)![alias](https://img.shields.io/badge/ICD-black)
23. [Mitigating Object Hallucination via Concentric Causal Attention](https://arxiv.org/abs/2410.15926) (Oct. 21, 2024, **NeurIPS 2024**)[![arxiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2410.15926)[![github](https://img.shields.io/github/stars/xing0047/cca-llava)](https://github.com/xing0047/cca-llava)![alias](https://img.shields.io/badge/CCA-black)
24. [DAMRO: Dive into the Attention Mechanism of LVLM to Reduce Object Hallucination](https://arxiv.org/abs/2410.04514) (Oct. 6, 2024, **EMNLP 2024**)[![arxiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2410.04514)![alias](https://img.shields.io/badge/DAMRO-black)
25. [Skip \n: A Simple Method to Reduce Hallucination in Large Vision-Language Models](https://arxiv.org/abs/2402.01345) (Feb. 2, 2024, **ICLR 2024**)[![arxiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2402.01345)[![github](https://img.shields.io/github/stars/hanmenghan/Skip-n)](https://github.com/hanmenghan/Skip-n)![alias](https://img.shields.io/badge/SKIP\n-black)
26. [Eyes Wide Shut? Exploring the Visual Shortcomings of Multimodal LLMs](https://arxiv.org/abs/2401.06209) (Jan. 11, 2024, **CVPR 2024**)[![arxiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2401.06209)[![github](https://img.shields.io/github/stars/tsb0601/MMVP)](https://github.com/tsb0601/MMVP)![alias](https://img.shields.io/badge/MoF-black)
27. [Hallucination Augmented Contrastive Learning for Multimodal Large Language Model](https://arxiv.org/abs/2312.06968) (Dec. 12, 2023, **CVPR 2024**)[![arxiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2312.06968)[![github](https://img.shields.io/github/stars/X-PLUG/mPLUG-HalOwl)](https://github.com/X-PLUG/mPLUG-HalOwl)![alias](https://img.shields.io/badge/HACL-black)
28. [OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allocation](https://arxiv.org/abs/2311.17911) (Nov. 29, 2023, **CVPR 2024**)[![arxiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2311.17911)[![github](https://img.shields.io/github/stars/shikiw/OPERA)](https://github.com/shikiw/OPERA)![tag](https://img.shields.io/badge/Highlight-FF4D00)![alias](https://img.shields.io/badge/OPERA-black)
29. [Mitigating Object Hallucinations in Large Vision-Language Models through Visual Contrastive Decoding](https://arxiv.org/abs/2311.16922) (Nov. 28, 2023, **CVPR 2024**)[![arxiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2311.16922)[![github](https://img.shields.io/github/stars/DAMO-NLP-SG/VCD)](https://github.com/DAMO-NLP-SG/VCD)![tag](https://img.shields.io/badge/Highlight-FF4D00)![alias](https://img.shields.io/badge/VCD-black)
30. [Analyzing and Mitigating Object Hallucination in Large Vision-Language Models](https://arxiv.org/abs/2310.00754) (Oct. 1, 2023, **ICLR 2024**)[![arxiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2310.00754)[![github](https://img.shields.io/github/stars/YiyangZhou/LURE)](https://github.com/YiyangZhou/LURE)![alias](https://img.shields.io/badge/LURE-black)
31. [DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models](https://arxiv.org/abs/2309.03883) (Sep. 7, 2023, **ICLR 2024**)[![arxiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2309.03883)[![github](https://img.shields.io/github/stars/voidism/DoLa)](https://github.com/voidism/DoLa)![alias](https://img.shields.io/badge/DoLa-black)

## :star:Acknowledgment

This project is inspired by [Awesome-MLLM-Hallucination](https://github.com/showlab/Awesome-MLLM-Hallucination) and [Awesome-Multimodal-Large-Language-Models](https://github.com/BradyFU/Awesome-Multimodal-Large-Language-Models). Thanks for their contributions to the research community.