# Awesome Multimodal Large Language Models Hallucination Mitigation
This is a list of awesome works on mitigating hallucination in large multimodal models.

## :books:Survey
1. [Hallucination of Multimodal Large Language Models: A Survey](https://arxiv.org/abs/2404.18930) (Apr. 30, 2024) [[Code](https://github.com/showlab/Awesome-MLLM-Hallucination/)]
2. [A Survey on Hallucination in Large Vision-Language Models](https://arxiv.org/abs/2402.00253) (Feb. 1, 2024) [[Code](https://github.com/lhanchao777/LVLM-Hallucinations-Survey)]

## :bar_chart:Benchmarks
1. [Object Hallucination in Image Captioning](https://arxiv.org/abs/1809.02156) (Sep. 6, 2018, **EMNLP 2018**) (introduces the CHAIR metric; see the sketch after this list)
2. [Evaluating Object Hallucination in Large Vision-Language Models](https://arxiv.org/abs/2305.10355) (May 17, 2023, **EMNLP 2023**) [[Code](https://github.com/AoiDragon/POPE)]
3. [Eyes Wide Shut? Exploring the Visual Shortcomings of Multimodal LLMs](https://arxiv.org/abs/2401.06209) (Jan. 11, 2024, **CVPR 2024**) [[Code](https://github.com/tsb0601/MMVP)]
4. [MME: A Comprehensive Evaluation Benchmark for Multimodal Large Language Models](https://arxiv.org/abs/2306.13394) (Jun. 23, 2023) [[Code](https://github.com/BradyFU/Awesome-Multimodal-Large-Language-Models)]
5. [MASH-VLM: Mitigating Action-Scene Hallucination in Video-LLMs through Disentangled Spatial-Temporal Representations](http://arxiv.org/abs/2503.15871) (Mar. 20, 2025, **CVPR 2025**)
6. [AMBER: An LLM-free Multi-dimensional Benchmark for MLLMs Hallucination Evaluation](https://arxiv.org/abs/2311.07397) (Nov. 13, 2023) [[Code](https://github.com/junyangwang0410/AMBER)]
7. [Aligning Large Multimodal Models with Factually Augmented RLHF](https://arxiv.org/abs/2309.14525) (Sep. 25, 2023, **ACL 2024**) [[Code](https://github.com/llava-rlhf/LLaVA-RLHF)]
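For intuition, here is a minimal sketch of the two classic object-hallucination metrics behind entries 1 and 2 above: CHAIR (the fraction of mentioned objects that do not appear in the image) and POPE-style polling accuracy over yes/no object questions. The captions, vocabulary, and answers below are hypothetical toy data; the real CHAIR implementation matches against MSCOCO object categories with synonym lists, and POPE samples negative objects with random/popular/adversarial strategies.

```python
# Toy sketch of CHAIR and POPE-style scoring; all data here is hypothetical.
from typing import Iterable

def chair_i(captions: Iterable[str], gt_objects: set, vocab: set) -> float:
    """CHAIR_i: hallucinated object mentions / all object mentions.
    An object mention is 'hallucinated' if it is in the caption but not
    in the image's ground-truth object set."""
    mentioned, hallucinated = 0, 0
    for cap in captions:
        words = set(cap.lower().split())
        for obj in vocab & words:      # objects the caption actually mentions
            mentioned += 1
            if obj not in gt_objects:
                hallucinated += 1
    return hallucinated / max(mentioned, 1)

def pope_accuracy(answers: list, labels: list) -> float:
    """POPE-style scoring: the model answers 'Is there a <object> in the
    image?'; its yes/no answer is compared against the binary label."""
    correct = sum(ans.strip().lower().startswith("yes") == lab
                  for ans, lab in zip(answers, labels))
    return correct / len(answers)

# Usage with toy data:
caps = ["a dog sits on a sofa next to a cat"]
print(chair_i(caps, gt_objects={"dog", "sofa"}, vocab={"dog", "cat", "sofa", "car"}))
# -> 0.333...: 'cat' is hallucinated (1 of 3 mentioned objects)
print(pope_accuracy(["Yes", "No", "Yes"], [True, False, False]))  # -> 0.666...
```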
## :clap:Hallucination Mitigation

1. [MASH-VLM: Mitigating Action-Scene Hallucination in Video-LLMs through Disentangled Spatial-Temporal Representations](http://arxiv.org/abs/2503.15871) (Mar. 20, 2025, **CVPR 2025**)
2. [ClearSight: Visual Signal Enhancement for Object Hallucination Mitigation in Multimodal Large Language Models](https://arxiv.org/abs/2503.13107) (Mar. 17, 2025) [[Code](https://github.com/ustc-hyin/ClearSight)]
3. [Through the Magnifying Glass: Adaptive Perception Magnification for Hallucination-Free VLM Decoding](https://arxiv.org/abs/2503.10183) (Mar. 13, 2025) [[Code](https://github.com/ShunqiM/PM)]
4. [EAZY: Eliminating Hallucinations in LVLMs by Zeroing out Hallucinatory Image Tokens](https://arxiv.org/abs/2503.07772) (Mar. 10, 2025)
5. [Mask-DPO: Generalizable Fine-grained Factuality Alignment of LLMs](https://arxiv.org/abs/2503.02846) (Mar. 4, 2025, **ICLR 2025**)
6. [PerturboLLaVA: Reducing Multimodal Hallucinations with Perturbative Visual Training](https://arxiv.org/abs/2503.06486) (Mar. 9, 2025, **ICLR 2025**)
7. [Octopus: Alleviating Hallucination via Dynamic Contrastive Decoding](https://arxiv.org/abs/2503.00361) (Mar. 1, 2025, **CVPR 2025**) [[Code](https://github.com/LijunZhang01/Octopus)]
8. [Refine Knowledge of Large Language Models via Adaptive Contrastive Learning](http://arxiv.org/abs/2502.07184) (Feb. 11, 2025, **ICLR 2025**)
9. [Mitigating Hallucinations in Large Vision-Language Models via DPO: On-Policy Data Hold the Key](https://arxiv.org/abs/2501.09695) (Jan. 16, 2025, **CVPR 2025**) [[Code](https://github.com/zhyang2226/OPA-DPO)]
10. [VASparse: Towards Efficient Visual Hallucination Mitigation for Large Vision-Language Model via Visual-Aware Sparsification](https://arxiv.org/abs/2501.06553) (Jan. 11, 2025, **CVPR 2025**) [[Code](https://github.com/mengchuang123/VASparse-github)]
11. [Nullu: Mitigating Object Hallucinations in Large Vision-Language Models via HalluSpace Projection](https://arxiv.org/abs/2412.13817) (Dec. 18, 2024, **CVPR 2025**) [[Code](https://github.com/Ziwei-Zheng/Nullu)]
12. [Interpreting and Editing Vision-Language Representations to Mitigate Hallucinations](https://arxiv.org/abs/2410.02762) (Oct. 3, 2024, **ICLR 2025**) [[Code](https://github.com/nickjiang2378/vl-interp/)]
13. [Devils in Middle Layers of Large Vision-Language Models: Interpreting, Detecting and Mitigating Object Hallucinations via Attention Lens](https://arxiv.org/abs/2411.16724) (Nov. 23, 2024, **CVPR 2025**) [[Code](https://github.com/ZhangqiJiang07/middle_layers_indicating_hallucinations)]
14. [ICT: Image-Object Cross-Level Trusted Intervention for Mitigating Object Hallucination in Large Vision-Language Models](https://arxiv.org/abs/2411.15268) (Nov. 22, 2024, **CVPR 2025**)
15. [Eagle: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders](https://arxiv.org/abs/2408.15998) (Aug. 28, 2024, **ICLR 2025**) [[Code](https://github.com/NVlabs/Eagle)]
16. [Self-Introspective Decoding: Alleviating Hallucinations for Large Vision-Language Models](http://arxiv.org/abs/2408.02032) (Aug. 4, 2024, **ICLR 2025**) [[Code](https://github.com/huofushuo/SID)]
17. [Paying More Attention to Image: A Training-Free Method for Alleviating Hallucination in LVLMs](https://arxiv.org/abs/2407.21771v1) (Jul. 31, 2024, **ECCV 2024**) [[Code](https://github.com/LALBJ/PAI)]
18. [Mitigating Object Hallucinations in Large Vision-Language Models with Assembly of Global and Local Attention](https://arxiv.org/abs/2406.12718) (Jun. 18, 2024, **CVPR 2025**) [[Code](https://github.com/Lackel/AGLA)]
19. [Reducing Hallucinations in Vision-Language Models via Latent Space Steering](https://arxiv.org/abs/2410.15778) (Oct. 21, 2024, **ICLR 2025**) [[Code](https://github.com/shengliu66/VTI)]
20. [Woodpecker: Hallucination Correction for Multimodal Large Language Models](https://arxiv.org/abs/2310.16045) (Oct. 24, 2023)
21. [Alleviating Hallucinations in Large Vision-Language Models through Hallucination-Induced Optimization](http://arxiv.org/abs/2405.15356) (May 24, 2024, **NeurIPS 2024**) [[Code](https://github.com/BT-C/HIO)]
22. [Mitigating Hallucinations in Large Vision-Language Models with Instruction Contrastive Decoding](https://arxiv.org/abs/2403.18715) (Mar. 27, 2024, **ACL 2024**)
23. [Mitigating Object Hallucination via Concentric Causal Attention](https://arxiv.org/abs/2410.15926) (Oct. 21, 2024, **NeurIPS 2024**) [[Code](https://github.com/xing0047/cca-llava)]
24. [DAMRO: Dive into the Attention Mechanism of LVLM to Reduce Object Hallucination](https://arxiv.org/abs/2410.04514) (Oct. 6, 2024, **EMNLP 2024**)
25. [Skip \n: A Simple Method to Reduce Hallucination in Large Vision-Language Models](https://arxiv.org/abs/2402.01345) (Feb. 2, 2024, **ICLR 2024**) [[Code](https://github.com/hanmenghan/Skip-n)]
26. [Eyes Wide Shut? Exploring the Visual Shortcomings of Multimodal LLMs](https://arxiv.org/abs/2401.06209) (Jan. 11, 2024, **CVPR 2024**) [[Code](https://github.com/tsb0601/MMVP)]
27. [Hallucination Augmented Contrastive Learning for Multimodal Large Language Model](https://arxiv.org/abs/2312.06968) (Dec. 12, 2023, **CVPR 2024**) [[Code](https://github.com/X-PLUG/mPLUG-HalOwl)]
28. [OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allocation](https://arxiv.org/abs/2311.17911) (Nov. 29, 2023, **CVPR 2024**) [[Code](https://github.com/shikiw/OPERA)]
29. [Mitigating Object Hallucinations in Large Vision-Language Models through Visual Contrastive Decoding](https://arxiv.org/abs/2311.16922) (Nov. 28, 2023, **CVPR 2024**) [[Code](https://github.com/DAMO-NLP-SG/VCD)] (see the decoding sketch after this list)
30. [Analyzing and Mitigating Object Hallucination in Large Vision-Language Models](https://arxiv.org/abs/2310.00754) (Oct. 1, 2023, **ICLR 2024**) [[Code](https://github.com/YiyangZhou/LURE)]
31. [DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models](https://arxiv.org/abs/2309.03883) (Sep. 7, 2023, **ICLR 2024**) [[Code](https://github.com/voidism/DoLa)]
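Several entries above share a contrastive-decoding core: VCD contrasts logits from the original image against a noise-distorted one, instruction contrastive decoding perturbs the instruction, and DoLa contrasts a mature layer against a premature one, in each case down-weighting tokens driven by language priors rather than evidence. Below is a minimal numpy sketch of a VCD-style step under stated assumptions: `logits_clean` and `logits_distorted` stand in for one decoding step of the same model on the original and distorted images (random vectors here, not real model outputs), and `alpha`/`beta` play the amplification and adaptive-plausibility roles described in the paper.

```python
# Minimal sketch of one visual-contrastive-decoding step; the logits are
# hypothetical stand-ins for real model outputs on clean/distorted images.
import numpy as np

def vcd_step(logits_clean: np.ndarray,
             logits_distorted: np.ndarray,
             alpha: float = 1.0,
             beta: float = 0.1) -> np.ndarray:
    """Return contrastive logits for the next-token distribution."""
    # Contrast: amplify what the clean image supports over what survives
    # distortion (i.e., what the language prior alone would produce).
    contrast = (1.0 + alpha) * logits_clean - alpha * logits_distorted

    # Adaptive plausibility constraint: never resurrect tokens that the
    # clean-image model itself finds implausible (prob < beta * max prob).
    probs = np.exp(logits_clean - logits_clean.max())
    probs /= probs.sum()
    return np.where(probs >= beta * probs.max(), contrast, -np.inf)

# Usage with random stand-in logits over a 32k vocabulary:
rng = np.random.default_rng(0)
clean, distorted = rng.normal(size=32000), rng.normal(size=32000)
next_token = int(np.argmax(vcd_step(clean, distorted)))
```

The plausibility mask matters in practice: the raw contrast can hand probability mass to low-likelihood tokens, so candidates are first restricted to tokens the clean-image distribution already considers plausible.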
## :star:Acknowledgment
This project is inspired by [Awesome-MLLM-Hallucination](https://github.com/showlab/Awesome-MLLM-Hallucination) and [Awesome-Multimodal-Large-Language-Models](https://github.com/BradyFU/Awesome-Multimodal-Large-Language-Models). Thanks for their contributions to the research community.