# Awesome MLLM Uncertainty [![Awesome](https://cdn.rawgit.com/sindresorhus/awesome/d7305f38d29fed78fa85652e3a63e154dd8e8829/media/badge.svg)](https://github.com/sindresorhus/awesome)

✨ A curated list of papers on uncertainty in multi-modal large language models (MLLMs).

### :star::star::star: If you find this repo useful, please star it!

## Discussion

🏄🏄🏄 Welcome to join our MLLM uncertainty discussion group (left QR code)! If the group QR code has expired, add my WeChat (right QR code) and I will invite you to the group~




## 🔥 News

- 2025.1: 🔥🔥🔥 Check out our latest work: [VL-Uncertainty](https://arxiv.org/abs/2411.11919), leveraging semantic-equivalent perturbation for refined LVLM uncertainty estimation! (A toy sketch of the core idea follows below.)
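
The perturb-and-measure idea can be illustrated with a minimal, hedged sketch: sample answers under several semantically equivalent rewordings of a question and take the entropy of the answer distribution as an uncertainty signal. The `answer_fn` callable and the exact-string clustering below are illustrative stand-ins, not the paper's implementation (VL-Uncertainty perturbs both the image and the prompt, and clusters answers by meaning rather than by string match).

```python
import math
from collections import Counter
from typing import Callable, List

def perturbation_uncertainty(
    answer_fn: Callable[[str], str],
    prompts: List[str],
) -> float:
    """Entropy of the answers a model gives to semantically equivalent
    rewordings of one question: low entropy = consistent answers (low
    uncertainty), high entropy = scattered answers (high uncertainty)."""
    answers = [answer_fn(p) for p in prompts]
    total = len(answers)
    counts = Counter(answers)
    return -sum((c / total) * math.log(c / total) for c in counts.values())

# Toy usage with a stand-in "model". A real run would query an LVLM with
# the image plus each perturbed question and cluster answers by meaning.
if __name__ == "__main__":
    fake_model = lambda prompt: "a cat" if "photo" in prompt else "a dog"
    rephrasings = [
        "What animal is in the photo?",
        "Which animal does the image show?",
        "Name the animal pictured here.",
    ]
    print(perturbation_uncertainty(fake_model, rephrasings))  # ~0.64 nats
```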

## Awesome List

+ **Calibration-MLLM** [Unveiling Uncertainty: A Deep Dive into Calibration and Performance of Multimodal Large Language Models](https://arxiv.org/abs/2412.14660) (19 Dec 2024, COLING 2025)
[![arXiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2412.14660)
[![Star](https://img.shields.io/github/stars/hfutml/Calibration-MLLM.svg?style=social&label=Star)](https://github.com/hfutml/Calibration-MLLM)

+ **DropoutDecoding** [From Uncertainty to Trust: Enhancing Reliability in Vision-Language Models with Uncertainty-Guided Dropout Decoding](https://arxiv.org/abs/2412.06474) (9 Dec 2024)
[![arXiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2412.06474)
[![Star](https://img.shields.io/github/stars/kigb/DropoutDecoding.svg?style=social&label=Star)](https://github.com/kigb/DropoutDecoding)

+ **IDK** [I Don't Know: Explicit Modeling of Uncertainty with an \[IDK\] Token](https://arxiv.org/abs/2412.06676) (9 Dec 2024, NeurIPS 2024)
[![arXiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2412.06676)
[![Star](https://img.shields.io/github/stars/roi-hpi/IDK-token-tuning.svg?style=social&label=Star)](https://github.com/roi-hpi/IDK-token-tuning)

+ **BayesVLM** [Post-hoc Probabilistic Vision-Language Models](https://arxiv.org/abs/2412.06014) (8 Dec 2024)
[![arXiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2412.06014)
[![Star](https://img.shields.io/github/stars/AaltoML/bayesVLM.svg?style=social&label=Star)](https://github.com/AaltoML/bayesVLM)

+ **Verb Mirage** [Verb Mirage: Unveiling and Assessing Verb Concept Hallucinations in Multimodal Large Language Models](https://arxiv.org/abs/2412.04939) (6 Dec 2024)
[![arXiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2412.04939)

+ **PUNC** [Towards Understanding and Quantifying Uncertainty for Text-to-Image Generation](https://arxiv.org/abs/2412.03178) (4 Dec 2024)
[![arXiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2412.03178)

+ **UA-CLM** [Enhancing Trust in Large Language Models with Uncertainty-Aware Fine-Tuning](https://arxiv.org/abs/2412.02904) (3 Dec 2024)
[![arXiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2412.02904)

+ **HEIE** [HEIE: MLLM-Based Hierarchical Explainable AIGC Image Implausibility Evaluator](https://arxiv.org/abs/2411.17261) (26 Nov 2024)
[![arXiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2411.17261)

+ **VL-Uncertainty** [VL-Uncertainty: Detecting Hallucination in Large Vision-Language Model via Uncertainty Estimation](https://arxiv.org/abs/2411.11919) (18 Nov 2024)
[![arXiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2411.11919)
[![Star](https://img.shields.io/github/stars/Ruiyang-061X/VL-Uncertainty.svg?style=social&label=Star)](https://github.com/Ruiyang-061X/VL-Uncertainty)

+ **MUB** [Exploring Response Uncertainty in MLLMs: An Empirical Evaluation under Misleading Scenarios](https://arxiv.org/abs/2411.02708) (5 Nov 2024)
[![arXiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2411.02708)
[![Star](https://img.shields.io/github/stars/Yunkai696/MUB.svg?style=social&label=Star)](https://github.com/Yunkai696/MUB)

+ **CrossPred-LVLM** [Can We Predict Performance of Large Models across Vision-Language Tasks?](https://arxiv.org/abs/2410.10112) (14 Oct 2024)
[![arXiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2410.10112)
[![Star](https://img.shields.io/github/stars/qinyu-allen-zhao/crosspred-lvlm.svg?style=social&label=Star)](https://github.com/qinyu-allen-zhao/crosspred-lvlm)

+ **TRON** [Sample then Identify: A General Framework for Risk Control and Assessment in Multimodal Large Language Models](https://arxiv.org/abs/2410.08174) (10 Oct 2024)
[![arXiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2410.08174)

+ [Reference-free Hallucination Detection for Large Vision-Language Models](https://arxiv.org/abs/2408.05767) (11 Aug 2024, EMNLP 2024 Findings)
[![arXiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2408.05767)

+ **Semantic Entropy** [Detecting Hallucinations in Large Language Models Using Semantic Entropy](https://www.nature.com/articles/s41586-024-07421-0) (19 Jun 2024, Nature; a toy sketch of this idea appears after the list)
[![Star](https://img.shields.io/github/stars/jlko/semantic_uncertainty.svg?style=social&label=Star)](https://github.com/jlko/semantic_uncertainty)

+ **UAL** [Uncertainty Aware Learning for Language Model Alignment](https://arxiv.org/abs/2406.04854) (7 Jun 2024, ACL 2024)
[![arXiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2406.04854)

+ **HIO** [Alleviating Hallucinations in Large Vision-Language Models through Hallucination-Induced Optimization](https://arxiv.org/abs/2405.15356) (24 May 2024, NeurIPS 2024)
[![arXiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2405.15356)
[![Star](https://img.shields.io/github/stars/BT-C/HIO.svg?style=social&label=Star)](https://github.com/BT-C/HIO)

+ [Overconfidence is Key: Verbalized Uncertainty Evaluation in Large Language and Vision-Language Models](https://arxiv.org/abs/2405.02917) (5 May 2024, TrustNLP 2024)
[![arXiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2405.02917)

+ **Consistency and Uncertainty** [Consistency and Uncertainty: Identifying Unreliable Responses From Black-Box Vision-Language Models for Selective Visual Question Answering](https://arxiv.org/abs/2404.10193) (16 Apr 2024, CVPR 2024)
[![arXiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2404.10193)

+ **UPD** [Unsolvable Problem Detection: Evaluating Trustworthiness of Vision Language Models](https://arxiv.org/abs/2403.20331) (29 Mar 2024)
[![arXiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2403.20331)
[![Star](https://img.shields.io/github/stars/AtsuMiyai/UPD.svg?style=social&label=Star)](https://github.com/AtsuMiyai/UPD)

+ **ICD** [Mitigating Hallucinations in Large Vision-Language Models with Instruction Contrastive Decoding](https://arxiv.org/abs/2403.18715) (27 Mar 2024, ACL 2024 Findings)
[![arXiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2403.18715)

+ **LAP** [LAP, Using Action Feasibility for Improved Uncertainty Alignment of Large Language Model Planners](https://arxiv.org/abs/2403.13198) (Mar 2024)
[![arXiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2403.13198)

+ **The First to Know** [The First to Know: How Token Distributions Reveal Hidden Knowledge in Large Vision-Language Models?](https://arxiv.org/abs/2403.09037) (14 Mar 2024, ECCV 2024)
[![arXiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2403.09037)
[![Star](https://img.shields.io/github/stars/Qinyu-Allen-Zhao/LVLM-LP.svg?style=social&label=Star)](https://github.com/Qinyu-Allen-Zhao/LVLM-LP)

+ **VLM-Uncertainty-Bench** [Uncertainty-Aware Evaluation for Vision-Language Models](https://arxiv.org/abs/2402.14418) (22 Feb 2024)
[![arXiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2402.14418)
[![Star](https://img.shields.io/github/stars/EnSec-AI/VLM-Uncertainty-Bench.svg?style=social&label=Star)](https://github.com/EnSec-AI/VLM-Uncertainty-Bench)

+ **LogicCheckGPT** [Logical Closed Loop: Uncovering Object Hallucinations in Large Vision-Language Models](https://arxiv.org/abs/2402.11622) (18 Feb 2024, ACL 2024 Findings)
[![arXiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2402.11622)
[![Star](https://img.shields.io/github/stars/CRIPAC-DIG/LogicCheckGPT.svg?style=social&label=Star)](https://github.com/CRIPAC-DIG/LogicCheckGPT)

+ **UQ_ICL** [Uncertainty Quantification for In-Context Learning of Large Language Models](https://arxiv.org/abs/2402.10189) (15 Feb 2024, NAACL 2024)
[![arXiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2402.10189)
[![Star](https://img.shields.io/github/stars/lingchen0331/UQ_ICL.svg?style=social&label=Star)](https://github.com/lingchen0331/UQ_ICL)

+ **IntroPlan** [Introspective Planning: Aligning Robots' Uncertainty with Inherent Task Ambiguity](https://arxiv.org/abs/2402.06529) (9 Feb 2024, NeurIPS 2024)
[![arXiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2402.06529)
[![Star](https://img.shields.io/github/stars/kevinliang888/IntroPlan.svg?style=social&label=Star)](https://github.com/kevinliang888/IntroPlan)

+ **LLM-Uncertainty-Bench** [Benchmarking LLMs via Uncertainty Quantification](https://arxiv.org/abs/2401.12794) (23 Jan 2024, NeurIPS 2024 Datasets & Benchmarks)
[![arXiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2401.12794)
[![Star](https://img.shields.io/github/stars/smartyfh/LLM-Uncertainty-Bench.svg?style=social&label=Star)](https://github.com/smartyfh/LLM-Uncertainty-Bench)

+ **CD-CCA** [Cloud-Device Collaborative Learning for Multimodal Large Language Models](https://arxiv.org/abs/2312.16279) (26 Dec 2023, CVPR 2024)
[![arXiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2312.16279)

+ **VCD** [Mitigating Object Hallucinations in Large Vision-Language Models through Visual Contrastive Decoding](https://arxiv.org/abs/2311.16922) (28 Nov 2023, CVPR 2024 Highlight)
[![arXiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2311.16922)
[![Star](https://img.shields.io/github/stars/DAMO-NLP-SG/VCD.svg?style=social&label=Star)](https://github.com/DAMO-NLP-SG/VCD)

+ **LURE** [Analyzing and Mitigating Object Hallucination in Large Vision-Language Models](https://arxiv.org/abs/2310.00754) (1 Oct 2023, ICLR 2024)
[![arXiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2310.00754)
[![Star](https://img.shields.io/github/stars/YiyangZhou/LURE.svg?style=social&label=Star)](https://github.com/YiyangZhou/LURE)

+ **PAU** [Prototype-based Aleatoric Uncertainty Quantification for Cross-modal Retrieval](https://arxiv.org/abs/2309.17093) (29 Sep 2023, NeurIPS 2023)
[![arXiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2309.17093)
[![Star](https://img.shields.io/github/stars/leolee99/PAU.svg?style=social&label=Star)](https://github.com/leolee99/PAU)

+ **KnowNo** [Robots That Ask For Help: Uncertainty Alignment for Large Language Model Planners](https://arxiv.org/abs/2307.01928) (4 Jul 2023, CoRL 2023, Best Student Paper)
[![arXiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2307.01928)
[![Star](https://img.shields.io/github/stars/google-research/google-research.svg?style=social&label=Star)](https://github.com/google-research/google-research/tree/master/language_model_uncertainty)

+ **ProbVLM** [ProbVLM: Probabilistic Adapter for Frozen Vision-Language Models](https://arxiv.org/abs/2307.00398) (1 Jul 2023, ICCV 2023)
[![arXiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2307.00398)
[![Star](https://img.shields.io/github/stars/ExplainableML/ProbVLM.svg?style=social&label=Star)](https://github.com/ExplainableML/ProbVLM)

+ **GAVIE** [Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning](https://arxiv.org/abs/2306.14565) (26 Jun 2023, ICLR 2024)
[![arXiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2306.14565)
[![Star](https://img.shields.io/github/stars/FuxiaoLiu/LRV-Instruction.svg?style=social&label=Star)](https://github.com/FuxiaoLiu/LRV-Instruction)

+ [Can LLMs Express Their Uncertainty? An Empirical Evaluation of Confidence Elicitation in LLMs](https://arxiv.org/abs/2306.13063) (22 Jun 2023, ICLR 2024)
[![arXiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2306.13063)
[![Star](https://img.shields.io/github/stars/MiaoXiong2320/llm-uncertainty.svg?style=social&label=Star)](https://github.com/MiaoXiong2320/llm-uncertainty)

+ **UQ-NLG** [Generating with Confidence: Uncertainty Quantification for Black-box Large Language Models](https://arxiv.org/abs/2305.19187) (30 May 2023, TMLR)
[![arXiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2305.19187)
[![Star](https://img.shields.io/github/stars/zlin7/UQ-NLG.svg?style=social&label=Star)](https://github.com/zlin7/UQ-NLG)

+ **POPE** [Evaluating Object Hallucination in Large Vision-Language Models](https://arxiv.org/abs/2305.10355) (17 May 2023, EMNLP 2023)
[![arXiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2305.10355)
[![Star](https://img.shields.io/github/stars/RUCAIBox/POPE.svg?style=social&label=Star)](https://github.com/RUCAIBox/POPE)

+ **Semantic Uncertainty** [Semantic Uncertainty: Linguistic Invariances for Uncertainty Estimation in Natural Language Generation](https://arxiv.org/abs/2302.09664) (19 Feb 2023, ICLR 2023 Spotlight)
[![arXiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2302.09664)
[![Star](https://img.shields.io/github/stars/lorenzkuhn/semantic_uncertainty.svg?style=social&label=Star)](https://github.com/lorenzkuhn/semantic_uncertainty)

+ **MAP** [MAP: Multimodal Uncertainty-Aware Vision-Language Pre-training Model](https://arxiv.org/abs/2210.05335) (11 Oct 2022, CVPR 2023)
[![arXiv](https://img.shields.io/badge/arXiv-b31b1b.svg)](https://arxiv.org/abs/2210.05335)
[![Star](https://img.shields.io/github/stars/IIGROUP/MAP.svg?style=social&label=Star)](https://github.com/IIGROUP/MAP)
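
For readers new to the area, the semantic-entropy idea behind the **Semantic Entropy** and **Semantic Uncertainty** entries above can be summarized in a short, hedged sketch: sample several generations, group those that mean the same thing, and compute entropy over the groups rather than over raw strings. The `equivalent` predicate below is a toy stand-in; the papers decide equivalence with bidirectional NLI entailment and weight clusters by sequence likelihoods rather than uniformly.

```python
import math
from typing import Callable, List

def semantic_entropy(
    samples: List[str],
    equivalent: Callable[[str, str], bool],
) -> float:
    """Group sampled generations into meaning-equivalence clusters, then
    return the entropy over cluster frequencies (uniform sample weights)."""
    clusters: List[List[str]] = []
    for s in samples:
        for cluster in clusters:
            if equivalent(s, cluster[0]):
                cluster.append(s)
                break
        else:  # no existing cluster matched; start a new one
            clusters.append([s])
    n = len(samples)
    return -sum((len(c) / n) * math.log(len(c) / n) for c in clusters)

# Toy equivalence: normalized string match. The papers instead use
# bidirectional NLI entailment to decide if two answers share a meaning.
if __name__ == "__main__":
    answers = ["Paris", "paris", "It is Paris.", "Lyon"]
    same = lambda a, b: a.strip(" .").lower() == b.strip(" .").lower()
    print(semantic_entropy(answers, same))  # ~1.04 nats
```

Low semantic entropy means the samples agree on one meaning (the model is likely confident); high semantic entropy flags the scattered, unstable answers that the hallucination-detection papers above target.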