https://github.com/RLHF-V/RLAIF-V
[CVPR'25] RLAIF-V: Aligning MLLMs through Open-Source AI Feedback for Super GPT-4V Trustworthiness
chatbot cvpr2025 gpt-4v llava llava-next minicpm-v multimodal rlaif-v vision-language-learning
- Host: GitHub
- URL: https://github.com/RLHF-V/RLAIF-V
- Owner: RLHF-V
- Created: 2024-05-13T07:58:37.000Z (12 months ago)
- Default Branch: main
- Last Pushed: 2025-03-03T11:40:28.000Z (2 months ago)
- Last Synced: 2025-03-03T12:34:19.562Z (2 months ago)
- Topics: chatbot, cvpr2025, gpt-4v, llava, llava-next, minicpm-v, multimodal, rlaif-v, vision-language-learning
- Language: Python
- Homepage:
- Size: 56.4 MB
- Stars: 300
- Watchers: 5
- Forks: 12
- Open Issues: 7
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
- awesome-RLAIF - Datasets and models
- StarryDivineSky - RLHF-V/RLAIF-V - RLAIF-V is an open-source project aimed at improving the reliability and safety of vision-language models such as GPT-4V through AI feedback. It builds on the RLAIF (Reinforcement Learning from AI Feedback) framework, using AI rather than humans to evaluate and improve model behavior. The core idea is to train a reward model that judges the quality of model outputs and to use this reward signal to optimize the vision-language model. The project's distinguishing features are its openness and its adaptation to advanced models such as GPT-4V, with the goal of making AI systems more trustworthy. The implementation consists of three stages: data collection, reward model training, and reinforcement-learning optimization. The code and pretrained models will be released so that researchers can reproduce and extend the work. The project is a CVPR 2025 paper, reflecting its academic value in computer vision. By relying on AI feedback, RLAIF-V is expected to reduce manual intervention and improve the efficiency and scalability of model training. (Multimodal large models / resource transfer and download)
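The description above outlines a three-stage pipeline (data collection, reward model training, RL optimization) driven by AI feedback. The following is a minimal conceptual sketch, not code from the RLAIF-V repository, of the first stage: scoring candidate responses with an AI judge and turning them into (prompt, chosen, rejected) preference pairs that a reward model or preference-optimization step could consume. The judge function, margin threshold, and all names here are illustrative assumptions.

```python
# Conceptual sketch (not the RLAIF-V codebase): build preference pairs from
# AI-scored candidate responses. The `judge` callable stands in for an
# open-source MLLM acting as the feedback model; everything here is illustrative.

from dataclasses import dataclass
from itertools import combinations
from typing import Callable, List, Tuple


@dataclass
class Candidate:
    response: str
    score: float  # trustworthiness score assigned by the AI judge


def collect_preference_pairs(
    prompt: str,
    responses: List[str],
    judge: Callable[[str, str], float],
    margin: float = 0.1,
) -> List[Tuple[str, str, str]]:
    """Score each candidate with the AI judge and keep (prompt, chosen, rejected)
    triples whose score gap exceeds `margin`, discarding ambiguous comparisons."""
    scored = [Candidate(r, judge(prompt, r)) for r in responses]
    pairs = []
    for a, b in combinations(scored, 2):
        if abs(a.score - b.score) < margin:
            continue  # scores too close to call; skip this pair
        chosen, rejected = (a, b) if a.score > b.score else (b, a)
        pairs.append((prompt, chosen.response, rejected.response))
    return pairs


if __name__ == "__main__":
    # Toy judge: penalizes longer answers as a placeholder for a real MLLM
    # judging hallucination / trustworthiness.
    def toy_judge(prompt: str, response: str) -> float:
        return 1.0 / (1 + len(response.split()))

    demo = collect_preference_pairs(
        "Describe the image.",
        [
            "A red bus parked near a fountain.",
            "A red bus, two elephants, and a spaceship parked near a fountain.",
        ],
        toy_judge,
    )
    for p, chosen, rejected in demo:
        print("CHOSEN:", chosen, "| REJECTED:", rejected)
```

In an actual pipeline, the toy judge would be replaced by an open-source multimodal model scoring each response against the image, and the resulting pairs would feed the reward-model training or preference-optimization stage described above.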