https://github.com/RLHF-V/RLHF-V
[CVPR'24] RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback
chatbot gpt-4 llama multi-modality multimodal rlhf-v visual-language-learning
- Host: GitHub
- URL: https://github.com/RLHF-V/RLHF-V
- Owner: RLHF-V
- Created: 2023-11-29T07:32:32.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-09-11T09:21:43.000Z (7 months ago)
- Last Synced: 2024-10-18T21:59:14.525Z (6 months ago)
- Topics: chatbot, gpt-4, llama, multi-modality, multimodal, rlhf-v, visual-language-learning
- Language: Python
- Homepage: https://rlhf-v.github.io
- Size: 70.6 MB
- Stars: 228
- Watchers: 2
- Forks: 6
- Open Issues: 1
Metadata Files:
- Readme: readme.md
Awesome Lists containing this project
- awesome-RLHF - official
- StarryDivineSky - RLHF-V/RLHF-V - RLHF-V is a CVPR 2024 work that aims to make multimodal large language models (MLLMs) more trustworthy through fine-grained correctional human feedback. It proposes a behavior-alignment method that trains the model on detailed human corrections so its behavior better matches human expectations. The core idea is to use this feedback to fix the model's errors in visual understanding and reasoning, improving its trustworthiness and reliability. The project focuses on raising the accuracy and consistency of MLLMs on visual inputs, so they understand images better and generate more faithful text descriptions. By fine-tuning the model's behavior in this way, the approach helps it make sounder judgments in complex scenes and strengthens user trust. In short, RLHF-V trains the AI with human "corrections" so that it understands images better, judges correctly, and ultimately earns more trust. (Multimodal large models / resource download)
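
At its core, the behavior alignment described above is preference optimization over (human-corrected, original) response pairs. Below is a minimal, hedged sketch of a DPO-style loss in that spirit; the function name, tensor shapes, and `beta` value are illustrative assumptions, not the repository's actual API (the paper's dense variant additionally up-weights the human-corrected segments).

```python
import torch
import torch.nn.functional as F

def preference_alignment_loss(
    policy_chosen_logps: torch.Tensor,    # log p_theta(corrected response | image, prompt), shape (B,)
    policy_rejected_logps: torch.Tensor,  # log p_theta(original hallucinated response | image, prompt), shape (B,)
    ref_chosen_logps: torch.Tensor,       # same quantities under a frozen reference model
    ref_rejected_logps: torch.Tensor,
    beta: float = 0.1,                    # illustrative temperature, not the repo's setting
) -> torch.Tensor:
    """Sketch of a DPO-style objective on (corrected, original) pairs.

    Pushes the policy toward the human-corrected response and away from
    the original one, regularized against the reference model. This is a
    generic illustration of the idea, not RLHF-V's exact training code.
    """
    # Implicit rewards: log-ratio of policy to reference for each response
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Bradley-Terry preference objective: -log sigmoid(reward margin)
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()
```

The fine-grained aspect of RLHF-V's feedback would enter such a loss through how the sequence log-probabilities are aggregated, e.g., by weighting the tokens inside corrected spans more heavily than untouched ones.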