https://github.com/OpenGVLab/VisionLLM
VisionLLM Series
generalist-model large-language-models object-detection
Last synced: 9 months ago
- Host: GitHub
- URL: https://github.com/OpenGVLab/VisionLLM
- Owner: OpenGVLab
- License: apache-2.0
- Created: 2023-05-18T10:32:39.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2024-07-02T04:38:17.000Z (over 1 year ago)
- Last Synced: 2024-08-02T15:54:55.998Z (over 1 year ago)
- Topics: generalist-model, large-language-models, object-detection
- Language: Python
- Homepage: https://arxiv.org/abs/2305.11175
- Size: 17.4 MB
- Stars: 778
- Watchers: 40
- Forks: 16
- Open Issues: 11
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
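- StarryDivineSky - OpenGVLab/VisionLLM - ...4 and others, and continues to explore new architectures and training methods. VisionLLM models can perform a variety of tasks, including image captioning, visual question answering, and image generation. The core approach typically involves encoding visual information as vector representations, fusing them with text information, and then using architectures such as the Transformer for learning and inference. The project aims to advance multimodal artificial intelligence and to provide a foundation for more intelligent vision applications. It provides code, model weights, datasets, and other resources for researchers and developers. VisionLLM's goal is to build general-purpose, efficient vision-language models that solve complex real-world problems. (Multimodal large models / Resource transfer and download)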