Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/OpenGVLab/InternVL
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance.
gpt gpt-4o gpt-4v image-classification image-text-retrieval llm multi-modal semantic-segmentation video-classification vision-language-model vit-22b vit-6b
Last synced: about 2 months ago
- Host: GitHub
- URL: https://github.com/OpenGVLab/InternVL
- Owner: OpenGVLab
- License: MIT
- Created: 2023-11-22T08:08:08.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2024-10-21T15:58:22.000Z (about 2 months ago)
- Last Synced: 2024-10-21T16:47:09.711Z (about 2 months ago)
- Topics: gpt, gpt-4o, gpt-4v, image-classification, image-text-retrieval, llm, multi-modal, semantic-segmentation, video-classification, vision-language-model, vit-22b, vit-6b
- Language: Python
- Homepage: https://internvl.readthedocs.io/en/latest/
- Size: 35.6 MB
- Stars: 5,778
- Watchers: 52
- Forks: 453
- Open Issues: 117
Metadata Files:
- Readme: README.md
- Contributing: .github/CONTRIBUTING.md
- License: LICENSE
Awesome Lists containing this project
- Awesome-Reasoning-Foundation-Models - code
- awesome-LLMs-finetuning - InternVL
- awesome-multi-modal - https://github.com/OpenGVLab/InternVL
- awesome-multi-modal - this https URL
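- AiTreasureBox - OpenGVLab/InternVL - [CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4V. A commercially usable open-source model approaching GPT-4V performance. (Repos)
- StarryDivineSky - OpenGVLab/InternVL - A pioneering open-source alternative to GPT-4o. A commercially usable open-source multimodal dialogue model approaching GPT-4o performance. InternVL 1.5 is an open-source multimodal large language model (MLLM) that aims to bridge the capability gap between open-source and proprietary commercial models in multimodal understanding. We introduce three simple designs. Strong vision encoder: we explore a continuous learning strategy for the large-scale vision foundation model InternViT-6B, boosting its visual understanding so that it can be transferred and reused across different LLMs. Dynamic high resolution: we divide each image into 1 to 40 tiles of 448 × 448 pixels according to the aspect ratio and resolution of the input, supporting up to 4K resolution input. High-quality bilingual dataset: we carefully collected a high-quality bilingual dataset that covers common scenes and document images, annotated with English and Chinese question-answer pairs, which significantly improves performance on OCR and Chinese-related tasks. (A01_Text Generation_Text Dialogue / Large Language Dialogue Models and Data)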
- awesome-vision-language-pretraining - InternVL
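
The StarryDivineSky entry above describes dynamic high resolution as splitting an image into 1 to 40 tiles of 448 × 448 pixels based on its aspect ratio and resolution. The following is a minimal sketch of how such a grid could be chosen, not the project's actual implementation. The helper names `choose_grid` and `tile_boxes` are hypothetical, and the selection rule (closest aspect ratio first, then the grid whose width is closest to the image's width) is an assumption made for illustration.

```python
from itertools import product

TILE = 448       # tile side in pixels, taken from the description above
MAX_TILES = 40   # upper bound on the tile count, taken from the description above


def choose_grid(width, height, tile=TILE, max_tiles=MAX_TILES):
    """Pick a (cols, rows) tile grid for an image of the given size.

    Assumed rule for this sketch: minimise the aspect-ratio mismatch first,
    then prefer the grid whose pixel width is closest to the image width.
    """
    if width <= 0 or height <= 0:
        raise ValueError("image dimensions must be positive")
    target = width / height
    # Every grid shape that uses between 1 and max_tiles tiles.
    candidates = [
        (cols, rows)
        for cols, rows in product(range(1, max_tiles + 1), repeat=2)
        if cols * rows <= max_tiles
    ]
    return min(
        candidates,
        key=lambda g: (abs(g[0] / g[1] - target), abs(g[0] * tile - width)),
    )


def tile_boxes(cols, rows, tile=TILE):
    """Return (left, top, right, bottom) crop boxes, row by row, for an
    image that has already been resized to cols*tile by rows*tile pixels."""
    return [
        (c * tile, r * tile, (c + 1) * tile, (r + 1) * tile)
        for r in range(rows)
        for c in range(cols)
    ]


if __name__ == "__main__":
    cols, rows = choose_grid(3584, 1792)
    print(cols, rows, len(tile_boxes(cols, rows)))
```

Under these assumptions, an image would be resized to `cols * 448` by `rows * 448` pixels and then cropped with the returned boxes; the resizing step itself is left out of the sketch.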