Projects in Awesome Lists by OpenGVLab
A curated list of projects in awesome lists by OpenGVLab.
https://github.com/OpenGVLab/InternVL
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o's performance.
gpt gpt-4o gpt-4v image-classification image-text-retrieval llm multi-modal semantic-segmentation video-classification vision-language-model vit-22b vit-6b
Last synced: 16 Mar 2025
https://github.com/OpenGVLab/LLaMA-Adapter
[ICLR 2024] Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters
Last synced: 14 Mar 2025
https://github.com/OpenGVLab/DragGAN
Unofficial implementation of DragGAN - "Drag Your GAN: Interactive Point-based Manipulation on the Generative Image Manifold" (full-featured DragGAN implementation with an online demo and local deployment; code and models fully open-sourced; supports Windows, macOS, and Linux)
draggan gradio-interface image-editing image-generation interngpt
Last synced: 02 Apr 2025
https://github.com/OpenGVLab/InternGPT
InternGPT (iGPT) is an open-source demo platform where you can easily showcase your AI models. It currently supports DragGAN, ChatGPT, ImageBind, multimodal chat like GPT-4, SAM, interactive image editing, etc. Try it at igpt.opengvlab.com (an online demo system supporting DragGAN, ChatGPT, ImageBind, and SAM)
chatgpt click draggan foundation-model gpt gpt-4 gradio husky image-captioning imagebind internimage langchain llama llm multimodal sam segment-anything vicuna video-generation vqa
Last synced: 27 Mar 2025
https://github.com/OpenGVLab/Ask-Anything
[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! Also supports many more LMs such as MiniGPT-4, StableLM, and MOSS.
big-model captioning-videos chat chatgpt foundation-models gradio langchain large-language-models large-model stablelm video video-question-answering video-understanding
Last synced: 24 Mar 2025
https://github.com/OpenGVLab/InternImage
[CVPR 2023 Highlight] InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions
backbone deformable-convolution foundation-model object-detection semantic-segmentation
Last synced: 20 Mar 2025
https://github.com/OpenGVLab/InternVideo
[ECCV2024] Video Foundation Models & Data for Multimodal Understanding
action-recognition benchmark contrastive-learning foundation-models instruction-tuning masked-autoencoder multimodal open-set-recognition self-supervised spatio-temporal-action-localization temporal-action-localization video-clip video-data video-dataset video-question-answering video-retrieval video-understanding vision-transformer zero-shot-classification zero-shot-retrieval
Last synced: 20 Mar 2025
https://github.com/OpenGVLab/SAM-Med2D
Official implementation of SAM-Med2D
Last synced: 04 Apr 2025
https://github.com/OpenGVLab/VideoMamba
[ECCV2024] VideoMamba: State Space Model for Efficient Video Understanding
Last synced: 07 May 2025
https://github.com/OpenGVLab/VisionLLM
VisionLLM Series
generalist-model large-language-models object-detection
Last synced: 07 Apr 2025
https://github.com/OpenGVLab/OmniQuant
[ICLR2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.
large-language-models llm quantization
Last synced: 07 May 2025
https://github.com/OpenGVLab/VideoMAEv2
[CVPR 2023] VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking
action-detection action-recognition cvpr2023 foundation-model self-supervised-learning temporal-action-detection video-understanding
Last synced: 16 Mar 2025
https://github.com/OpenGVLab/Multi-Modality-Arena
Chatbot Arena meets multi-modality! Multi-Modality Arena allows you to benchmark vision-language models side-by-side while providing images as inputs. Supports MiniGPT-4, LLaMA-Adapter V2, LLaVA, BLIP-2, and many more!
chat chatbot chatgpt gradio large-language-models llms multi-modality vision-language-model vqa
Last synced: 03 Apr 2025
https://github.com/OpenGVLab/All-Seeing
[ICLR 2024] This is the official implementation of the paper "The All-Seeing Project: Towards Panoptic Visual Recognition and Understanding of the Open World"
all-seeing dataset region-text
Last synced: 24 Jul 2025
https://github.com/opengvlab/scalecua
ScaleCUA is an open-source family of computer-use agents that operate in cross-platform environments (Windows, macOS, Ubuntu, Android).
computer-use-agents data gui-agents models online-evaluation-suite scalecua
Last synced: 28 Oct 2025
https://github.com/OpenGVLab/Instruct2Act
Instruct2Act: Mapping Multi-modality Instructions to Robotic Actions with Large Language Model
chatgpt clip llm robotics segment-anything
Last synced: 06 May 2025
https://github.com/OpenGVLab/Vision-RWKV
Vision-RWKV: Efficient and Scalable Visual Perception with RWKV-Like Architectures
Last synced: 22 Jul 2025
https://github.com/OpenGVLab/CaFo
[CVPR 2023] Prompt, Generate, then Cache: Cascade of Foundation Models makes Strong Few-shot Learners
Last synced: 20 Mar 2025
https://github.com/opengvlab/unmasked_teacher
[ICCV2023 Oral] Unmasked Teacher: Towards Training-Efficient Video Foundation Models
Last synced: 25 Oct 2025
https://github.com/OpenGVLab/LAMM
[NeurIPS 2023 Datasets and Benchmarks Track] LAMM: Multi-Modal Large Language Models and Applications as AI Agents
Last synced: 27 Jul 2025
https://github.com/OpenGVLab/UniFormerV2
[ICCV2023] UniFormerV2: Spatiotemporal Learning by Arming Image ViTs with Video UniFormer
Last synced: 20 Mar 2025
https://github.com/opengvlab/humanbench
Official implementation of HumanBench (CVPR 2023)
Last synced: 30 Jun 2025
https://github.com/opengvlab/mm-interleaved
MM-Interleaved: Interleaved Image-Text Generative Modeling via Multi-modal Feature Synchronizer
Last synced: 30 Jun 2025
https://github.com/opengvlab/gv-benchmark
General Vision Benchmark (GV-B), a project from OpenGVLab
Last synced: 30 Jun 2025
https://github.com/opengvlab/unihcp
Official PyTorch implementation of UniHCP
Last synced: 30 Jun 2025
https://github.com/OpenGVLab/Diffree
Diffree: Text-Guided Shape Free Object Inpainting with Diffusion Model
Last synced: 28 Mar 2025
https://github.com/opengvlab/hulk
An official implementation of "Hulk: A Universal Knowledge Translator for Human-Centric Tasks"
Last synced: 25 Aug 2025
https://github.com/OpenGVLab/ChartAst
[ACL 2024] ChartAssistant is a chart-based vision-language model for universal chart comprehension and reasoning.
Last synced: 14 Oct 2025
https://github.com/OpenGVLab/MM-NIAH
[NeurIPS 2024] Needle In A Multimodal Haystack (MM-NIAH): A comprehensive benchmark designed to systematically evaluate the capability of existing MLLMs to comprehend long multimodal documents.
benchmark long-context multimodal-large-language-models vision-language-model
Last synced: 17 Apr 2025
https://github.com/opengvlab/piip
[NeurIPS 2024 Spotlight ⭐️ & TPAMI 2025] Parameter-Inverted Image Pyramid Networks (PIIP)
computer-vision image-classification instance-segmentation multimodal-large-language-models object-detection semantic-segmentation vision-language-models vision-transformer
Last synced: 10 Oct 2025
https://github.com/OpenGVLab/MMT-Bench
[ICML 2024] MMT-Bench: A Comprehensive Multimodal Benchmark for Evaluating Large Vision-Language Models Towards Multitask AGI
Last synced: 17 Apr 2025
https://github.com/opengvlab/internvl-mmdetseg
Train InternViT-6B in MMSegmentation and MMDetection with DeepSpeed
object-detection semantic-segmentation vision-foundation
Last synced: 17 Jun 2025
https://github.com/OpenGVLab/M3I-Pretraining
[CVPR 2023] implementation of Towards All-in-one Pre-training via Maximizing Multi-modal Mutual Information.
Last synced: 19 Apr 2025
https://github.com/opengvlab/awesome-draggan
Awesome-DragGAN: A curated list of papers, tutorials, repositories related to DragGAN
Last synced: 30 Jun 2025
https://github.com/opengvlab/mutr
[AAAI 2024] Referred by Multi-Modality: A Unified Temporal Transformer for Video Object Segmentation
Last synced: 30 Jun 2025
https://github.com/OpenGVLab/MMIU
[ICLR2025] MMIU: Multimodal Multi-image Understanding for Evaluating Large Vision-Language Models
Last synced: 30 Jun 2025
https://github.com/opengvlab/zerogui
ZeroGUI: Automating Online GUI Learning at Zero Human Cost
Last synced: 28 Oct 2025
https://github.com/OpenGVLab/PVC
[CVPR 2025] PVC: Progressive Visual Token Compression for Unified Image and Video Processing in Large Vision-Language Models
Last synced: 20 Aug 2025
https://github.com/OpenGVLab/De-focus-Attention-Networks
Learning 1D Causal Visual Representation with De-focus Attention Networks
Last synced: 17 Sep 2025
https://github.com/opengvlab/diffagent
[CVPR 2024] DiffAgent: Fast and Accurate Text-to-Image API Selection with Large Language Model
Last synced: 30 Jun 2025
https://github.com/opengvlab/perception_test_iccv2023
Champion solutions for the Perception Test challenges at the ICCV 2023 workshop.
audio-visual deep-learning iccv2023
Last synced: 30 Jun 2025
https://github.com/opengvlab/sid-vln
Official implementation of "Learning Goal-Oriented Language-Guided Navigation with Self-Improving Demonstrations at Scale"
Last synced: 28 Oct 2025