Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Projects in Awesome Lists by OpenGVLab
A curated list of projects in awesome lists by OpenGVLab.
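Since the index is exposed as an open API (per the description above), the snippet below sketches one way a client might pull these project records as JSON. The endpoint path, query parameters, and response fields are assumptions made for illustration; the service's own API documentation is authoritative.

```python
# Minimal sketch: querying an ecosyste.ms-style JSON API for indexed projects.
# The endpoint path, parameters, and response shape below are assumptions for
# illustration; consult the service's API docs for the real routes.
import requests

API_URL = "https://awesome.ecosyste.ms/api/v1/projects"  # assumed endpoint

def list_projects(owner: str):
    """Fetch indexed projects and keep those whose repo URL matches `owner`."""
    resp = requests.get(API_URL, params={"per_page": 100}, timeout=30)
    resp.raise_for_status()
    projects = resp.json()  # assumed to be a list of records with 'url'/'description'
    return [p for p in projects
            if f"github.com/{owner.lower()}/" in p.get("url", "").lower()]

if __name__ == "__main__":
    for project in list_projects("OpenGVLab"):
        print(project["url"], "-", project.get("description", ""))
```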
https://github.com/OpenGVLab/LLaMA-Adapter
[ICLR 2024] Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters
Last synced: 30 Jul 2024
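The LLaMA-Adapter entry above describes instruction-tuning LLaMA by training only about 1.2M new parameters. The sketch below illustrates the general adapter idea behind that claim (freeze the base model, train a small zero-initialized module); it is a generic PyTorch illustration under those assumptions, not the repository's actual code.

```python
# Illustrative sketch of adapter-style fine-tuning: the base model is frozen
# and only a small set of new parameters is trained. Generic example only,
# not the LLaMA-Adapter repository's implementation.
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Small bottleneck added after a frozen transformer block."""
    def __init__(self, dim: int, bottleneck: int = 16):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.up = nn.Linear(bottleneck, dim)
        nn.init.zeros_(self.up.weight)  # zero-init so the adapter starts as an identity mapping
        nn.init.zeros_(self.up.bias)

    def forward(self, x):
        return x + self.up(torch.relu(self.down(x)))

base = nn.TransformerEncoderLayer(d_model=512, nhead=8, batch_first=True)
for p in base.parameters():
    p.requires_grad = False  # base weights stay frozen

adapter = Adapter(dim=512)
print("trainable parameters:", sum(p.numel() for p in adapter.parameters()))

optimizer = torch.optim.AdamW(adapter.parameters(), lr=1e-3)
x = torch.randn(2, 10, 512)
loss = adapter(base(x)).pow(2).mean()  # dummy objective for illustration
loss.backward()
optimizer.step()
```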
https://github.com/OpenGVLab/DragGAN
Unofficial Implementation of DragGAN - "Drag Your GAN: Interactive Point-based Manipulation on the Generative Image Manifold" (full-featured DragGAN implementation: online demo, local deployment for trial, code and models fully open-sourced; supports Windows, macOS, Linux)
draggan gradio-interface image-editing image-generation interngpt
Last synced: 01 Aug 2024
https://github.com/OpenGVLab/InternVL
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. A commercially usable, open-source multimodal dialogue model approaching GPT-4o performance.
gpt gpt-4o gpt-4v image-classification image-text-retrieval llm multi-modal semantic-segmentation video-classification vision-language-model vit-22b vit-6b
Last synced: 31 Jul 2024
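The InternVL entry above describes an open multimodal chat model. The sketch below shows how such a model is commonly loaded from the Hugging Face Hub; the model ID, the `chat` helper, and the preprocessing are assumptions based on the usual pattern for remote-code models, so check the repository's README for the exact interface.

```python
# Hypothetical loading sketch for a Hugging Face-hosted multimodal chat model.
# The model ID and the `chat` helper are assumptions; the repository's README
# documents the actual interface and image preprocessing.
import torch
from transformers import AutoModel, AutoTokenizer

MODEL_ID = "OpenGVLab/InternVL2-8B"  # assumed Hub ID

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
model = AutoModel.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
).eval()

# pixel_values would normally come from the repo's image preprocessing;
# here it is a placeholder tensor just to show the call shape.
pixel_values = torch.zeros(1, 3, 448, 448, dtype=torch.bfloat16)
question = "<image>\nDescribe this image."
response = model.chat(tokenizer, pixel_values, question,
                      generation_config=dict(max_new_tokens=64))
print(response)
```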
https://github.com/OpenGVLab/InternGPT
InternGPT (iGPT) is an open source demo platform where you can easily showcase your AI models. Now it supports DragGAN, ChatGPT, ImageBind, multimodal chat like GPT-4, SAM, interactive image editing, etc. Try it at igpt.opengvlab.com (an online demo system supporting DragGAN, ChatGPT, ImageBind, and SAM)
chatgpt click draggan foundation-model gpt gpt-4 gradio husky image-captioning imagebind internimage langchain llama llm multimodal sam segment-anything vicuna video-generation vqa
Last synced: 31 Jul 2024
https://github.com/OpenGVLab/Ask-Anything
[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding! Also supports many more LMs, such as MiniGPT-4, StableLM, and MOSS.
big-model captioning-videos chat chatgpt foundation-models gradio langchain large-language-models large-model stablelm video video-question-answering video-understanding
Last synced: 31 Jul 2024
https://github.com/OpenGVLab/InternImage
[CVPR 2023 Highlight] InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions
backbone deformable-convolution foundation-model object-detection semantic-segmentation
Last synced: 31 Jul 2024
https://github.com/OpenGVLab/InternVideo
Video Foundation Models & Data for Multimodal Understanding
action-recognition benchmark contrastive-learning foundation-models instruction-tuning masked-autoencoder multimodal open-set-recognition self-supervised spatio-temporal-action-localization temporal-action-localization video-clip video-data video-dataset video-question-answering video-retrieval video-understanding vision-transformer zero-shot-classification zero-shot-retrieval
Last synced: 31 Jul 2024
https://github.com/OpenGVLab/SAM-Med2D
Official implementation of SAM-Med2D
Last synced: 01 Aug 2024
https://github.com/OpenGVLab/VideoMamba
[ECCV2024] VideoMamba: State Space Model for Efficient Video Understanding
Last synced: 03 Aug 2024
https://github.com/OpenGVLab/VisionLLM
VisionLLM Series
generalist-model large-language-models object-detection
Last synced: 01 Aug 2024
https://github.com/OpenGVLab/Multi-Modality-Arena
Chatbot Arena meets multi-modality! Multi-Modality Arena allows you to benchmark vision-language models side-by-side while providing images as inputs. Supports MiniGPT-4, LLaMA-Adapter V2, LLaVA, BLIP-2, and many more!
chat chatbot chatgpt gradio large-language-models llms multi-modality vision-language-model vqa
Last synced: 01 Aug 2024
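Arena-style benchmarks like the entry above typically turn pairwise human votes into a ranking; the sketch below shows a standard Elo update as one common way to do that. It illustrates the general idea only and is not the project's actual scoring code.

```python
# Illustration of Elo-style scoring for pairwise model comparisons, the kind
# of ranking an arena-style benchmark can build from human votes. Generic
# sketch, not the project's scoring implementation.
from collections import defaultdict

K = 32  # update step size

def expected(r_a: float, r_b: float) -> float:
    """Expected score of A against B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

def update(ratings, model_a, model_b, winner):
    """winner is 'a', 'b', or 'tie'."""
    ra, rb = ratings[model_a], ratings[model_b]
    score_a = {"a": 1.0, "b": 0.0, "tie": 0.5}[winner]
    ratings[model_a] = ra + K * (score_a - expected(ra, rb))
    ratings[model_b] = rb + K * ((1.0 - score_a) - expected(rb, ra))

ratings = defaultdict(lambda: 1000.0)
battles = [("LLaVA", "BLIP-2", "a"), ("MiniGPT-4", "LLaVA", "b"), ("BLIP-2", "MiniGPT-4", "tie")]
for a, b, w in battles:
    update(ratings, a, b, w)
print(dict(ratings))
```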
https://github.com/OpenGVLab/VideoMAEv2
[CVPR 2023] VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking
action-detection action-recognition cvpr2023 foundation-model self-supervised-learning temporal-action-detection video-understanding
Last synced: 31 Jul 2024
https://github.com/OpenGVLab/CaFo
[CVPR 2023] Prompt, Generate, then Cache: Cascade of Foundation Models makes Strong Few-shot Learners
Last synced: 31 Jul 2024
https://github.com/OpenGVLab/Instruct2Act
Instruct2Act: Mapping Multi-modality Instructions to Robotic Actions with Large Language Model
chatgpt clip llm robotics segment-anything
Last synced: 02 Aug 2024
https://github.com/OpenGVLab/UniFormerV2
[ICCV2023] UniFormerV2: Spatiotemporal Learning by Arming Image ViTs with Video UniFormer
Last synced: 31 Jul 2024
https://github.com/OpenGVLab/Vision-RWKV
Vision-RWKV: Efficient and Scalable Visual Perception with RWKV-Like Architectures
Last synced: 09 Aug 2024
https://github.com/OpenGVLab/LAMM
[NeurIPS 2023 Datasets and Benchmarks Track] LAMM: Multi-Modal Large Language Models and Applications as AI Agents
Last synced: 12 Aug 2024
https://github.com/OpenGVLab/Diffree
Diffree: Text-Guided Shape Free Object Inpainting with Diffusion Model
Last synced: 31 Jul 2024
https://github.com/OpenGVLab/M3I-Pretraining
[CVPR 2023] Implementation of "Towards All-in-one Pre-training via Maximizing Multi-modal Mutual Information"
Last synced: 02 Aug 2024
https://github.com/OpenGVLab/MMT-Bench
[ICML 2024] MMT-Bench: A Comprehensive Multimodal Benchmark for Evaluating Large Vision-Language Models Towards Multitask AGI
Last synced: 01 Aug 2024
https://github.com/OpenGVLab/MM-NIAH
This is the official implementation of the paper "Needle In A Multimodal Haystack"
benchmark long-context multimodal-large-language-models vision-language-model
Last synced: 01 Aug 2024
https://github.com/OpenGVLab/MMIU
MMIU: Multimodal Multi-image Understanding for Evaluating Large Vision-Language Models
Last synced: 07 Sep 2024