Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Projects in Awesome Lists by OpenGVLab

A curated list of projects in awesome lists by OpenGVLab.

https://github.com/OpenGVLab/LLaMA-Adapter

[ICLR 2024] Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters

Last synced: 30 Jul 2024

https://github.com/OpenGVLab/DragGAN

Unofficial implementation of DragGAN - "Drag Your GAN: Interactive Point-based Manipulation on the Generative Image Manifold" (a full-featured DragGAN implementation with an online demo and local deployment; code and models fully open-sourced; supports Windows, macOS, and Linux)

draggan gradio-interface image-editing image-generation interngpt

Last synced: 01 Aug 2024

https://github.com/OpenGVLab/InternVL

[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. A commercially usable, open-source multimodal chat model approaching GPT-4o performance

gpt gpt-4o gpt-4v image-classification image-text-retrieval llm multi-modal semantic-segmentation video-classification vision-language-model vit-22b vit-6b

Last synced: 31 Jul 2024

https://github.com/OpenGVLab/InternGPT

InternGPT (iGPT) is an open-source demo platform where you can easily showcase your AI models. It now supports DragGAN, ChatGPT, ImageBind, GPT-4-style multimodal chat, SAM, interactive image editing, and more. Try it at igpt.opengvlab.com (an online demo system supporting DragGAN, ChatGPT, ImageBind, and SAM)

chatgpt click draggan foundation-model gpt gpt-4 gradio husky image-captioning imagebind internimage langchain llama llm multimodal sam segment-anything vicuna video-generation vqa

Last synced: 31 Jul 2024

https://github.com/OpenGVLab/Ask-Anything

[CVPR2024 Highlight][VideoChatGPT] ChatGPT with video understanding, plus support for many more LMs such as MiniGPT-4, StableLM, and MOSS.

big-model captioning-videos chat chatgpt foundation-models gradio langchain large-language-models large-model stablelm video video-question-answering video-understanding

Last synced: 31 Jul 2024

https://github.com/OpenGVLab/InternImage

[CVPR 2023 Highlight] InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions

backbone deformable-convolution foundation-model object-detection semantic-segmentation

Last synced: 31 Jul 2024

https://github.com/OpenGVLab/SAM-Med2D

Official implementation of SAM-Med2D

Last synced: 01 Aug 2024

https://github.com/OpenGVLab/VideoMamba

[ECCV2024] VideoMamba: State Space Model for Efficient Video Understanding

Last synced: 03 Aug 2024

https://github.com/OpenGVLab/DCNv4

[CVPR 2024] Deformable Convolution v4

Last synced: 31 Jul 2024

https://github.com/OpenGVLab/Multi-Modality-Arena

Chatbot Arena meets multi-modality! Multi-Modality Arena lets you benchmark vision-language models side by side with images as inputs. Supports MiniGPT-4, LLaMA-Adapter V2, LLaVA, BLIP-2, and many more!

chat chatbot chatgpt gradio large-language-models llms multi-modality vision-language-model vqa

Last synced: 01 Aug 2024

https://github.com/OpenGVLab/CaFo

[CVPR 2023] Prompt, Generate, then Cache: Cascade of Foundation Models makes Strong Few-shot Learners

Last synced: 31 Jul 2024

https://github.com/OpenGVLab/Instruct2Act

Instruct2Act: Mapping Multi-modality Instructions to Robotic Actions with Large Language Model

chatgpt clip llm robotics segment-anything

Last synced: 02 Aug 2024

https://github.com/OpenGVLab/UniFormerV2

[ICCV2023] UniFormerV2: Spatiotemporal Learning by Arming Image ViTs with Video UniFormer

Last synced: 31 Jul 2024

https://github.com/OpenGVLab/Vision-RWKV

Vision-RWKV: Efficient and Scalable Visual Perception with RWKV-Like Architectures

Last synced: 09 Aug 2024

https://github.com/OpenGVLab/LAMM

[NeurIPS 2023 Datasets and Benchmarks Track] LAMM: Multi-Modal Large Language Models and Applications as AI Agents

Last synced: 12 Aug 2024

https://github.com/OpenGVLab/DriveMLM

Last synced: 31 Jul 2024

https://github.com/OpenGVLab/Diffree

Diffree: Text-Guided Shape Free Object Inpainting with Diffusion Model

Last synced: 31 Jul 2024

https://github.com/OpenGVLab/M3I-Pretraining

[CVPR 2023] Implementation of "Towards All-in-one Pre-training via Maximizing Multi-modal Mutual Information."

Last synced: 02 Aug 2024

https://github.com/OpenGVLab/MMT-Bench

ICML'2024 | MMT-Bench: A Comprehensive Multimodal Benchmark for Evaluating Large Vision-Language Models Towards Multitask AGI

Last synced: 01 Aug 2024

https://github.com/OpenGVLab/MM-NIAH

This is the official implementation of the paper "Needle In A Multimodal Haystack"

benchmark long-context multimodal-large-language-models vision-language-model

Last synced: 01 Aug 2024

https://github.com/OpenGVLab/MMIU

MMIU: Multimodal Multi-image Understanding for Evaluating Large Vision-Language Models

Last synced: 07 Sep 2024