# awesome-vlm-architectures

Famous Vision Language Models and Their Architectures

Source: https://github.com/gokayfem/awesome-vlm-architectures
- LLaVA: [Demo](https://llava.hliu.cc/)
- LLaVA-NeXT
- IDEFICS-8B: [Demo](https://huggingface.co/spaces/HuggingFaceM4/idefics-8b)
- InternLM-XComposer: [Demo](https://huggingface.co/spaces/Willow123/InternLM-XComposer)
- DeepSeek-VL: [Demo](https://huggingface.co/spaces/deepseek-ai/DeepSeek-VL-7B)
- Mantis: [Demo](https://huggingface.co/spaces/TIGER-Lab/Mantis)
- Qwen-VL: [Demo](https://huggingface.co/spaces/Qwen/Qwen-VL-Plus)
- moondream2: [Demo](https://huggingface.co/spaces/vikhyatk/moondream2)
- SPHINX: [Model](https://huggingface.co/Alpha-VLLM/SPHINX)
- BLIP-2: [Demo](https://huggingface.co/spaces/Salesforce/BLIP2)
- InstructBLIP: [Demo](https://huggingface.co/spaces/hysts/InstructBLIP)
- Kosmos-2: [Demo](https://huggingface.co/spaces/ydshieh/Kosmos-2)
- MultiInstruct
- Hermes-2-Vision-Alpha
- TinyGPT-V: [Demo](https://huggingface.co/spaces/llizhx/TinyGPT-V)
- GroundingLMM
- FireLLaVA-13b
- MoE-LLaVA: [Demo](https://huggingface.co/spaces/LanguageBind/MoE-LLaVA)
- BLIVA
- MobileVLM
- MiniGPT-4
- LLaVA-Plus
- BakLLaVA-1: [Model](https://huggingface.co/SkunkworksAI/BakLLaVA-1)
- Ferret
- Fuyu-8B: [Model](https://huggingface.co/adept/fuyu-8b)
- OtterHD: [Demo](https://huggingface.co/spaces/Otter-AI/OtterHD-Demo)
- ViT (Vision Transformer)
- Guide to Vision-Language Models (VLMs) by Görkem Polat
- VLM Primer by Aman Chadha
- Generalized Visual Language Models by Lilian Weng