Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.
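
For programmatic access, the indexed lists can be queried over HTTP. Below is a minimal sketch in Python; the `/api/v1/lists` endpoint path, the `per_page` parameter, and the response fields (`name`, `url`) are assumptions inferred from the service description above, so check the live API documentation before relying on them.

```python
# Minimal sketch of querying the awesome lists index.
# ASSUMPTION: the endpoint path /api/v1/lists and the response
# fields "name" and "url" are inferred, not confirmed by this page.
import requests

resp = requests.get(
    "https://awesome.ecosyste.ms/api/v1/lists",
    params={"per_page": 5},  # page-size parameter is also an assumption
    timeout=30,
)
resp.raise_for_status()
for item in resp.json():
    print(item.get("name"), "-", item.get("url"))
```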

awesome-vlm-architectures

Famous Vision Language Models and Their Architectures
https://github.com/gokayfem/awesome-vlm-architectures

  • LLaVA: [GitHub](https://github.com/haotian-liu/LLaVA) · [Demo](https://llava.hliu.cc/)
  • LLaVA-NeXT: [Blog](https://llava-vl.github.io/blog/2024-01-30-llava-next/)
  • IDEFICS: [Demo](https://huggingface.co/spaces/HuggingFaceM4/idefics-8b)
  • InternLM-XComposer: [GitHub](https://github.com/InternLM/InternLM-XComposer) · [Demo](https://huggingface.co/spaces/Willow123/InternLM-XComposer)
  • DeepSeek-VL: [GitHub](https://github.com/deepseek-ai/DeepSeek-VL) · [Demo](https://huggingface.co/spaces/deepseek-ai/DeepSeek-VL-7B)
  • Mantis: [GitHub](https://github.com/TIGER-AI-Lab/Mantis) · [Demo](https://huggingface.co/spaces/TIGER-Lab/Mantis)
  • Qwen-VL: [GitHub](https://github.com/QwenLM/Qwen-VL) · [Demo](https://huggingface.co/spaces/Qwen/Qwen-VL-Plus)
  • moondream2: [Demo](https://huggingface.co/spaces/vikhyatk/moondream2)
  • SPHINX: [GitHub](https://github.com/Alpha-VLLM/LLaMA2-Accessory) · [Model](https://huggingface.co/Alpha-VLLM/SPHINX)
  • BLIP-2: [Demo](https://huggingface.co/spaces/Salesforce/BLIP2) (see the usage sketch after this list)
  • InstructBLIP: [Demo](https://huggingface.co/spaces/hysts/InstructBLIP)
  • KOSMOS-2: [GitHub](https://github.com/microsoft/unilm/tree/master/kosmos-2) · [Demo](https://huggingface.co/spaces/ydshieh/Kosmos-2)
  • MultiInstruct: [GitHub](https://github.com/VT-NLP/MultiInstruct)
  • Nous-Hermes-2-Vision: [Model](https://huggingface.co/NousResearch/Nous-Hermes-2-Vision-Alpha)
  • TinyGPT-V: [GitHub](https://github.com/DLYuanGod/TinyGPT-V) · [Demo](https://huggingface.co/spaces/llizhx/TinyGPT-V)
  • GLaMM: [GitHub](https://github.com/mbzuai-oryx/groundingLMM)
  • FireLLaVA: [Model](https://huggingface.co/fireworks-ai/FireLLaVA-13b)
  • MoE-LLaVA: [GitHub](https://github.com/PKU-YuanGroup/MoE-LLaVA) · [Demo](https://huggingface.co/spaces/LanguageBind/MoE-LLaVA)
  • BLIVA: [GitHub](https://github.com/mlpc-ucsd/BLIVA)
  • MobileVLM: [GitHub](https://github.com/Meituan-AutoML/MobileVLM)
  • big_vision (Google Research): [GitHub](https://github.com/google-research/big_vision)
  • PaLM-E: [Project page](https://palm-e.github.io)
  • MiniGPT-4: [GitHub](https://github.com/Vision-CAIR/MiniGPT-4)
  • LLaVA-Plus: [GitHub](https://github.com/LLaVA-VL/LLaVA-Plus-Codebase)
  • BakLLaVA: [Model](https://huggingface.co/SkunkworksAI/BakLLaVA-1)
  • Ferret: [GitHub](https://github.com/apple/ml-ferret)
  • Fuyu-8B: [Model](https://huggingface.co/adept/fuyu-8b)
  • OtterHD: [Demo](https://huggingface.co/spaces/Otter-AI/OtterHD-Demo)
  • Vision Transformer (ViT): [GitHub](https://github.com/google-research/vision_transformer)
  • Guide to Vision-Language Models (VLMs) by Görkem Polat
  • VLM Primer by Aman Chadha
  • Generalized Visual Language Models by Lilian Weng
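
Several entries above ship ready-to-use checkpoints on the Hugging Face Hub. As a minimal sketch of running one of them locally, here is BLIP-2 through the `transformers` library; the `Salesforce/blip2-opt-2.7b` checkpoint name, the sample image URL, and the device/dtype choices are illustrative assumptions, not part of the list itself.

```python
# Minimal sketch: visual question answering with BLIP-2 via transformers.
# Checkpoint, sample image, and hardware choices are assumptions.
import requests
import torch
from PIL import Image
from transformers import Blip2Processor, Blip2ForConditionalGeneration

device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.float16 if device == "cuda" else torch.float32

processor = Blip2Processor.from_pretrained("Salesforce/blip2-opt-2.7b")
model = Blip2ForConditionalGeneration.from_pretrained(
    "Salesforce/blip2-opt-2.7b", torch_dtype=dtype
).to(device)

# Any RGB image works; this COCO sample is just an example input.
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw).convert("RGB")

# Condition generation on the image plus a text prompt.
inputs = processor(
    images=image,
    text="Question: how many cats are in the picture? Answer:",
    return_tensors="pt",
).to(device, dtype)

out = model.generate(**inputs, max_new_tokens=20)
print(processor.decode(out[0], skip_special_tokens=True).strip())
```

The same processor-plus-model pattern applies to most transformers-integrated models in the list (for example, LLaVA via `LlavaForConditionalGeneration`), though each family has its own prompt format.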