Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/haotian-liu/LLaVA
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
chatbot chatgpt foundation-models gpt-4 instruction-tuning llama llama-2 llama2 llava multi-modality multimodal vision-language-model visual-language-learning
Last synced: about 2 months ago
JSON representation
- Host: GitHub
- URL: https://github.com/haotian-liu/LLaVA
- Owner: haotian-liu
- License: apache-2.0
- Created: 2023-04-17T16:13:11.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-08-12T09:52:38.000Z (4 months ago)
- Last Synced: 2024-10-19T14:52:31.819Z (about 2 months ago)
- Topics: chatbot, chatgpt, foundation-models, gpt-4, instruction-tuning, llama, llama-2, llama2, llava, multi-modality, multimodal, vision-language-model, visual-language-learning
- Language: Python
- Homepage: https://llava.hliu.cc
- Size: 13.4 MB
- Stars: 19,796
- Watchers: 156
- Forks: 2,175
- Open Issues: 1,015
Metadata Files:
- Readme: README.md
- License: LICENSE
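The metadata above is also exposed as machine-readable JSON via the ecosyste.ms API (the "JSON representation" link). Below is a minimal sketch of fetching it with Python's requests library; the endpoint path, query parameter, and field names are assumptions for illustration, not the documented API, and the authoritative address is the JSON link on the page itself.

```python
# Minimal sketch: fetch this project's JSON record from the ecosyste.ms awesome API.
# NOTE: the endpoint and the "url" query parameter are assumptions for illustration;
# use the page's "JSON representation" link for the authoritative address.
import requests

ASSUMED_ENDPOINT = "https://awesome.ecosyste.ms/api/v1/projects"
params = {"url": "https://github.com/haotian-liu/LLaVA"}  # assumed filter parameter

resp = requests.get(ASSUMED_ENDPOINT, params=params, timeout=30)
resp.raise_for_status()
payload = resp.json()

# The payload should mirror the fields listed above (owner, license, stars, topics, ...);
# the exact key names below are assumptions based on that listing.
records = payload if isinstance(payload, list) else [payload]
for record in records:
    print(record.get("url"), record.get("stars"), record.get("topics"))
```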
Awesome Lists containing this project
- awesome-llama-resources - LLaVA - Large Language-and-Vision Assistant (Multimodal LLM)
- Awesome-LLM-Productization - LLaVA - Visual instruction tuning towards large language and vision models with GPT-4 level capabilities (Models and Tools / Open LLM Models)
- Awesome-Reasoning-Foundation-Models - LLaVA (Code)
- Awesome-LLM4AD - LLaVA-7B-1.5 (Papers)
- Awesome-Video-LLMs - LLaVA
- awesome-LLMs-finetuning - LLaVA
- awesome-chatgpt - haotian-liu/LLaVA - LLaVA (Large Language and Vision Assistant): visual instruction tuning towards GPT-4V level capabilities. (UIs / Desktop applications)
- awesome-multi-modal - https://github.com/haotian-liu/LLaVA
- StarryDivineSky - haotian-liu/LLaVA - A large language and vision assistant built towards GPT-4 level capabilities. (Multimodal large models / Web services - Other)
- awesome-vision-language-pretraining - LLaVA Series - [inference notebook](…Tutorials/blob/master/LLaVa/Inference_with_LLaVa_for_multimodal_generation.ipynb) [[LLaVA-NeXT]](https://github.com/LLaVA-VL/LLaVA-NeXT/) [[hf docs]](https://huggingface.co/docs/transformers/en/model_doc/llava_next) [[hf docs]](https://huggingface.co/docs/transformers/main/en/model_doc/llava_onevision) [[demo]](https://huggingface.co/spaces/merve/llava-next) [[hf card]](https://huggingface.co/llava-hf) [[LLaVA-CoT]](https://github.com/PKU-YuanGroup/LLaVA-CoT) (Papers)
- awesome-llm-and-aigc - LLaVA - 🌋 LLaVA: Large Language and Vision Assistant. Visual instruction tuning towards large language and vision models with GPT-4 level capabilities. [llava.hliu.cc](https://llava.hliu.cc/). "Visual Instruction Tuning". (**[arXiv 2023](https://arxiv.org/abs/2304.08485)**). (Summary)
- AiTreasureBox - haotian-liu/LLaVA - Large Language-and-Vision Assistant built towards multimodal GPT-4 level capabilities. [Demo](https://llava.hliu.cc) [Paper](https://arxiv.org/abs/2304.08485) (Repos)