
Projects in Awesome Lists tagged with visual-language-learning

A curated list of projects in awesome lists tagged with visual-language-learning.

https://github.com/haotian-liu/llava

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

chatbot chatgpt foundation-models gpt-4 instruction-tuning llama llama-2 llama2 llava multi-modality multimodal vision-language-model visual-language-learning

Last synced: 17 Nov 2025

https://github.com/haotian-liu/LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

chatbot chatgpt foundation-models gpt-4 instruction-tuning llama llama-2 llama2 llava multi-modality multimodal vision-language-model visual-language-learning

Last synced: 14 Mar 2025

https://github.com/next-gpt/next-gpt

Code and models for the ICML 2024 paper NExT-GPT: Any-to-Any Multimodal Large Language Model

chatgpt foundation-models gpt-4 instruction-tuning large-language-models llm mllm multi-modal-chatgpt multimodal visual-language-learning

Last synced: 14 May 2025

https://github.com/NExT-GPT/NExT-GPT

Code and models for NExT-GPT: Any-to-Any Multimodal Large Language Model

chatgpt foundation-models gpt-4 instruction-tuning large-language-models llm multi-modal-chatgpt multimodal visual-language-learning

Last synced: 12 Mar 2025

https://github.com/evolvinglmms-lab/otter

🦦 Otter, a multi-modal model based on OpenFlamingo (open-sourced version of DeepMind's Flamingo), trained on MIMIC-IT and showcasing improved instruction-following and in-context learning ability.

artificial-inteligence chatgpt deep-learning embodied-ai foundation-models gpt-4 instruction-tuning large-scale-models machine-learning multi-modality visual-language-learning

Last synced: 13 Dec 2025

https://github.com/RLHF-V/RLHF-V

[CVPR'24] RLHF-V: Towards Trustworthy MLLMs via Behavior Alignment from Fine-grained Correctional Human Feedback

chatbot gpt-4 llama multi-modality multimodal rlhf-v visual-language-learning

Last synced: 24 Feb 2025

https://github.com/thomas-yanxin/karmavlm

🧘🏻‍♂️ KarmaVLM (相生): A family of highly efficient and powerful visual language models.

llama2 llava multimodel qwen2 vision-language-model visual-language-learning vlm

Last synced: 02 Aug 2025