Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

Projects in Awesome Lists tagged with visual-language-learning

A curated list of projects in awesome lists tagged with visual-language-learning .

- Recently synced
- Stars

https://github.com/haotian-liu/llava

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

chatbot chatgpt foundation-models gpt-4 instruction-tuning llama llama-2 llama2 llava multi-modality multimodal vision-language-model visual-language-learning

Last synced: 29 Sep 2024

https://github.com/haotian-liu/LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

chatbot chatgpt foundation-models gpt-4 instruction-tuning llama llama-2 llama2 llava multi-modality multimodal vision-language-model visual-language-learning

Last synced: 30 Jul 2024

https://github.com/luodian/otter

🦦 Otter, a multi-modal model based on OpenFlamingo (open-sourced version of DeepMind's Flamingo), trained on MIMIC-IT and showcasing improved instruction-following and in-context learning ability.

artificial-inteligence chatgpt deep-learning embodied-ai foundation-models gpt-4 instruction-tuning large-scale-models machine-learning multi-modality visual-language-learning

Last synced: 27 Sep 2024

https://github.com/Luodian/Otter

🦦 Otter, a multi-modal model based on OpenFlamingo (open-sourced version of DeepMind's Flamingo), trained on MIMIC-IT and showcasing improved instruction-following and in-context learning ability.

artificial-inteligence chatgpt deep-learning embodied-ai foundation-models gpt-4 instruction-tuning large-scale-models machine-learning multi-modality visual-language-learning

Last synced: 29 Jul 2024

https://github.com/next-gpt/next-gpt

Code and models for NExT-GPT: Any-to-Any Multimodal Large Language Model

chatgpt foundation-models gpt-4 instruction-tuning large-language-models llm multi-modal-chatgpt multimodal visual-language-learning

Last synced: 30 Sep 2024

https://github.com/NExT-GPT/NExT-GPT

Code and models for NExT-GPT: Any-to-Any Multimodal Large Language Model

chatgpt foundation-models gpt-4 instruction-tuning large-language-models llm multi-modal-chatgpt multimodal visual-language-learning

Last synced: 29 Jul 2024

https://github.com/InternLM/InternLM-XComposer

InternLM-XComposer2 is a groundbreaking vision-language large model (VLLM) excelling in free-form text-image composition and comprehension.

chatgpt foundation gpt gpt-4 instruction-tuning language-model large-language-model large-vision-language-model llm mllm multi-modality multimodal supervised-finetuning vision-language-model vision-transformer visual-language-learning

Last synced: 03 Aug 2024

https://github.com/internlm/internlm-xcomposer

InternLM-XComposer2 is a groundbreaking vision-language large model (VLLM) excelling in free-form text-image composition and comprehension.

chatgpt foundation gpt gpt-4 instruction-tuning language-model large-language-model large-vision-language-model llm mllm multi-modality multimodal supervised-finetuning vision-language-model vision-transformer visual-language-learning

Last synced: 01 Oct 2024

https://github.com/ashleykleynhans/llava-docker

Docker image for LLaVA: Large Language and Vision Assistant

ai chatbot chatgpt docker docker-image foundation-models gpt-4 instruction-tuning llama llama-2 llama2 llava llm multimodal runpod vision-language-model visual-language-learning

Last synced: 06 Aug 2024

https://github.com/adrianbzg/llama-multimodal-vqa

Multimodal Instruction Tuning for Llama 3

chatbot chatgpt gpt-4 huggingface instruction-tuning language-models llama llama2 llama3 multimodal multimodal-instruction-tuning visual-language-learning visual-question-answering vqa

Last synced: 27 Sep 2024