Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Projects in Awesome Lists tagged with visual-language-learning
A curated list of projects in awesome lists tagged with visual-language-learning.
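The entries below pair each repository with its GitHub topic tags; an index like this one is essentially a filter over those tags. A minimal sketch of that filtering, using an illustrative record layout (the field names `url` and `topics` are assumptions for this example, not the actual ecosyste.ms schema):

```python
# Sketch: filter a list of project records by GitHub topic tag.
# The record shape here is illustrative, not the real ecosyste.ms API schema.

from typing import Dict, List


def projects_with_tag(projects: List[Dict], tag: str) -> List[Dict]:
    """Return the projects whose topic list contains the given tag."""
    return [p for p in projects if tag in p.get("topics", [])]


# Hypothetical sample data mirroring the entries in this listing.
sample = [
    {
        "url": "https://github.com/haotian-liu/LLaVA",
        "topics": ["llava", "multimodal", "visual-language-learning"],
    },
    {
        "url": "https://github.com/example/unrelated",
        "topics": ["cli-tools"],
    },
]

matches = projects_with_tag(sample, "visual-language-learning")
print([p["url"] for p in matches])
```

Each project in the listing carries the `visual-language-learning` tag alongside its other topics, which is why the same repositories also surface under related tags such as `instruction-tuning` or `multimodal`.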
https://github.com/haotian-liu/llava
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
chatbot chatgpt foundation-models gpt-4 instruction-tuning llama llama-2 llama2 llava multi-modality multimodal vision-language-model visual-language-learning
Last synced: 29 Sep 2024
https://github.com/luodian/otter
🦦 Otter, a multi-modal model based on OpenFlamingo (open-sourced version of DeepMind's Flamingo), trained on MIMIC-IT and showcasing improved instruction-following and in-context learning ability.
artificial-inteligence chatgpt deep-learning embodied-ai foundation-models gpt-4 instruction-tuning large-scale-models machine-learning multi-modality visual-language-learning
Last synced: 27 Sep 2024
https://github.com/next-gpt/next-gpt
Code and models for NExT-GPT: Any-to-Any Multimodal Large Language Model
chatgpt foundation-models gpt-4 instruction-tuning large-language-models llm multi-modal-chatgpt multimodal visual-language-learning
Last synced: 30 Sep 2024
https://github.com/InternLM/InternLM-XComposer
InternLM-XComposer2 is a groundbreaking vision-language large model (VLLM) excelling in free-form text-image composition and comprehension.
chatgpt foundation gpt gpt-4 instruction-tuning language-model large-language-model large-vision-language-model llm mllm multi-modality multimodal supervised-finetuning vision-language-model vision-transformer visual-language-learning
Last synced: 03 Aug 2024
https://github.com/ashleykleynhans/llava-docker
Docker image for LLaVA: Large Language and Vision Assistant
ai chatbot chatgpt docker docker-image foundation-models gpt-4 instruction-tuning llama llama-2 llama2 llava llm multimodal runpod vision-language-model visual-language-learning
Last synced: 06 Aug 2024
https://github.com/adrianbzg/llama-multimodal-vqa
Multimodal Instruction Tuning for Llama 3
chatbot chatgpt gpt-4 huggingface instruction-tuning language-models llama llama2 llama3 multimodal multimodal-instruction-tuning visual-language-learning visual-question-answering vqa
Last synced: 27 Sep 2024