Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with multi-modality

A curated list of projects in awesome lists tagged with multi-modality.

https://github.com/haotian-liu/llava

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

chatbot chatgpt foundation-models gpt-4 instruction-tuning llama llama-2 llama2 llava multi-modality multimodal vision-language-model visual-language-learning

Last synced: 16 Dec 2024

https://github.com/haotian-liu/LLaVA

[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.

chatbot chatgpt foundation-models gpt-4 instruction-tuning llama llama-2 llama2 llava multi-modality multimodal vision-language-model visual-language-learning

Last synced: 25 Oct 2024

https://github.com/lucidrains/deep-daze

A simple command-line tool for text-to-image generation using OpenAI's CLIP and SIREN (an implicit neural representation network). The technique was originally created by https://twitter.com/advadnoun

artificial-intelligence deep-learning implicit-neural-representation multi-modality siren text-to-image transformers

Last synced: 18 Dec 2024

https://github.com/luodian/otter

🦦 Otter, a multi-modal model based on OpenFlamingo (open-sourced version of DeepMind's Flamingo), trained on MIMIC-IT and showcasing improved instruction-following and in-context learning ability.

artificial-inteligence chatgpt deep-learning embodied-ai foundation-models gpt-4 instruction-tuning large-scale-models machine-learning multi-modality visual-language-learning

Last synced: 19 Dec 2024

https://github.com/Luodian/Otter

🦦 Otter, a multi-modal model based on OpenFlamingo (open-sourced version of DeepMind's Flamingo), trained on MIMIC-IT and showcasing improved instruction-following and in-context learning ability.

artificial-inteligence chatgpt deep-learning embodied-ai foundation-models gpt-4 instruction-tuning large-scale-models machine-learning multi-modality visual-language-learning

Last synced: 24 Oct 2024

https://github.com/kyegomez/swarms

The Enterprise-Grade Production-Ready Multi-Agent Orchestration Framework. Join our Community: https://discord.com/servers/agora-999382051935506503

agents ai artificial-intelligence attention-mechanism chatgpt gpt4 gpt4all huggingface langchain langchain-python machine-learning multi-modal-imaging multi-modality multimodal prompt-engineering prompt-toolkit prompting swarms transformer-models tree-of-thoughts

Last synced: 17 Dec 2024

https://github.com/opengvlab/multi-modality-arena

Chatbot Arena meets multi-modality! Multi-Modality Arena allows you to benchmark vision-language models side-by-side while providing images as inputs. Supports MiniGPT-4, LLaMA-Adapter V2, LLaVA, BLIP-2, and many more!

chat chatbot chatgpt gradio large-language-models llms multi-modality vision-language-model vqa

Last synced: 09 Nov 2024

https://github.com/OpenGVLab/Multi-Modality-Arena

Chatbot Arena meets multi-modality! Multi-Modality Arena allows you to benchmark vision-language models side-by-side while providing images as inputs. Supports MiniGPT-4, LLaMA-Adapter V2, LLaVA, BLIP-2, and many more!

chat chatbot chatgpt gradio large-language-models llms multi-modality vision-language-model vqa

Last synced: 04 Nov 2024

https://github.com/kyegomez/Sophia

Effortless plug-and-play optimizer to cut model training costs by 50%. New optimizer that is 2x faster than Adam on LLMs.

artificial-intelligence chatgpt deep-learning multi-modality neural-network optimizer

Last synced: 29 Nov 2024

https://github.com/kyegomez/sophia

Effortless plug-and-play optimizer to cut model training costs by 50%. New optimizer that is 2x faster than Adam on LLMs.

artificial-intelligence chatgpt deep-learning multi-modality neural-network optimizer

Last synced: 21 Dec 2024

https://github.com/kyegomez/gemini

An open-source implementation of Gemini, the Google model that will "eclipse ChatGPT".

ai artificial-intelligence gemini gpt4 machine-learning ml multi-modality multimodla

Last synced: 18 Dec 2024

https://github.com/kyegomez/Gemini

An open-source implementation of Gemini, the Google model that will "eclipse ChatGPT".

ai artificial-intelligence gemini gpt4 machine-learning ml multi-modality multimodla

Last synced: 05 Nov 2024

https://github.com/zwwwayne/mmmot

[ICCV2019] Robust Multi-Modality Multi-Object Tracking

iccv2019 mot multi-modality

Last synced: 18 Nov 2024

https://github.com/ZwwWayne/mmMOT

[ICCV2019] Robust Multi-Modality Multi-Object Tracking

iccv2019 mot multi-modality

Last synced: 28 Oct 2024

https://github.com/dvlab-research/UVTR

Unifying Voxel-based Representation with Transformer for 3D Object Detection (NeurIPS 2022)

3d-detection multi-modality pytorch

Last synced: 28 Oct 2024

https://github.com/sshh12/multi_token

Embed arbitrary modalities (images, audio, documents, etc) into large language models.

large-context large-language-models large-multimodal-models llava llm multi-modality multimodal vision-language-model

Last synced: 17 Nov 2024

https://github.com/jina-ai/rungpt

An open-source, cloud-native serving framework for large multi-modal models (LMMs).

flamingo gpt-4 large-language-models large-multimadality-models llama llm-hosting llm-serve lmm-serve multi-modality opengpt self-hosting transformers

Last synced: 17 Dec 2024

https://github.com/kyegomez/mambabyte

Implementation of MambaByte from the paper "MambaByte: Token-free Selective State Space Model" in PyTorch and Zeta

ai artificial-intelligence gpt4v machine-learning mamba megabyte ml multi-modality tokenizer

Last synced: 21 Dec 2024

https://github.com/kyegomez/MambaByte

Implementation of MambaByte from the paper "MambaByte: Token-free Selective State Space Model" in PyTorch and Zeta

ai artificial-intelligence gpt4v machine-learning mamba megabyte ml multi-modality tokenizer

Last synced: 28 Oct 2024

https://github.com/kyegomez/kosmos2.5

My implementation of Kosmos2.5 from the paper: "KOSMOS-2.5: A Multimodal Literate Model"

attention attention-is-all-you-need gpt3 gpt4 kosmos multi-modality multimodal multimodal-deep-learning opensource

Last synced: 16 Dec 2024

https://github.com/rentainhe/trar-vqa

[ICCV 2021] Official implementation of the paper "TRAR: Routing the Attention Spans in Transformers for Visual Question Answering"

attention clevr dynamic-network iccv2021 local-and-global multi-modal multi-modal-learning multi-modality multi-scale-features official pytorch transformer vision-and-language visual-question-answering visualization vqav2

Last synced: 07 Nov 2024

https://github.com/kyegomez/moe-mamba

Implementation of MoE Mamba from the paper "MoE-Mamba: Efficient Selective State Space Models with Mixture of Experts" in PyTorch and Zeta

ai ml moe multi-modal-fusion multi-modality swarms

Last synced: 09 Nov 2024

https://github.com/kyegomez/qformer

Implementation of Qformer from BLIP2 in Zeta Lego blocks.

ai artificial-intelligence attention-mechanism blip2 machine machine-learning multi-modal multi-modality

Last synced: 19 Dec 2024

https://github.com/kyegomez/mm1

PyTorch Implementation of the paper "MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training"

ai artificial-intelligence deep-learning gpt4 machine-learning ml mm1 multi-modal multi-modal-revolution multi-modality

Last synced: 16 Nov 2024

https://github.com/kyegomez/fuyu

Implementation of Adept's all-new Fuyu multi-modality model in PyTorch

ai artificial-intelligence gpt4 gpt5 machine-learning multi-modal multi-modality

Last synced: 09 Nov 2024

https://github.com/kyegomez/swarmos

An all-new OS that orchestrates autonomous agents as workers to execute tasks.

ai asynchronous asynchronous-programming concurrent gpt4 llms ml multi-modality multithreading operating-system os swarms

Last synced: 09 Nov 2024

https://github.com/kyegomez/mc-vit

Implementation of the model: "(MC-ViT)" from the paper: "Memory Consolidation Enables Long-Context Video Understanding"

ai multi-modal multi-modal-transformers multi-modality open-source transformer transformers vit

Last synced: 09 Nov 2024

https://github.com/kyegomez/hrtx

Multi-Modal Multi-Embodied Hivemind-like Iteration of RTX-2

ai artificial-intelligence ensemble gpt4v machine-learning ml multi-modal multi-modality rt-2 rtx

Last synced: 16 Nov 2024

https://github.com/kyegomez/tinygptv

Simple Implementation of TinyGPTV in super simple Zeta lego blocks

artificial-intelligence attention attention-is-all-you-need deep-learning multi-modal multi-modality transformers

Last synced: 09 Nov 2024

https://github.com/kyegomez/mlxtransformer

Simple Implementation of a Transformer in the new framework MLX by Apple

artificial-intelligence gpt4 machine-learning multi-modal multi-modality

Last synced: 09 Nov 2024

https://github.com/kyegomez/hsss

Implementation of a Hierarchical Mamba as described in the paper: "Hierarchical State Space Models for Continuous Sequence-to-Sequence Modeling"

ai artificial-intelligence jesus machine-learning ml multi-modal multi-modality open-source pytorch rnn rnns ssms tensorflow zeta

Last synced: 09 Nov 2024

https://github.com/kyegomez/multimodal-tot

Multi-modal Tree of Thoughts for DALL-E 3-like automatic self-improvement

artificial-intelligence gpt4 multi-modal multi-modality multi-modality-data

Last synced: 09 Nov 2024

https://github.com/xufangzhi/moca

The implementation of MoCA

multi-modality textbook-question-answering

Last synced: 19 Dec 2024

https://github.com/kyegomez/visiondatasets

Open source scripts to create large scale datasets with rich detail for multi-modal models

ai artificial-intelligence function-calling gpt3 gpt4 json machine-learning ml multi-modal multi-modality pytorch tensorflow

Last synced: 09 Nov 2024

https://github.com/kyegomez/gats

Implementation of GATS from the paper "GATS: Gather-Attend-Scatter" in PyTorch and Zeta

ai attention attention-is-all-you-need attention-mechanism gpt4 llama ml multi-modal multi-modality multimodal open-source

Last synced: 10 Oct 2024

https://github.com/kyegomez/vortexfusion

Transformers + Mambas + LSTMs all in one model

agora ai ai-research deep-learning lstms mambas ml multi-modality ssms transformers

Last synced: 09 Nov 2024

https://github.com/kyegomez/aoa-torch

Implementation of Attention on Attention in Zeta

ai artificial-intelligence gpt4 machine-learning multi-modal multi-modality research

Last synced: 09 Nov 2024

https://github.com/ravi-teja-konda/tunedllavadelights

Explore the rich flavors of Indian desserts with TunedLlavaDelights. Using LLaVA fine-tuning, our project unveils detailed nutritional profiles, taste notes, and optimal consumption times for beloved sweets. Dive into a fusion of AI innovation and culinary tradition.

chatgpt dalle2 dessert finetuning gpt4 gpt4v llama2 llava multi-modality multimodal nutrition nutrition-information stable-diffusion tranformers vision-language-learning vision-language-model

Last synced: 15 Nov 2024

https://github.com/yuanze-lin/olympus

The official code for "Olympus: A Universal Task Router for Computer Vision Tasks"

chatbot chatgpt deeplearning foundation-models instruction-tuning llava llms mllms multi-modality multimodal pytorch vision-language-model

Last synced: 14 Dec 2024