Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with multi-modal

A curated list of projects in awesome lists tagged with multi-modal.

https://github.com/kyegomez/visionllama

Implementation of VisionLLaMA from the paper: "VisionLLaMA: A Unified LLaMA Interface for Vision Tasks" in PyTorch and Zeta

ai deep-learning multi-modal vision-models vision-transformers vit

Last synced: 09 Nov 2024

https://github.com/kyegomez/tinygptv

Simple Implementation of TinyGPTV in super simple Zeta lego blocks

artificial-intelligence attention attention-is-all-you-need deep-learning multi-modal multi-modality transformers

Last synced: 09 Nov 2024

https://github.com/kyegomez/mlxtransformer

Simple Implementation of a Transformer in the new framework MLX by Apple

artificial-intelligence gpt4 machine-learning multi-modal multi-modality

Last synced: 09 Nov 2024

https://github.com/kyegomez/multimodal-tot

Multi-Modal Tree of Thoughts for DALL-E 3-like automatic self-improvement

artificial-intelligence gpt4 multi-modal multi-modality multi-modality-data

Last synced: 09 Nov 2024

https://github.com/kyegomez/hsss

Implementation of a Hierarchical Mamba as described in the paper: "Hierarchical State Space Models for Continuous Sequence-to-Sequence Modeling"

ai artificial-intelligence jesus machine-learning ml multi-modal multi-modality open-source pytorch rnn rnns ssms tensorflow zeta

Last synced: 09 Nov 2024

https://github.com/kyegomez/m2pt

Implementation of M2PT in PyTorch from the paper: "Multimodal Pathway: Improve Transformers with Irrelevant Data from Other Modalities"

ai attention attention-is-all-you-need gpt4 gpt5 llama ml models mulit-modality multi-modal

Last synced: 10 Oct 2024

https://github.com/kookmin-sw/capstone-2020-2

FBI: Facial expression & brainwave signals based emotion recognition and analysis web service

analysis brain-waves deep-learning django emotion-recognition facial-expressions fbi multi-modal python react

Last synced: 13 Nov 2024

https://github.com/kyegomez/qwen-vl

My personal implementation of the model from "Qwen-VL: A Frontier Large Vision-Language Model with Versatile Abilities", they haven't released model code yet sooo...

ai artificial-intelligence attention attention-is-all-you-need gemini gpt-4 gpt4 llama ml multi-modal open-source-ai

Last synced: 10 Oct 2024

https://github.com/kyegomez/visiondatasets

Open source scripts to create large scale datasets with rich detail for multi-modal models

ai artificial-intelligence function-calling gpt3 gpt4 json machine-learning ml multi-modal multi-modality pytorch tensorflow

Last synced: 09 Nov 2024

https://github.com/kyegomez/gats

Implementation of GATS from the paper: "GATS: Gather-Attend-Scatter" in PyTorch and Zeta

ai attention attention-is-all-you-need attention-mechanism gpt4 llama ml multi-modal multi-modality multimodal open-source

Last synced: 10 Oct 2024

https://github.com/onolab-tmu/blinky-iva

Multimodal formulation of IVA using conventional microphones and power sensing blinkies.

blind-source-separation blinky independent-vector-analysis microphone-array multi-modal unsupervised-learning

Last synced: 30 Nov 2024

https://github.com/zjysteven/vlm-visualizer

Visualizing the attention of vision-language models

attention attention-mechanism llava multi-modal vision-language vision-language-model

Last synced: 02 Nov 2024

https://github.com/kyegomez/midas

Implementation of Midas from the paper: "Towards Robust Monocular Depth Estimation" in PyTorch and Zeta

ai artificial-intelligence ml multi-modal parallel python pytorch tensorflow vision-models

Last synced: 09 Nov 2024

https://github.com/ashvardanian/tenpack

Fast Tensors Packaging library for text, image, video, and audio data compatible with PyTorch, TensorFlow, & NumPy 🖼️🎵🎥 ➡️ 🧠

clip laion multi-modal numpy parser pytorch simd tensor tensorflow transformer

Last synced: 28 Oct 2024

https://github.com/janteichertkluge/dmlsim

This library provides packages on DoubleML / Causal Machine Learning and Neural Networks in Python for Simulation and Case Studies.

beit bert case-study causal causal-inference causal-machine-learning deep-learning dgp double-machine-learning doubleml machine-learning multi-modal multimodal multimodal-deep-learning neural-network simulation transformer transformers

Last synced: 14 Oct 2024

https://github.com/kyegomez/hedgehog

Implementation of the model "Hedgehog" from the paper: "The Hedgehog & the Porcupine: Expressive Linear Attentions with Softmax Mimicry"

ai attention attention-is-all-you-need attention-mechanisms feedforward ffns ml mlps multi-modal neural-nets open-source opensource-ai softmax

Last synced: 09 Nov 2024

https://github.com/kyegomez/aoa-torch

Implementation of Attention on Attention in Zeta

ai artificial-intelligence gpt4 machine-learning multi-modal multi-modality research

Last synced: 09 Nov 2024

https://github.com/gangula-karthik/aicu-bike-search

Find Your Stolen Bike Lah! With AICU, We Kena Spot Your Bicycle on Carousell One Shot 🚲🔍💨

chromadb clip computer-vision fastapi multi-modal natural-language-processing nextjs typescript

Last synced: 05 Nov 2024

https://github.com/pnnl/transmed

Transfer Learning From Existing Diseases Via Hierarchical Multi-Modal BERT Models

bert-model disease-prediction hierarchical-models multi-modal transfer-learning

Last synced: 25 Nov 2024

https://github.com/garethjns/msimodels

Exploring multi-sensory integration and decision making in biologically inspired deep neural networks.

audiodag brain decision-making deep-neural-networks event-detection keras-neural-networks lstm matlab multi-modal multisensory-processing psychophysics

Last synced: 09 Nov 2024

https://github.com/upgundecha/applied-ai

A repository of curated use cases, articles, blogs, videos on how companies are using Artificial Intelligence and Machine Learning.

artificial-intelligence deep-learning engineering-blogs generative-ai large-language-models machine-learning multi-modal prompt-engineering retrieval-augmented-generation use-cases vector-database

Last synced: 15 Oct 2024

https://github.com/alessioborgi/stylealigned_multireference-multimodal

Novel framework for Zero-Shot Style Alignment in Text-to-Image generation, incorporating Multi-Modal Context-Awareness and Multi-Reference Style Alignment, using minimal attention sharing, ensuring consistent style transfer without fine-tuning.

adain blip clap context-awareness multi-modal multi-style-transfer no-fine-tuning shared-attention-heads style-aligned text-to-image-generation whisper zero-shot-learning

Last synced: 18 Oct 2024

https://github.com/lanl/epbd-bert

Transcription factor binding site prediction for novel DNA sequence data aiding in mutation identification and drug discovery

cross-attention dnabert-model epbd multi-modal transformers-bert

Last synced: 09 Dec 2024

https://github.com/sachs7/multi-modal-langchain-chatbot

A multi-modal chatbot built with LangChain that supports RAG, PapersWithCode, and image generation using Dall-E-3

chatbot dall-e-3 langchain langchain-agent multi-modal openai-chatbot paperswithcode rag

Last synced: 10 Dec 2024

https://github.com/liu42/contrastive

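Project based on Problem B of the 2024 "Teddy Cup" Data Mining Challenge: a cross-modal image-text retrieval model using contrastive learning in a shared feature space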

bert cnn computer-vision contrastive-learning deep-learning image-text-retrieval image-text-search multi-modal multi-modal-learning nlp pytorch roberta transformers

Last synced: 13 Dec 2024

https://github.com/dermatologist/kedro-tf-utils

Kedro pipelines for multimodal ML in TensorFlow.

hacktoberfest healthcare kedro multi-modal tensorflow

Last synced: 21 Dec 2024

https://github.com/jianzhnie/lmmrobot

LMMRobot is a professional end-to-end development framework that uses multimodal large models to enable embodied intelligent robot development.

aloha mobile-aloha mujuco multi-modal reinforcement-learning robotics transformer

Last synced: 01 Jan 2025

https://github.com/sachs7/multi-modal-chatbot-with-agentic-rag

A multi-modal chatbot built with LangChain that also supports agentic RAG, Dall-E-3 images, and PapersWithCode

chatbot dall-e-3 langchain langchain-agent multi-modal openai-chatbot paperswithcode rag

Last synced: 10 Dec 2024

https://github.com/amazon-science/contrastive_emc2

Code for the ICML 2024 paper: "EMC^2: Efficient MCMC Negative Sampling for Contrastive Learning with Global Convergence"

contrastive-learning deep-neural-networks machine-learning machine-learning-algorithms mcmc-sampling multi-modal multi-modal-learning

Last synced: 12 Nov 2024

https://github.com/ammarlodhi255/metadata-augmented-neural-networks-for-wild-animal-classification

This repository contains the implementation code for the paper "Metadata Augmented Neural Networks For Wild Animal Classification".

deep-learning fusion-techniques metadata metadata-fusion multi-modal multi-modal-learning wild-animal-classification wild-life-monitoring

Last synced: 17 Nov 2024

https://github.com/datafog/vlm-api

REST API for computing cross-modal similarity between images and text using the ColPaLI vision-language model

colpali information-retrieval multi-modal rag retrieval-augmented-generation retrieval-systems vision-language-model

Last synced: 10 Nov 2024

https://github.com/aisuko/multimodal-mimic

Multi-modal LLM and traditional ML models for ICU modality prediction on MIMIC-III across various time windows.

ai logistic-regression mimic-iii modality-classification multi-modal neural-network random-forest-classifier transfer-learning xgboost-classifier

Last synced: 13 Dec 2024