Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Projects in Awesome Lists tagged with multi-modal
A curated list of projects in awesome lists tagged with multi-modal.
https://github.com/kyegomez/visionllama
Implementation of VisionLLaMA from the paper: "VisionLLaMA: A Unified LLaMA Interface for Vision Tasks" in PyTorch and Zeta
ai deep-learning multi-modal vision-models vision-transformers vit
Last synced: 09 Nov 2024
https://github.com/kyegomez/tinygptv
Simple Implementation of TinyGPTV in super simple Zeta lego blocks
artificial-intelligence attention attention-is-all-you-need deep-learning multi-modal multi-modality transformers
Last synced: 09 Nov 2024
https://github.com/kyegomez/awesome-robotic-foundation-models
A vast array of Multi-Modal Embodied Robotic Foundation Models!
ai artificial-intelligence artificial-neural-networks functions machine-learning ml multi-modal robotics
Last synced: 09 Nov 2024
https://github.com/kyegomez/mlxtransformer
Simple Implementation of a Transformer in the new framework MLX by Apple
artificial-intelligence gpt4 machine-learning multi-modal multi-modality
Last synced: 09 Nov 2024
https://github.com/kyegomez/multimodal-tot
Multi-Modal Tree of thoughts for DALLE-3 like auto self improvement
artificial-intelligence gpt4 multi-modal multi-modality multi-modality-data
Last synced: 09 Nov 2024
https://github.com/kyegomez/hsss
Implementation of a Hierarchical Mamba as described in the paper: "Hierarchical State Space Models for Continuous Sequence-to-Sequence Modeling"
ai artificial-intelligence jesus machine-learning ml multi-modal multi-modality open-source pytorch rnn rnns ssms tensorflow zeta
Last synced: 09 Nov 2024
https://github.com/kyegomez/m2pt
Implementation of M2PT in PyTorch from the paper: "Multimodal Pathway: Improve Transformers with Irrelevant Data from Other Modalities"
ai attention attention-is-all-you-need gpt4 gpt5 llama ml models mulit-modality multi-modal
Last synced: 10 Oct 2024
https://github.com/agora-lab-ai/atom
A suite of fine-tuned LLMs for atomically precise function calling 🧪
ai artificial-intelligence convolutional-neural-networks function-calling gpt-4 llama llama2 llamacpp ml multi-modal open-source rpa rpc task-automation tool-usage transformer workflow-automation
Last synced: 10 Nov 2024
https://github.com/kookmin-sw/capstone-2020-2
FBI: Facial expression & brainwave signals based emotion recognition and analysis web service
analysis brain-waves deep-learning django emotion-recognition facial-expressions fbi multi-modal python react
Last synced: 13 Nov 2024
https://github.com/kyegomez/qwen-vl
My personal implementation of the model from "Qwen-VL: A Frontier Large Vision-Language Model with Versatile Abilities", they haven't released model code yet sooo...
ai artificial-intelligence attention attention-is-all-you-need gemini gpt-4 gpt4 llama ml multi-modal open-source-ai
Last synced: 10 Oct 2024
https://github.com/kyegomez/visiondatasets
Open source scripts to create large scale datasets with rich detail for multi-modal models
ai artificial-intelligence function-calling gpt3 gpt4 json machine-learning ml multi-modal multi-modality pytorch tensorflow
Last synced: 09 Nov 2024
https://github.com/kyegomez/gats
Implementation of GATS from the paper: "GATS: Gather-Attend-Scatter" in PyTorch and Zeta
ai attention attention-is-all-you-need attention-mechanism gpt4 llama ml multi-modal multi-modality multimodal open-source
Last synced: 10 Oct 2024
https://github.com/onolab-tmu/blinky-iva
Multimodal formulation of IVA using conventional microphones and power sensing blinkies.
blind-source-separation blinky independent-vector-analysis microphone-array multi-modal unsupervised-learning
Last synced: 30 Nov 2024
https://github.com/zjysteven/vlm-visualizer
Visualizing the attention of vision-language models
attention attention-mechanism llava multi-modal vision-language vision-language-model
Last synced: 02 Nov 2024
https://github.com/kyegomez/midas
Implementation of Midas from "Towards Robust Monocular Depth Estimation" in PyTorch and Zeta
ai artificial-intelligence ml multi-modal parallel python pytorch tensorflow vision-models
Last synced: 09 Nov 2024
https://github.com/kyegomez/celestial-1
Omni-Modality Processing, Understanding, and Generation
attention attention-is-all-you-need attention-mechanisms gpt-4 gpt4 multi-modal multimodal multimodal-deep-learning multimodality omnimodal openai
Last synced: 09 Nov 2024
https://github.com/yifanfeng97/multi-modal-generation-for-shrec22
Multi-modal data generation for 3D objects.
3d blender data-generation multi-modal multi-view pointclouds voxel
Last synced: 28 Oct 2024
https://github.com/ashvardanian/tenpack
Fast Tensors Packaging library for text, image, video, and audio data compatible with PyTorch, TensorFlow, & NumPy 🖼️🎵🎥 ➡️ 🧠
clip laion multi-modal numpy parser pytorch simd tensor tensorflow transformer
Last synced: 28 Oct 2024
https://github.com/janteichertkluge/dmlsim
This library provides packages on DoubleML / Causal Machine Learning and Neural Networks in Python for Simulation and Case Studies.
beit bert case-study causal causal-inference causal-machine-learning deep-learning dgp double-machine-learning doubleml machine-learning multi-modal multimodal multimodal-deep-learning neural-network simulation transformer transformers
Last synced: 14 Oct 2024
https://github.com/kyegomez/hedgehog
Implementation of the model "Hedgehog" from the paper: "The Hedgehog & the Porcupine: Expressive Linear Attentions with Softmax Mimicry"
ai attention attention-is-all-you-need attention-mechanisms feedforward ffns ml mlps multi-modal neural-nets open-source opensource-ai softmax
Last synced: 09 Nov 2024
https://github.com/kyegomez/aoa-torch
Implementation of Attention on Attention in Zeta
ai artificial-intelligence gpt4 machine-learning multi-modal multi-modality research
Last synced: 09 Nov 2024
https://github.com/gmierz/pupil-lib-matlab
Pupil Data Epoch Extraction Library
academic data-synchronization epoching eye eye-tracking labstreaminglayer lsl multi-modal pupil-labs trial
Last synced: 19 Dec 2024
https://github.com/gangula-karthik/aicu-bike-search
Find Your Stolen Bike Lah! With AICU, We Kena Spot Your Bicycle on Carousell One Shot 🚲🔍💨
chromadb clip computer-vision fastapi multi-modal natural-language-processing nextjs typescript
Last synced: 05 Nov 2024
https://github.com/pnnl/transmed
Transfer Learning From Existing Diseases Via Hierarchical Multi-Modal BERT Models
bert-model disease-prediction hierarchical-models multi-modal transfer-learning
Last synced: 25 Nov 2024
https://github.com/garethjns/msimodels
Exploring multi-sensory integration and decision making in biologically inspired deep neural networks.
audiodag brain decision-making deep-neural-networks event-detection keras-neural-networks lstm matlab multi-modal multisensory-processing psychophysics
Last synced: 09 Nov 2024
https://github.com/upgundecha/applied-ai
A repository of curated use cases, articles, blogs, videos on how companies are using Artificial Intelligence and Machine Learning.
artificial-intelligence deep-learning engineering-blogs generative-ai large-language-models machine-learning multi-modal prompt-engineering retrieval-augmented-generation use-cases vector-database
Last synced: 15 Oct 2024
https://github.com/alessioborgi/stylealigned_multireference-multimodal
Novel framework for Zero-Shot Style Alignment in Text-to-Image generation, incorporating Multi-Modal Context-Awareness and Multi-Reference Style Alignment, using minimal attention sharing, ensuring consistent style transfer without fine-tuning.
adain blip clap context-awareness multi-modal multi-style-transfer no-fine-tuning shared-attention-heads style-aligned text-to-image-generation whisper zero-shot-learning
Last synced: 18 Oct 2024
https://github.com/lanl/epbd-bert
Transcription factor binding site prediction for novel DNA sequence data aiding in mutation identification and drug discovery
cross-attention dnabert-model epbd multi-modal transformers-bert
Last synced: 09 Dec 2024
https://github.com/sachs7/multi-modal-langchain-chatbot
A multi-modal chatbot with LangChain that supports RAG, PapersWithCode, and image generation using Dall-E-3
chatbot dall-e-3 langchain langchain-agent multi-modal openai-chatbot paperswithcode rag
Last synced: 10 Dec 2024
https://github.com/liu42/contrastive
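Project based on Problem B of the 2024 "Teddy Cup" Data Mining Challenge: a cross-modal image-text mutual retrieval model using contrastive learning in a shared feature space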
bert cnn computer-vision contrastive-learning deep-learning image-text-retrieval image-text-search multi-modal multi-modal-learning nlp pytorch roberta transformers
Last synced: 13 Dec 2024
https://github.com/dermatologist/kedro-tf-utils
Kedro pipelines for multimodal ML in TensorFlow.
hacktoberfest healthcare kedro multi-modal tensorflow
Last synced: 21 Dec 2024
https://github.com/jianzhnie/lmmrobot
LMMRobot is a professional end-to-end development framework that uses multimodal large models to enable embodied intelligent robot development.
aloha mobile-aloha mujuco multi-modal reinforcement-learning robotics transformer
Last synced: 01 Jan 2025
https://github.com/sachs7/multi-modal-chatbot-with-agentic-rag
A Multi-Modal Chatbot with LangChain that also supports agentic RAG, Dall-E-3 images, and PapersWithCode
chatbot dall-e-3 langchain langchain-agent multi-modal openai-chatbot paperswithcode rag
Last synced: 10 Dec 2024
https://github.com/amazon-science/contrastive_emc2
Code for the ICML 2024 paper: "EMC^2: Efficient MCMC Negative Sampling for Contrastive Learning with Global Convergence"
contrastive-learning deep-neural-networks machine-learning machine-learning-algorithms mcmc-sampling multi-modal multi-modal-learning
Last synced: 12 Nov 2024
https://github.com/pankajarm/multi-modal-cloud-search
Build & Host multi-modal cloud search
clip deep-learning embeddings multi-modal neural-search
Last synced: 13 Nov 2024
https://github.com/ammarlodhi255/metadata-augmented-neural-networks-for-wild-animal-classification
This repository contains the implementation code for the paper "Metadata Augmented Neural Networks For Wild Animal Classification".
deep-learning fusion-techniques metadata metadata-fusion multi-modal multi-modal-learning wild-animal-classification wild-life-monitoring
Last synced: 17 Nov 2024
https://github.com/datafog/vlm-api
REST API for computing cross-modal similarity between images and text using the ColPaLI vision-language model
colpali information-retrieval multi-modal rag retrieval-augmented-generation retrieval-systems vision-language-model
Last synced: 10 Nov 2024
https://github.com/aisuko/multimodal-mimic
Multi-modal LLM and traditional ML models for ICU modality prediction on MIMIC-III across various time windows.
ai logistic-regression mimic-iii modality-classification multi-modal neural-network random-forest-classifier transfer-learning xgboost-classifier
Last synced: 13 Dec 2024