Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/AILab-CVC/UniRepLKNet
[CVPR'24] UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio, Video, Point Cloud, Time-Series and Image Recognition
architecture artificial-intelligence convolutional-neural-networks deep-learning multimodal-learning
Last synced: 02 Jul 2024
https://github.com/Pointcept/GPT4Point
[CVPR'24 Highlight] GPT4Point: A Unified Framework for Point-Language Understanding and Generation.
3d-generation llm multimodal-learning
Last synced: 01 Jul 2024
https://github.com/ai4ce/EgoPAT3D
[CVPR 2022] Egocentric Action Target Prediction in 3D
3d-computer-vision computer-vision dataset deep-learning egocentric-vision human-robot-collaboration human-robot-interaction machine-learning multimodal-learning target-prediction wearable-robotics
Last synced: 01 Jul 2024
https://github.com/HUANGLIZI/LViT
[IEEE Transactions on Medical Imaging/TMI] This repo is the official implementation of "LViT: Language meets Vision Transformer in Medical Image Segmentation"
medical-image-analysis multimodal-learning pytorch segmentation vision-language
Last synced: 28 Jun 2024
https://github.com/richard-peng-xia/awesome-multimodal-in-medical-imaging
A collection of resources on applications of multi-modal learning in medical imaging.
large-language-models large-multimodal-models medical-imaging medical-report-generation multimodal-deep-learning multimodal-large-language-models multimodal-learning visual-question-answering
Last synced: 25 Jun 2024
https://github.com/declare-lab/multimodal-deep-learning
This repository contains various models targeting multimodal representation learning and multimodal fusion for downstream tasks such as multimodal sentiment analysis.
multimodal-deep-learning multimodal-interactions multimodal-learning multimodal-sentiment-analysis
Last synced: 16 Jun 2024
https://github.com/ArrowLuo/CLIP4Clip
An official implementation for "CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval"
activitynet clip didemo lsmdc msrvtt msvd multimodal multimodal-learning multimodality ranking retrieval retrieval-model search video-clip-retrieval video-text-retrieval
Last synced: 07 Jun 2024
https://github.com/microsoft/XPretrain
Multi-modality pre-training
computer-vision multimedia multimodal-learning nlp pre-training
Last synced: 07 Jun 2024
https://github.com/miccunifi/SEARLE
[ICCV 2023] - Zero-shot Composed Image Retrieval with Textual Inversion
circo cirr clip composed-image-retrieval fashion-iq knowledge-distillation multimodal-learning pytorch textual-inversion
Last synced: 07 Jun 2024
https://github.com/Eurus-Holmes/Awesome-Multimodal-Research
A curated list of Multimodal Related Research.
awesome multimodal multimodal-learning multimodal-research
Last synced: 14 May 2024
https://github.com/pliang279/awesome-multimodal-ml
Reading list for research topics in multimodal machine learning
computer-vision deep-learning healthcare machine-learning multimodal-learning natural-language-processing reading-list reinforcement-learning representation-learning robotics speech-processing
Last synced: 14 May 2024
https://github.com/pliang279/MultiBench
[NeurIPS 2021] Multiscale Benchmarks for Multimodal Representation Learning
computer-vision deep-learning healthcare machine-learning multimodal-learning natural-language-processing representation-learning robotics speech-processing
Last synced: 13 May 2024
https://github.com/PreferredAI/cornac
A Comparative Framework for Multimodal Recommender Systems
collaborative-filtering matrix-factorization multimodal-learning multimodality recommendation-algorithms recommendation-engine recommendation-system recommender-system
Last synced: 12 May 2024
https://github.com/subho406/OmniNet
Official PyTorch implementation of "OmniNet: A unified architecture for multi-modal multi-task learning" | Authors: Subhojeet Pramanik, Priyanka Agrawal, Aman Hussain
artificial-intelligence deep-learning image-captioning machine-learning multimodal-learning multitask-learning neural-network nlp transformer video-recognition
Last synced: 27 Apr 2024
https://github.com/willxxy/awesome-mmps
Corpus of resources for multimodal machine learning with physiological signals
deep-learning machine-learning multimodal multimodal-data multimodal-deep-learning multimodal-learning physiological-signals signal-processing
Last synced: 23 Apr 2024
https://github.com/sangminwoo/awesome-vision-and-language
A curated list of awesome vision and language resources (still under construction... stay tuned!)
awesome awesome-list multimodal-learning vision-and-language
Last synced: 20 Apr 2024
https://github.com/ilaria-manco/multimodal-ml-music
List of academic resources on Multimodal ML for Music
academic-publications awesome-list multimodal-data multimodal-deep-learning multimodal-learning music-ai music-information-retrieval music-research resources
Last synced: 13 Apr 2024
https://github.com/HuaizhengZhang/Awsome-Deep-Learning-for-Video-Analysis
Papers, code and datasets about deep learning and multi-modal learning for video analysis
deep-learning machine-learning multimodal-learning paper video-analysis video-classification video-dataset
Last synced: 11 Apr 2024
https://github.com/ImKeTT/Awesome-Multi-Modal-Dialog
[Paperlist] Awesome paper list of multimodal dialog, including methods, datasets and metrics
awesome-list dialogue dialogue-system multimodal multimodal-datasets multimodal-deep-learning multimodal-dialogue multimodal-learning paperlist
Last synced: 09 Apr 2024
https://github.com/georgian-io/Multimodal-Toolkit
Multimodal model for text and tabular data with HuggingFace transformers as building block for text data
huggingface-transformers multimodal-learning natural-language-processing tabular-data transformer
Last synced: 08 Apr 2024
https://github.com/machine-intelligence-laboratory/TopicNet
Interface for easier topic modelling.
bigartm-library custom-score document-representation modalities multimodal-data multimodal-learning pypi topic-modeling topic-modelling
Last synced: 08 Apr 2024
https://github.com/MMMU-Benchmark/MMMU
This repo contains evaluation code for the paper "MMMU: A Massive Multi-discipline Multimodal Understanding and Reasoning Benchmark for Expert AGI"
computer-vision deep-learning deep-neural-networks evaluation foundation-models large-language-models large-multimodal-models llm llms machine-learning multimodal multimodal-deep-learning multimodal-learning multimodality natural-language-processing question-answering stem visual-question-answering
Last synced: 07 Apr 2024
https://github.com/ys-zong/awesome-self-supervised-multimodal-learning
A curated list of self-supervised multimodal learning resources.
awesome-list machine-learning multimodal-learning self-supervised-learning
Last synced: 05 Apr 2024
https://github.com/DmitryRyumin/ICASSP-2023-24-Papers
ICASSP 2023-2024 Papers: A complete collection of influential and exciting research papers from the ICASSP 2023-24 conferences. Explore the latest advancements in acoustics, speech and signal processing. Code included. Star the repository to support the advancement of audio and signal processing!
asr denoising domain-adaptation face-recognition generative-models icassp icassp2023 icassp2024 image-generation keyword-spotting language-modeling multimodal-learning music-generation self-supervised-learning semantic-segmentation signal-processing signal-restoration speech-recognition spoken-language-understanding vad
Last synced: 31 Mar 2024
https://github.com/SkBlaz/autobot
An autoML for explainable text classification.
automl automl-algorithms automl-experiments classification data-mining data-science distributed-computing ensemble-learning evolutionary-algorithms machine-learning multimodal-learning natural-language-processing nlp python representation-learning sparse-matrices text-classification transfer-learning transformers-models
Last synced: 31 Mar 2024
https://github.com/mlfoundations/open_flamingo
An open-source framework for training large multimodal models.
computer-vision deep-learning flamingo in-context-learning language-model multimodal-learning pytorch
Last synced: 19 Mar 2024
https://github.com/pykale/pykale
Knowledge-Aware machine LEarning (KALE): accessible machine learning from multiple sources for interdisciplinary research, part of the 🔥PyTorch ecosystem. ⭐ Star to support our work!
computer-vision data-science deep-learning domain-adaptation graph-analysis knowledge-aware-learning machine-learning medical-image-analysis meta-learning multimodal multimodal-learning python pytorch transfer-learning
Last synced: 17 Mar 2024