Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

https://github.com/AILab-CVC/UniRepLKNet

[CVPR'24] UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio, Video, Point Cloud, Time-Series and Image Recognition

architecture artificial-intelligence convolutional-neural-networks deep-learning multimodal-learning

Last synced: 02 Jul 2024

https://github.com/Pointcept/GPT4Point

[CVPR'24 Highlight] GPT4Point: A Unified Framework for Point-Language Understanding and Generation.

3d-generation llm multimodal-learning

Last synced: 01 Jul 2024

https://github.com/HUANGLIZI/LViT

[IEEE Transactions on Medical Imaging/TMI] This repo is the official implementation of "LViT: Language meets Vision Transformer in Medical Image Segmentation"

medical-image-analysis multimodal-learning pytorch segmentation vision-language

Last synced: 28 Jun 2024

https://github.com/declare-lab/multimodal-deep-learning

This repository contains various models targeting multimodal representation learning and multimodal fusion for downstream tasks such as multimodal sentiment analysis.

multimodal-deep-learning multimodal-interactions multimodal-learning multimodal-sentiment-analysis

Last synced: 16 Jun 2024

https://github.com/ArrowLuo/CLIP4Clip

An official implementation for "CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval"

activitynet clip didemo lsmdc msrvtt msvd multimodal multimodal-learning multimodality ranking retrieval retrieval-model search video-clip-retrieval video-text-retrieval

Last synced: 07 Jun 2024

https://github.com/miccunifi/SEARLE

[ICCV 2023] - Zero-shot Composed Image Retrieval with Textual Inversion

circo cirr clip composed-image-retrieval fashion-iq knowledge-distillation multimodal-learning pytorch textual-inversion

Last synced: 07 Jun 2024

https://github.com/Eurus-Holmes/Awesome-Multimodal-Research

A curated list of Multimodal Related Research.

awesome multimodal multimodal-learning multimodal-research

Last synced: 14 May 2024

https://github.com/subho406/OmniNet

Official PyTorch implementation of "OmniNet: A unified architecture for multi-modal multi-task learning" | Authors: Subhojeet Pramanik, Priyanka Agrawal, Aman Hussain

artificial-intelligence deep-learning image-captioning machine-learning multimodal-learning multitask-learning neural-network nlp transformer video-recognition

Last synced: 27 Apr 2024

https://github.com/sangminwoo/awesome-vision-and-language

A curated list of awesome vision and language resources (still under construction... stay tuned!)

awesome awesome-list multimodal-learning vision-and-language

Last synced: 20 Apr 2024

https://github.com/HuaizhengZhang/Awsome-Deep-Learning-for-Video-Analysis

Papers, code and datasets about deep learning and multi-modal learning for video analysis

deep-learning machine-learning multimodal-learning paper video-analysis video-classification video-dataset

Last synced: 11 Apr 2024

https://github.com/ImKeTT/Awesome-Multi-Modal-Dialog

[Paperlist] Awesome paper list of multimodal dialog, including methods, datasets and metrics

awesome-list dialogue dialogue-system multimodal multimodal-datasets multimodal-deep-learning multimodal-dialogue multimodal-learning paperlist

Last synced: 09 Apr 2024

https://github.com/georgian-io/Multimodal-Toolkit

Multimodal model for text and tabular data with HuggingFace transformers as building block for text data

huggingface-transformers multimodal-learning natural-language-processing tabular-data transformer

Last synced: 08 Apr 2024

https://github.com/ys-zong/awesome-self-supervised-multimodal-learning

A curated list of self-supervised multimodal learning resources.

awesome-list machine-learning multimodal-learning self-supervised-learning

Last synced: 05 Apr 2024

https://github.com/DmitryRyumin/ICASSP-2023-24-Papers

ICASSP 2023-2024 Papers: A complete collection of influential and exciting research papers from the ICASSP 2023-24 conferences. Explore the latest advancements in acoustics, speech and signal processing. Code included. Star the repository to support the advancement of audio and signal processing!

asr denoising domain-adaptation face-recognition generative-models icassp icassp2023 icassp2024 image-generation keyword-spotting language-modeling multimodal-learning music-generation self-supervised-learning semantic-segmentation signal-processing signal-restoration speech-recognition spoken-language-understanding vad

Last synced: 31 Mar 2024

https://github.com/mlfoundations/open_flamingo

An open-source framework for training large multimodal models.

computer-vision deep-learning flamingo in-context-learning language-model multimodal-learning pytorch

Last synced: 19 Mar 2024

https://github.com/pykale/pykale

Knowledge-Aware machine LEarning (KALE): accessible machine learning from multiple sources for interdisciplinary research, part of the 🔥PyTorch ecosystem. ⭐ Star to support our work!

computer-vision data-science deep-learning domain-adaptation graph-analysis knowledge-aware-learning machine-learning medical-image-analysis meta-learning multimodal multimodal-learning python pytorch transfer-learning

Last synced: 17 Mar 2024