https://github.com/x-plug/mplug

mPLUG: Effective and Efficient Vision-Language Learning by Cross-modal Skip-connections. (EMNLP 2022)

Host: GitHub
URL: https://github.com/x-plug/mplug
Owner: X-PLUG
Created: 2023-05-08T07:32:30.000Z (about 3 years ago)
Default Branch: main
Last Pushed: 2023-05-08T07:45:16.000Z (about 3 years ago)
Last Synced: 2025-05-30T12:20:25.634Z (about 1 year ago)
Topics: image-captioning, image-text, image-text-retrieval, multimodal, pretraining, pytorch, transformer, visual-language, vqa
Language: Python
Homepage: https://arxiv.org/abs/2205.12005
Size: 1.56 MB
Stars: 91
Watchers: 2
Forks: 8
Open Issues: 11