https://github.com/x-plug/mplug-2
mPLUG-2: A Modularized Multi-modal Foundation Model Across Text, Image and Video (ICML 2023)
- Host: GitHub
- URL: https://github.com/x-plug/mplug-2
- Owner: X-PLUG
- License: apache-2.0
- Created: 2023-05-22T13:09:51.000Z (about 3 years ago)
- Default Branch: main
- Last Pushed: 2023-07-21T06:29:21.000Z (almost 3 years ago)
- Last Synced: 2025-05-30T12:20:25.654Z (about 1 year ago)
- Topics: foundation-models, image-retrieval, mllm, mplug, multimodal, multimodal-pretraining, video, video-question-answering, video-retrieval, vqa
- Language: Python
- Homepage:
- Size: 2.36 MB
- Stars: 227
- Watchers: 4
- Forks: 20
- Open Issues: 17