https://github.com/OpenGVLab/InternVideo
[ECCV2024] Video Foundation Models & Data for Multimodal Understanding
- Host: GitHub
- URL: https://github.com/OpenGVLab/InternVideo
- Owner: OpenGVLab
- License: apache-2.0
- Created: 2022-11-23T12:57:00.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2024-12-07T16:20:41.000Z (5 months ago)
- Last Synced: 2024-12-09T08:52:15.241Z (4 months ago)
- Topics: action-recognition, benchmark, contrastive-learning, foundation-models, instruction-tuning, masked-autoencoder, multimodal, open-set-recognition, self-supervised, spatio-temporal-action-localization, temporal-action-localization, video-clip, video-data, video-dataset, video-question-answering, video-retrieval, video-understanding, vision-transformer, zero-shot-classification, zero-shot-retrieval
- Language: Python
- Homepage:
- Size: 53.2 MB
- Stars: 1,452
- Watchers: 27
- Forks: 90
- Open Issues: 92
Metadata Files:
- Readme: README.md
- License: LICENSE