Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with video-conversation

A curated list of projects in awesome lists tagged with video-conversation.

https://github.com/mbzuai-oryx/video-chatgpt

[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted for spatiotemporal video representation. We also introduce a rigorous 'Quantitative Evaluation Benchmarking' for video-based conversational models.

chatbot clip gpt-4 llama llava mulit-modal vicuna video-chatboat video-conversation vision-language vision-language-pretraining

Last synced: 19 Dec 2024

https://github.com/mbzuai-oryx/Video-ChatGPT

[ACL 2024 🔥] Video-ChatGPT is a video conversation model capable of generating meaningful conversation about videos. It combines the capabilities of LLMs with a pretrained visual encoder adapted for spatiotemporal video representation. We also introduce a rigorous 'Quantitative Evaluation Benchmarking' for video-based conversational models.

chatbot clip gpt-4 llama llava mulit-modal vicuna video-chatboat video-conversation vision-language vision-language-pretraining

Last synced: 24 Oct 2024

https://github.com/mbzuai-oryx/video-llava

PG-Video-LLaVA: Pixel Grounding in Large Multimodal Video Models

grounding llm lmm transcription video video-conversation video-grounding

Last synced: 18 Dec 2024

https://github.com/mbzuai-oryx/Video-LLaVA

PG-Video-LLaVA: Pixel Grounding in Large Multimodal Video Models

grounding llm lmm transcription video video-conversation video-grounding

Last synced: 30 Nov 2024

https://github.com/mbzuai-oryx/videogpt-plus

Official repository of the paper VideoGPT+: Integrating Image and Video Encoders for Enhanced Video Understanding

chatbot clip dual-encoder gpt4 gpt4o image-encoder llama3 llava multimodal phi-3-mini vicuna video-chatbot video-conversation video-encoder vision-language vision-language-pretraining

Last synced: 20 Dec 2024

https://github.com/mbzuai-oryx/VideoGPT-plus

Official repository of the paper VideoGPT+: Integrating Image and Video Encoders for Enhanced Video Understanding

chatbot clip dual-encoder gpt4 gpt4o image-encoder llama3 llava multimodal phi-3-mini vicuna video-chatbot video-conversation video-encoder vision-language vision-language-pretraining

Last synced: 12 Dec 2024