Projects in Awesome Lists by AILab-CVC
A curated list of projects in awesome lists by AILab-CVC.
https://github.com/ailab-cvc/yolo-world
[CVPR 2024] Real-Time Open-Vocabulary Object Detection
Last synced: 12 May 2025
https://github.com/ailab-cvc/videocrafter
VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models
image-to-video text-to-video video-generation
Last synced: 14 May 2025
https://github.com/AILab-CVC/YOLO-World
[CVPR 2024] Real-Time Open-Vocabulary Object Detection
Last synced: 20 Mar 2025
https://github.com/AILab-CVC/VideoCrafter
VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models
image-to-video text-to-video video-generation
Last synced: 28 Mar 2025
https://ailab-cvc.github.io/videocrafter/
VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models
image-to-video text-to-video video-generation
Last synced: 28 Mar 2025
https://github.com/ailab-cvc/unireplknet
[CVPR'24] UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio, Video, Point Cloud, Time-Series and Image Recognition
architecture artificial-intelligence convolutional-neural-networks deep-learning multimodal-learning
Last synced: 15 May 2025
https://github.com/AILab-CVC/UniRepLKNet
[CVPR'24] UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio, Video, Point Cloud, Time-Series and Image Recognition
architecture artificial-intelligence convolutional-neural-networks deep-learning multimodal-learning
Last synced: 20 Mar 2025
https://github.com/StevenGrove/GPT4Tools
GPT4Tools is an intelligent system that can automatically decide, control, and utilize different visual foundation models, allowing the user to interact with images during a conversation.
Last synced: 21 Apr 2025
https://github.com/AILab-CVC/GPT4Tools
GPT4Tools is an intelligent system that can automatically decide, control, and utilize different visual foundation models, allowing the user to interact with images during a conversation.
Last synced: 19 Mar 2025
https://github.com/ailab-cvc/gpt4tools
GPT4Tools is an intelligent system that can automatically decide, control, and utilize different visual foundation models, allowing the user to interact with images during a conversation.
Last synced: 04 Apr 2025
https://github.com/ailab-cvc/seed
Official implementation of SEED-LLaMA (ICLR 2024).
foundation-model multimodal vision-language
Last synced: 09 Apr 2025
https://github.com/ailab-cvc/seed-bench
(CVPR 2024) A benchmark for evaluating Multimodal LLMs using multiple-choice questions.
Last synced: 16 May 2025
https://github.com/ailab-cvc/freenoise
[ICLR 2024] Code for FreeNoise based on VideoCrafter
aigc diffusion generative-model video-diffusion-model
Last synced: 06 Apr 2025
https://github.com/AILab-CVC/FreeNoise
[ICLR 2024] Code for FreeNoise based on VideoCrafter
aigc diffusion generative-model video-diffusion-model
Last synced: 28 Mar 2025
https://github.com/AILab-CVC/SEED-Bench
(CVPR 2024) A benchmark for evaluating Multimodal LLMs using multiple-choice questions.
Last synced: 24 Jul 2025
https://github.com/AILab-CVC/CV-VAE
[NeurIPS 2024] CV-VAE: A Compatible Video VAE for Latent Generative Video Models
Last synced: 28 Mar 2025
https://github.com/ailab-cvc/cv-vae
[NeurIPS 2024] CV-VAE: A Compatible Video VAE for Latent Generative Video Models
Last synced: 12 Apr 2025
https://github.com/ailab-cvc/talecrafter
[SIGGRAPH Asia 2023] An interactive story visualization tool that supports multiple characters
siggprah-asia siggraph-asia-2023 storycreation storytelling
Last synced: 27 Jan 2026
https://ailab-cvc.github.io/TaleCrafter/
[SIGGRAPH Asia 2023] An interactive story visualization tool that supports multiple characters
siggprah-asia siggraph-asia-2023 storycreation storytelling
Last synced: 27 Mar 2025
https://github.com/AILab-CVC/TaleCrafter
[SIGGRAPH Asia 2023] An interactive story visualization tool that supports multiple characters
siggprah-asia siggraph-asia-2023 storycreation storytelling
Last synced: 27 Mar 2025
https://github.com/ailab-cvc/animate-a-story
Retrieval-Augmented Video Generation for Telling a Story
Last synced: 27 Jan 2026
https://ailab-cvc.github.io/Animate-A-Story/
Retrieval-Augmented Video Generation for Telling a Story
Last synced: 27 Mar 2025
https://github.com/AILab-CVC/Animate-A-Story
Retrieval-Augmented Video Generation for Telling a Story
Last synced: 27 Mar 2025
https://github.com/ailab-cvc/videogen-eval
VideoGen-Eval: Agent-based System for Video Generation Evaluation
aigc benchmark image-to-video sora-video-ai text-to-video video-evaluation video-generation video-to-video
Last synced: 27 Jan 2026
https://github.com/ailab-cvc/make-your-video
[IEEE TVCG 2024] Customized Video Generation Using Textual and Structural Guidance
Last synced: 04 Jul 2025
https://github.com/AILab-CVC/Make-Your-Video
[IEEE TVCG 2024] Customized Video Generation Using Textual and Structural Guidance
Last synced: 28 Mar 2025
https://github.com/ailab-cvc/groupmixformer
GroupMixAttention and GroupMixFormer
Last synced: 04 Mar 2026
https://github.com/ailab-cvc/m2pt
[CVPR'24] Multimodal Pathway: Improve Transformers with Irrelevant Data from Other Modalities
artificial-intelligence deep-learning multimodal transformers
Last synced: 25 Jul 2025
https://github.com/ailab-cvc/vl-gpt
VL-GPT: A Generative Pre-trained Transformer for Vision and Language Understanding and Generation
Last synced: 28 Jan 2026
https://github.com/ailab-cvc/hifi-123
[ECCV 2024] HiFi-123: Towards High-fidelity One Image to 3D Content Generation
Last synced: 20 Aug 2025
https://github.com/ailab-cvc/ailab-cvc.github.io
Homepage of Tencent AI Lab CVC.
Last synced: 07 Mar 2026