An open API service indexing awesome lists of open source software.

Projects in Awesome Lists by AILab-CVC

A curated list of projects in awesome lists by AILab-CVC.

https://github.com/ailab-cvc/yolo-world

[CVPR 2024] Real-Time Open-Vocabulary Object Detection

Last synced: 12 May 2025

https://github.com/ailab-cvc/videocrafter

VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models

image-to-video text-to-video video-generation

Last synced: 14 May 2025

https://github.com/AILab-CVC/YOLO-World

[CVPR 2024] Real-Time Open-Vocabulary Object Detection

Last synced: 20 Mar 2025

https://github.com/AILab-CVC/VideoCrafter

VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models

image-to-video text-to-video video-generation

Last synced: 28 Mar 2025

https://ailab-cvc.github.io/videocrafter/

VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models

image-to-video text-to-video video-generation

Last synced: 28 Mar 2025

https://github.com/ailab-cvc/unireplknet

[CVPR'24] UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio, Video, Point Cloud, Time-Series and Image Recognition

architecture artificial-intelligence convolutional-neural-networks deep-learning multimodal-learning

Last synced: 15 May 2025

https://github.com/AILab-CVC/UniRepLKNet

[CVPR'24] UniRepLKNet: A Universal Perception Large-Kernel ConvNet for Audio, Video, Point Cloud, Time-Series and Image Recognition

architecture artificial-intelligence convolutional-neural-networks deep-learning multimodal-learning

Last synced: 20 Mar 2025

https://github.com/StevenGrove/GPT4Tools

GPT4Tools is an intelligent system that can automatically decide, control, and utilize different visual foundation models, allowing the user to interact with images during a conversation.

Last synced: 21 Apr 2025

https://github.com/AILab-CVC/GPT4Tools

GPT4Tools is an intelligent system that can automatically decide, control, and utilize different visual foundation models, allowing the user to interact with images during a conversation.

Last synced: 19 Mar 2025

https://github.com/ailab-cvc/gpt4tools

GPT4Tools is an intelligent system that can automatically decide, control, and utilize different visual foundation models, allowing the user to interact with images during a conversation.

Last synced: 04 Apr 2025

https://github.com/ailab-cvc/seed

Official implementation of SEED-LLaMA (ICLR 2024).

foundation-model multimodal vision-language

Last synced: 09 Apr 2025

https://github.com/ailab-cvc/seed-x

Multimodal Models in Real World

Last synced: 15 May 2025

https://github.com/AILab-CVC/SEED-X

Multimodal Models in Real World

Last synced: 25 Apr 2025

https://github.com/ailab-cvc/seed-bench

(CVPR 2024) A benchmark for evaluating Multimodal LLMs using multiple-choice questions.

Last synced: 16 May 2025

https://github.com/ailab-cvc/freenoise

[ICLR 2024] Code for FreeNoise based on VideoCrafter

aigc diffusion generative-model video-diffusion-model

Last synced: 06 Apr 2025

https://github.com/AILab-CVC/FreeNoise

[ICLR 2024] Code for FreeNoise based on VideoCrafter

aigc diffusion generative-model video-diffusion-model

Last synced: 28 Mar 2025

https://github.com/AILab-CVC/SEED-Bench

(CVPR 2024) A benchmark for evaluating Multimodal LLMs using multiple-choice questions.

Last synced: 24 Jul 2025

https://github.com/AILab-CVC/CV-VAE

[NeurIPS 2024] CV-VAE: A Compatible Video VAE for Latent Generative Video Models

Last synced: 28 Mar 2025

https://github.com/ailab-cvc/cv-vae

[NeurIPS 2024] CV-VAE: A Compatible Video VAE for Latent Generative Video Models

Last synced: 12 Apr 2025

https://github.com/ailab-cvc/talecrafter

[SIGGRAPH Asia 2023] An interactive story visualization tool that supports multiple characters

siggprah-asia siggraph-asia-2023 storycreation storytelling

Last synced: 27 Jan 2026

https://ailab-cvc.github.io/TaleCrafter/

[SIGGRAPH Asia 2023] An interactive story visualization tool that supports multiple characters

siggprah-asia siggraph-asia-2023 storycreation storytelling

Last synced: 27 Mar 2025

https://github.com/AILab-CVC/TaleCrafter

[SIGGRAPH Asia 2023] An interactive story visualization tool that supports multiple characters

siggprah-asia siggraph-asia-2023 storycreation storytelling

Last synced: 27 Mar 2025

https://github.com/ailab-cvc/animate-a-story

Retrieval-Augmented Video Generation for Telling a Story

Last synced: 27 Jan 2026

https://ailab-cvc.github.io/Animate-A-Story/

Retrieval-Augmented Video Generation for Telling a Story

Last synced: 27 Mar 2025

https://github.com/AILab-CVC/Animate-A-Story

Retrieval-Augmented Video Generation for Telling a Story

Last synced: 27 Mar 2025

https://github.com/ailab-cvc/videogen-eval

VideoGen-Eval: Agent-based System for Video Generation Evaluation

aigc benchmark image-to-video sora-video-ai text-to-video video-evaluation video-generation video-to-video

Last synced: 27 Jan 2026

https://github.com/ailab-cvc/make-your-video

[IEEE TVCG 2024] Customized Video Generation Using Textual and Structural Guidance

Last synced: 04 Jul 2025

https://github.com/AILab-CVC/Make-Your-Video

[IEEE TVCG 2024] Customized Video Generation Using Textual and Structural Guidance

Last synced: 28 Mar 2025

https://github.com/ailab-cvc/groupmixformer

GroupMixAttention and GroupMixFormer

Last synced: 04 Mar 2026

https://github.com/ailab-cvc/m2pt

[CVPR'24] Multimodal Pathway: Improve Transformers with Irrelevant Data from Other Modalities

artificial-intelligence deep-learning multimodal transformers

Last synced: 25 Jul 2025

https://github.com/ailab-cvc/vl-gpt

VL-GPT: A Generative Pre-trained Transformer for Vision and Language Understanding and Generation

Last synced: 28 Jan 2026

https://github.com/ailab-cvc/hifi-123

[ECCV 2024] HiFi-123: Towards High-fidelity One Image to 3D Content Generation

Last synced: 20 Aug 2025

https://github.com/ailab-cvc/ailab-cvc.github.io

Homepage of Tencent AI Lab CVC.

Last synced: 07 Mar 2026