An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with visual-language-models

A curated list of projects in awesome lists tagged with visual-language-models.

https://github.com/thudm/cogvlm

a state-of-the-art-level open visual language model | multimodal pretrained model

cross-modality language-model multi-modal pretrained-models visual-language-models

Last synced: 14 May 2025

https://github.com/THUDM/CogVLM

a state-of-the-art-level open visual language model | multimodal pretrained model

cross-modality language-model multi-modal pretrained-models visual-language-models

Last synced: 28 Mar 2025

https://github.com/camel-ai/crab

🦀️ CRAB: Cross-environment Agent Benchmark for Multimodal Language Model Agents. https://crab.camel-ai.org/

gui-automation language-model-agent large-language-models multi-agent-systems visual-language-models

Last synced: 15 May 2025

https://github.com/tianyu-z/vcr

Official repo for the paper "VCR: Visual Caption Restoration". See arxiv.org/pdf/2406.06462 for details.

benchmark deep-learning visual-language-models

Last synced: 07 Oct 2025

https://github.com/amathislab/wildclip

Scene and animal attribute retrieval from camera trap data with domain-adapted vision-language models

behavior camera-trap clip computer-vision computervision visual-language-models

Last synced: 03 Feb 2026

https://github.com/declare-lab/sealing

[NAACL 2024] Official Implementation of paper "Self-Adaptive Sampling for Efficient Video Question Answering on Image-Text Models"

multimodality naacl2024 video-question-answering video-understanding visual-language-models

Last synced: 14 Apr 2025

https://github.com/shreydan/vlm-od

experimental: finetune smolVLM on COCO (without any special <locXYZ> tokens)

computer-vision deep-learning llm object-detection transformers visual-language-models vlm

Last synced: 28 Jun 2025

https://github.com/legalaspro/modern_ai_foundations

A collection of implementations exploring modern AI architectures and foundational models.

cvae diffusion-models flowmatching vae vae-pytorch vision-transformer visual-language-models vlms

Last synced: 23 Jun 2025