Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/NVlabs/VILA
VILA - a multi-image visual language model with training, inference and evaluation recipe, deployable from cloud to edge (Jetson Orin and laptops)
- Host: GitHub
- URL: https://github.com/NVlabs/VILA
- Owner: NVlabs
- License: apache-2.0
- Created: 2024-02-23T09:19:16.000Z (10 months ago)
- Default Branch: main
- Last Pushed: 2024-10-24T13:29:43.000Z (about 2 months ago)
- Last Synced: 2024-10-29T15:35:20.672Z (about 1 month ago)
- Language: Python
- Homepage:
- Size: 42.4 MB
- Stars: 1,944
- Watchers: 27
- Forks: 157
- Open Issues: 53
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- ai-game-devtools - VILA - On Pre-training for Visual Language Models. | [arXiv](https://arxiv.org/abs/2312.07533) | Visual | (Visual / Tool (AI LLM))
- StarryDivineSky - NVlabs/VILA - A multi-image visual language model with training, inference, and evaluation recipes, deployable from cloud to edge (Jetson Orin and laptops). VILA is a visual language model (VLM) pretrained on large-scale interleaved image-text data, which enables video understanding and multi-image understanding. VILA can be deployed on the edge via AWQ 4-bit quantization and the TinyChat framework. Key findings: (1) image-text pairs alone are not enough; interleaved image-text data is essential; (2) unfreezing the LLM during interleaved image-text pretraining enables in-context learning; (3) re-mixing text-only instruction data is crucial for improving both VLM and text-only performance; (4) token compression extends the number of video frames. VILA demonstrates appealing capabilities, including video reasoning, in-context learning, visual chain-of-thought, and better world knowledge. (Multimodal large models / Other web services) (a minimal multi-image inference sketch follows this list)
- AiTreasureBox - NVlabs/VILA - ![GitHub stars](https://img.shields.io/github/stars/NVlabs/VILA.svg) VILA - a multi-image visual language model with training, inference and evaluation recipe, deployable from cloud to edge (Jetson Orin and laptops) (Repos)
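The list entries above describe multi-image input and edge deployment via AWQ 4-bit quantization and TinyChat. As a rough illustration only, here is a minimal sketch of multi-image inference through the generic Hugging Face transformers interface; the checkpoint name (`Efficient-Large-Model/VILA1.5-3b`), the `AutoModelForVision2Seq` loader, and the `<image>` prompt format are assumptions and may not match the repository's own inference scripts.

```python
# Hedged sketch of multi-image VLM inference via the generic transformers API.
# NOT the VILA repo's own API: the model id and prompt format below are assumptions.
import torch
from PIL import Image
from transformers import AutoProcessor, AutoModelForVision2Seq

model_id = "Efficient-Large-Model/VILA1.5-3b"  # assumed checkpoint name
processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForVision2Seq.from_pretrained(
    model_id,
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True,
)

# Two images plus an interleaved text prompt, mirroring the interleaved
# image-text setting emphasized in the description above.
images = [Image.open("frame1.jpg"), Image.open("frame2.jpg")]
prompt = "<image>\n<image>\nWhat changed between the two frames?"

inputs = processor(text=prompt, images=images, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=128)
print(processor.batch_decode(output_ids, skip_special_tokens=True)[0])
```

For the quantized edge path (AWQ 4-bit with TinyChat) and the supported training/evaluation workflow, the repository's README.md is the authoritative reference.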