https://github.com/mbzuai-oryx/groundinglmm
Grounding Large Multimodal Model (GLaMM) is a first-of-its-kind model that generates natural language responses seamlessly integrated with object segmentation masks [CVPR 2024].
- Host: GitHub
- URL: https://github.com/mbzuai-oryx/groundinglmm
- Owner: mbzuai-oryx
- Created: 2023-11-02T16:53:47.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2024-04-03T13:07:02.000Z (over 1 year ago)
- Last Synced: 2024-04-03T14:29:17.490Z (over 1 year ago)
- Topics: foundation-models, llm-agent, lmm, vision-and-language, vision-language-model
- Language: Python
- Homepage: https://grounding-anything.com
- Size: 109 MB
- Stars: 536
- Watchers: 31
- Forks: 25
- Open Issues: 5
Metadata Files:
- Readme: README.md