https://github.com/monatis/lmm.cpp
Inference of Large Multimodal Models in C/C++. LLaVA and others
- Host: GitHub
- URL: https://github.com/monatis/lmm.cpp
- Owner: monatis
- License: MIT
- Created: 2023-07-19T08:16:20.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2023-07-19T08:22:25.000Z (over 2 years ago)
- Last Synced: 2023-07-19T09:34:42.103Z (over 2 years ago)
- Language: C
- Size: 84 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# lmm.cpp
Inference of Large Multimodal Models in C/C++
## Warning
This is still a work in progress and not yet ready for use.
## Description
This repo implements LLaVA inference in C/C++ on top of
[clip.cpp](https://github.com/monatis/clip.cpp)
and [llama.cpp](https://github.com/ggerganov/llama.cpp).
Eventually it will support inference of other large multimodal models, but LLaVA was chosen as the starting point.
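
At its core, LLaVA attaches a frozen CLIP vision encoder to LLaMA through a learned linear projector: the image is encoded into patch embeddings (here, by clip.cpp), those embeddings are mapped into the LLaMA token-embedding space, and the resulting "image tokens" are prepended to the prompt embeddings before the usual llama.cpp decoding loop runs. The snippet below is a minimal, self-contained sketch of just that projection step; the dimensions and the plain-float matrix multiply are illustrative placeholders, not the actual ggml-based implementation.

```cpp
// Sketch of the LLaVA multimodal projector: one linear layer mapping CLIP
// image-patch embeddings into the LLaMA embedding space. Illustrative only.
#include <cstddef>
#include <vector>

// Projects n_patches x d_clip image features into n_patches x d_llm features.
// W is row-major with shape [d_llm, d_clip]; b has length d_llm.
static std::vector<float> project_image_features(const std::vector<float> &patches,
                                                 const std::vector<float> &W,
                                                 const std::vector<float> &b,
                                                 size_t n_patches, size_t d_clip, size_t d_llm) {
    std::vector<float> out(n_patches * d_llm, 0.0f);
    for (size_t p = 0; p < n_patches; ++p) {
        for (size_t o = 0; o < d_llm; ++o) {
            float acc = b[o];
            for (size_t i = 0; i < d_clip; ++i) {
                acc += W[o * d_clip + i] * patches[p * d_clip + i];
            }
            out[p * d_llm + o] = acc;
        }
    }
    return out;
}

int main() {
    const size_t n_patches = 256;  // CLIP ViT-L/14 patch tokens at 224x224 (illustrative)
    const size_t d_clip    = 1024; // CLIP vision hidden size (illustrative)
    const size_t d_llm     = 4096; // LLaMA-7B embedding size (illustrative)

    std::vector<float> patches(n_patches * d_clip, 0.01f); // stand-in for the vision encoder output
    std::vector<float> W(d_llm * d_clip, 0.001f);          // projector weights from the LLaVA checkpoint
    std::vector<float> b(d_llm, 0.0f);

    // These n_patches x d_llm rows would be prepended to the prompt's token
    // embeddings before decoding with LLaMA.
    std::vector<float> image_tokens = project_image_features(patches, W, b, n_patches, d_clip, d_llm);
    return image_tokens.empty() ? 1 : 0;
}
```

In the real pipeline the projector weights come from the LLaVA checkpoint and the multiplication would be done with ggml tensors so it can share a compute graph with clip.cpp and llama.cpp.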
## Roadmap
- [ ] Get rid of the text model and other unnecessary artifacts in `clip.cpp`
- [ ] Write the conversion script for `LLaVA`. Initially, it should be a two-file format: one file for the visual encoder and the other for LLaMA.
- [ ] Come up with a way to support a single-file format that includes the CLIP backbone, the multimodal projector, and the LLaMA weights together (see the sketch after this list).
- [ ] Support other models such as `instructblip`.
- [ ] Support other models such as `instructblip`.
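
On the packaging side, one simple way to think about the single-file idea is a small header followed by length-prefixed sections, one per component. The sketch below writes such a container; the magic value, section names, and overall layout are made up for illustration and do not describe a format this project has defined.

```cpp
// Hypothetical single-file container: a magic/version header followed by
// length-prefixed sections for the CLIP vision backbone, the multimodal
// projector, and the LLaMA weights. Illustration only, not a defined format.
#include <cstdint>
#include <cstdio>
#include <string>
#include <vector>

struct lmm_section {
    std::string name;          // e.g. "clip.vision", "mm.projector", "llama"
    std::vector<uint8_t> blob; // serialized tensors for that component
};

static bool write_lmm_file(const char *path, const std::vector<lmm_section> &sections) {
    FILE *f = std::fopen(path, "wb");
    if (!f) return false;

    const uint32_t magic   = 0x6C6D6D31; // "lmm1", made up for this sketch
    const uint32_t version = 1;
    const uint32_t n_sec   = (uint32_t) sections.size();
    std::fwrite(&magic,   sizeof(magic),   1, f);
    std::fwrite(&version, sizeof(version), 1, f);
    std::fwrite(&n_sec,   sizeof(n_sec),   1, f);

    for (const auto &s : sections) {
        const uint32_t name_len = (uint32_t) s.name.size();
        const uint64_t blob_len = (uint64_t) s.blob.size();
        std::fwrite(&name_len, sizeof(name_len), 1, f);
        std::fwrite(s.name.data(), 1, name_len, f);
        std::fwrite(&blob_len, sizeof(blob_len), 1, f);
        std::fwrite(s.blob.data(), 1, s.blob.size(), f);
    }
    return std::fclose(f) == 0;
}

int main() {
    // Dummy payloads; in practice these would come from the conversion script.
    std::vector<lmm_section> sections = {
        {"clip.vision",  std::vector<uint8_t>(16, 0)},
        {"mm.projector", std::vector<uint8_t>(16, 0)},
        {"llama",        std::vector<uint8_t>(16, 0)},
    };
    return write_lmm_file("llava-single-file.bin", sections) ? 0 : 1;
}
```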