Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
LLaVA-Interactive-Demo
https://github.com/LLaVA-VL/LLaVA-Interactive-Demo
- Host: GitHub
- URL: https://github.com/LLaVA-VL/LLaVA-Interactive-Demo
- Owner: LLaVA-VL
- License: apache-2.0
- Created: 2023-10-12T04:22:50.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2024-07-25T18:03:44.000Z (4 months ago)
- Last Synced: 2024-08-04T00:11:38.674Z (3 months ago)
- Topics: lmm, multimodal
- Language: Python
- Homepage: https://llava-vl.github.io/llava-interactive/
- Size: 1.79 MB
- Stars: 341
- Watchers: 16
- Forks: 25
- Open Issues: 3
- Metadata Files:
  - Readme: README.md
  - License: LICENSE
Awesome Lists containing this project
README
# 🌋 LLaVA-Interactive
*An All-in-One Demo for Image Chat, Segmentation and Generation/Editing.*
[[Project Page](https://llava-vl.github.io/llava-interactive/)] [Demo] [[Paper](https://arxiv.org/abs/2311.00571)]
> ⚠️ As of June 10, 2024, the live demo (playground) website is disabled.
# Install
Installing this project requires CUDA 11.7 or above. Follow the steps below:
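As a quick preliminary check (not part of the upstream instructions), you can confirm that the machine's NVIDIA driver supports CUDA 11.7 or newer; `nvidia-smi` reports the highest CUDA version the driver supports:

```bash
# The "CUDA Version" field in the header should read 11.7 or higher.
nvidia-smi
```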
```bash
git clone https://github.com/LLaVA-VL/LLaVA-Interactive-Demo.git
conda create -n llava_int -c conda-forge -c pytorch python=3.10.8 pytorch=2.0.1 -y
conda activate llava_int
cd LLaVA-Interactive-Demo
pip install -r requirements.txt
source setup.sh
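# Optional sanity check (not part of the upstream instructions): verify the
# environment has a CUDA-enabled PyTorch build and that a GPU is visible.
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"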
```

# Run the demo
To run the demo, simply run the shell script:
```bash
./run_demo.sh
```
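The script starts the demo's services and a web UI, and prints a local URL to open in your browser. As a rough sanity check you can probe that endpoint; the sketch below assumes the default Gradio port 7860, which may differ on your setup, so use whatever URL the script actually prints:

```bash
# Confirm the web UI is responding (port 7860 is an assumption;
# substitute the URL printed by run_demo.sh).
curl -I http://localhost:7860
```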
# Citation
If you find LLaVA-Interactive useful for your research and applications, please cite using this BibTeX:
```bibtex
@article{chen2023llava_interactive,
author = {Chen, Wei-Ge and Spiridonova, Irina and Yang, Jianwei and Gao, Jianfeng and Li, Chunyuan},
title = {LLaVA-Interactive: An All-in-One Demo for Image Chat, Segmentation, Generation and Editing},
publisher = {arXiv:2311.00571},
year = {2023}
}
```

# Related Projects
- [LLaVA: Large Language and Vision Assistant](https://github.com/haotian-liu/LLaVA)
- [SEEM: Segment Everything Everywhere All at Once](https://github.com/UX-Decoder/Segment-Everything-Everywhere-All-At-Once)
- [GLIGEN: Open-Set Grounded Text-to-Image Generation](https://github.com/gligen/GLIGEN)

# Acknowledgement
- [LaMa](https://github.com/advimman/lama): A nice inpainting tool we use to fill background holes in images.
# Terms of use
By using this service, users are required to agree to the following terms: The service is a research preview intended for non-commercial use only. It only provides limited safety measures and may generate offensive content. It must not be used for any illegal, harmful, violent, racist, or sexual purposes. The service may collect user dialogue data for future research. For an optimal experience, please use desktop computers for this demo, as mobile devices may compromise its quality.
# License
This project, including LLaVA and SEEM, is licensed under the Apache License 2.0. See the [LICENSE](LICENSE) file for more details. The GLIGEN project is licensed under the MIT License.
The service is a research preview intended for non-commercial use only, subject to the model [License](https://github.com/facebookresearch/llama/blob/main/MODEL_CARD.md) of LLaMA, [Terms of Use](https://openai.com/policies/terms-of-use) of the data generated by OpenAI, and [Privacy Practices](https://chrome.google.com/webstore/detail/sharegpt-share-your-chatg/daiacboceoaocpibfodeljbdfacokfjb) of ShareGPT. Please contact us if you find any potential violation.