Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
LLaVA-Interactive-Demo
https://github.com/LLaVA-VL/LLaVA-Interactive-Demo
- Host: GitHub
- URL: https://github.com/LLaVA-VL/LLaVA-Interactive-Demo
- Owner: LLaVA-VL
- License: apache-2.0
- Created: 2023-10-12T04:22:50.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2024-07-25T18:03:44.000Z (4 months ago)
- Last Synced: 2024-08-04T00:11:38.674Z (3 months ago)
- Topics: lmm, multimodal
- Language: Python
- Homepage: https://llava-vl.github.io/llava-interactive/
- Size: 1.79 MB
- Stars: 341
- Watchers: 16
- Forks: 25
- Open Issues: 3
- Metadata Files:
  - Readme: README.md
  - License: LICENSE
Awesome Lists containing this project
README
# 🌋 LLaVA-Interactive
*An All-in-One Demo for Image Chat, Segmentation and Generation/Editing.*
[[Project Page](https://llava-vl.github.io/llava-interactive/)] [Demo] [[Paper](https://arxiv.org/abs/2311.00571)]
> ⚠️ As of June 10, 2024, the live demo (playground) website is disabled.
# Install
Installing this project requires CUDA 11.7 or above. Follow the steps below:
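As a quick preliminary check (not part of the upstream instructions), you can confirm that the machine's NVIDIA driver supports CUDA 11.7 or newer; `nvidia-smi` reports the highest CUDA version the driver supports:

```bash
# The "CUDA Version" field in the header should read 11.7 or higher.
nvidia-smi
```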
```bash
git clone https://github.com/LLaVA-VL/LLaVA-Interactive-Demo.git
conda create -n llava_int -c conda-forge -c pytorch python=3.10.8 pytorch=2.0.1 -y
conda activate llava_int
cd LLaVA-Interactive-Demo
pip install -r requirements.txt
source setup.sh
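# Optional sanity check (not part of the upstream instructions): verify the
# environment has a CUDA-enabled PyTorch build and that a GPU is visible.
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"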
```

# Run the demo
To run the demo, simply run the shell script:
```bash
./run_demo.sh
```
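The script starts the demo's services and a web UI, and prints a local URL to open in your browser. As a rough sanity check you can probe that endpoint; the sketch below assumes the default Gradio port 7860, which may differ on your setup, so use whatever URL the script actually prints:

```bash
# Confirm the web UI is responding (port 7860 is an assumption;
# substitute the URL printed by run_demo.sh).
curl -I http://localhost:7860
```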
# Citation
If you find LLaVA-Interactive useful for your research and applications, please cite using this BibTeX:
```bibtex
@article{chen2023llava_interactive,
author = {Chen, Wei-Ge and Spiridonova, Irina and Yang, Jianwei and Gao, Jianfeng and Li, Chunyuan},
title = {LLaVA-Interactive: An All-in-One Demo for Image Chat, Segmentation, Generation and Editing},
publisher = {arXiv:2311.00571},
year = {2023}
}
```

# Related Projects
- [LLaVA: Large Language and Vision Assistant](https://github.com/haotian-liu/LLaVA)
- [SEEM: Segment Everything Everywhere All at Once](https://github.com/UX-Decoder/Segment-Everything-Everywhere-All-At-Once)
- [GLIGEN: Open-Set Grounded Text-to-Image Generation](https://github.com/gligen/GLIGEN)

# Acknowledgement
- [LaMa](https://github.com/advimman/lama): A nice inpainting tool we use to fill background holes in images.
# Terms of use
By using this service, users are required to agree to the following terms: The service is a research preview intended for non-commercial use only. It only provides limited safety measures and may generate offensive content. It must not be used for any illegal, harmful, violent, racist, or sexual purposes. The service may collect user dialogue data for future research. For an optimal experience, please use desktop computers for this demo, as mobile devices may compromise its quality.
# License
This project, including LLaVA and SEEM, is licensed under the Apache License 2.0. See the [LICENSE](LICENSE) file for more details. The GLIGEN project is licensed under the MIT License.
The service is a research preview intended for non-commercial use only, subject to the model [License](https://github.com/facebookresearch/llama/blob/main/MODEL_CARD.md) of LLaMA, [Terms of Use](https://openai.com/policies/terms-of-use) of the data generated by OpenAI, and [Privacy Practices](https://chrome.google.com/webstore/detail/sharegpt-share-your-chatg/daiacboceoaocpibfodeljbdfacokfjb) of ShareGPT. Please contact us if you find any potential violation.