https://github.com/fmxexpress/ai-vision-chat
Chat with large language models about the contents of an image via this native desktop client for Windows, macOS, and Linux.
ai delphi delphi-sample desktop-app linux llava llm macos replicate-api vicuna vision-language-model windows
- Host: GitHub
- URL: https://github.com/fmxexpress/ai-vision-chat
- Owner: FMXExpress
- License: MIT
- Created: 2023-10-16T03:36:19.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2023-10-20T04:32:12.000Z (over 1 year ago)
- Last Synced: 2025-04-13T17:16:13.481Z (about 1 month ago)
- Topics: ai, delphi, delphi-sample, desktop-app, linux, llava, llm, macos, replicate-api, vicuna, vision-language-model, windows
- Language: Pascal
- Homepage: https://www.fmxexpress.com/
- Size: 3.48 MB
- Stars: 20
- Watchers: 3
- Forks: 4
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# AI Vision Chat
Chat with large language models about the contents of an image via this native desktop client for Windows, macOS, and Linux. It features llava-13b, which pairs a vision encoder with Vicuna so you can discuss the contents of an image with the large language model (LLM). It also includes an API X-Ray tab so you can easily see how the interaction with the API server works.
The AI Vision Chat Desktop client is a powerful UI that lets you discuss the contents of an image with a chatbot.
Visual Language Models supported:
* LLaVA-13b
* fuyu-8b
* minigpt-4
* instructblip-vicuna13b
* mplug-owl
* blip-2

Built with Delphi using the FireMonkey [cross-platform development](https://www.embarcadero.com/products/delphi/) framework, this client works on Windows, macOS, and Linux (and possibly Android and iOS) with a single codebase and a single UI. At the moment it is specifically set up for Windows.
It also features a REST integration with Replicate.com, which hosts the model used by the client. You need to sign up for a Replicate API key to access that functionality. Replicate models can be run in the cloud or locally via Docker.
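For reference, the same kind of request can be issued from Delphi with the stock `REST.Client` components. The console program below is only a minimal sketch (it is not the client's actual code); it targets the locally hosted model server described next, uses the placeholder image URL from the curl example, and can take the optional `top_p`, `temperature`, and `max_tokens` fields in the same JSON body.
```
program LocalLlavaRequest;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, REST.Client, REST.Types;

var
  Client: TRESTClient;
  Request: TRESTRequest;
  Response: TRESTResponse;
begin
  // Point the client at the llava-13b container started with the Docker command below.
  Client := TRESTClient.Create('http://localhost:5000');
  Request := TRESTRequest.Create(nil);
  Response := TRESTResponse.Create(nil);
  try
    Request.Client := Client;
    Request.Response := Response;
    Request.Resource := 'predictions';  // same endpoint as the curl call below
    Request.Method := rmPOST;
    // Same JSON shape as the curl call; a real app would build this with System.JSON
    // and could add the optional top_p/temperature/max_tokens fields as well.
    Request.AddBody('{"input": {"image": "https://url/to/file", "prompt": "What is in this picture?"}}',
      ctAPPLICATION_JSON);
    Request.Execute;
    Writeln(Response.Content); // raw prediction JSON returned by the model server
  finally
    Response.Free;
    Request.Free;
    Client.Free;
  end;
end.
```
The commands below pull and serve the llava-13b model locally with Docker and send an equivalent request with curl: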
```
docker run -d -p 5000:5000 --gpus=all r8.im/yorickvp/llava-13b@sha256:6bc1c7bb0d2a34e413301fee8f7cc728d2d4e75bfab186aa995f63292bda92fc
curl http://localhost:5000/predictions -X POST \
-H "Content-Type: application/json" \
-d '{"input": {
"image": "https://url/to/file",
"prompt": "...",
"top_p": "...",
"temperature": "...",
"max_tokens": "..."
}}'
```

# AI Vision Chat Desktop client Screenshot on Windows
Other Delphi AI clients:
* [SDXL Inpainting](https://github.com/FMXExpress/SDXL-Inpainting)
* [Stable Diffusion Desktop Client](https://github.com/FMXExpress/Stable-Diffusion-Desktop-Client)
* [CodeDroidAI](https://github.com/FMXExpress/CodeDroidAI)
* [ControlNet Sketch To Image](https://github.com/FMXExpress/ControlNet-Sketch-To-Image)
* [AutoBlogAI](https://github.com/FMXExpress/AutoBlogAI)
* [AI Code Translator](https://github.com/FMXExpress/AI-Code-Translator)
* [AI Playground](https://github.com/FMXExpress/AI-Playground-DesktopClient)
* [Song Writer AI](https://github.com/FMXExpress/Song-Writer-AI)
* [Stable Diffusion Text To Image Prompts](https://github.com/FMXExpress/Stable-Diffusion-Text-To-Image-Prompts)
* [Generative AI Prompts](https://github.com/FMXExpress/Generative-AI-Prompts)
* [Dreambooth Desktop Client](https://github.com/FMXExpress/DreamBooth-Desktop-Client)
* [Text To Vector Desktop Client](https://github.com/FMXExpress/Text-To-Vector-Desktop-Client)