{"id":19988559,"url":"https://github.com/mapluisch/llava-websocket-server","last_synced_at":"2026-04-09T08:55:13.173Z","repository":{"id":209841585,"uuid":"725077570","full_name":"mapluisch/LLaVA-WebSocket-Server","owner":"mapluisch","description":"Python-based WebSocket for CLI LLaVA inference.","archived":false,"fork":false,"pushed_at":"2023-11-30T14:35:18.000Z","size":27,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2024-10-18T21:59:14.672Z","etag":null,"topics":["inference","llama","llama2","llava","llm","llm-inference","python","websocket","websockets"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/mapluisch.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-11-29T11:47:53.000Z","updated_at":"2024-02-01T13:38:51.000Z","dependencies_parsed_at":"2024-08-09T22:10:38.627Z","dependency_job_id":null,"html_url":"https://github.com/mapluisch/LLaVA-WebSocket-Server","commit_stats":{"total_commits":16,"total_committers":1,"mean_commits":16.0,"dds":0.0,"last_synced_commit":"6fc31414cadefc3c8a9739895da1a0643929b16b"},"previous_names":["mapluisch/llava-websocket","mapluisch/llava-websocket-server"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mapluisch%2FLLaVA-WebSocket-Server","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mapluisch%2FLLaVA-WebSocket-Server/tags","releases_url":"https://re
pos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mapluisch%2FLLaVA-WebSocket-Server/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/mapluisch%2FLLaVA-WebSocket-Server/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/mapluisch","download_url":"https://codeload.github.com/mapluisch/LLaVA-WebSocket-Server/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":241426484,"owners_count":19961035,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["inference","llama","llama2","llava","llm","llm-inference","python","websocket","websockets"],"created_at":"2024-11-13T04:43:24.518Z","updated_at":"2026-04-09T08:55:13.104Z","avatar_url":"https://github.com/mapluisch.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# LLaVA-WebSocket\nPython-based WebSocket for CLI LLaVA inference. The WebSocket server receives prompts and images from other clients and returns the CLI-inference results via the socket connection.\n\n## Overview\nThis project is a quick and simple implementation of LLaVA ([Website](https://llava-vl.github.io/), [GitHub](https://github.com/haotian-liu/LLaVA)) CLI-based inference via a Python WebSocket. It is based on LLaVA's own `cli.py`, adding WebSocket capabilities.\n\nWhen running `python llava-websocket.py`, the checkpoint shards are loaded and stay in cache while a WebSocket server is started. 
Clients in your local network can send new prompts and images for inference without having to re-load checkpoint shards and without using Gradio.\n\nThis project, in its current form, is not designed for conversation, but rather for one-time prompt processing, enabling the inference of various images and prompts based on individual requests. I might add conversation capabilities in the near future.\n\nThis project was tested on Ubuntu.\n\n## Setup\nYou should follow the LLaVA tutorial, so that you have the pretrained model / checkpoint shards ready. Then, put my script into your LLaVA directory and start it while in the LLaVA conda-environment (`conda activate llava`).\n\n\n## Usage \n```\npython llava-websocket.py [ARGS]\n```\n\n### Arguments\n\nGiven that this project is based on LLaVA's `cli.py`, the following base arguments can be specified:\n```\n--model-path, default=\"liuhaotian/llava-v1.5-13b\"\n--model-base, default=None\n--device, default=\"cuda\"\n--conv-mode, default=None\n--temperature, default=0.2\n--max-new-tokens, default=512\n--load-8bit, action=\"store_true\"\n--load-4bit, action=\"store_true\"\n--debug, action=\"store_true\"\n```\n\nAdditionally added args:\n```\n--port, default=1995\n--verbose, action=\"store_true\"\n--json, action=\"store_true\"\n```\nUsing `--port [int]`, you can specify your own WebSocket port.\n\nUsing `--verbose`, you will receive verbose output on the server-side console (WebSocket connection info, transmitted inference results).\n\nUsing `--json`, the WebSocket responses will be formatted as JSON, containing a timestamp and the inference result: \n\n```json\n{\n    \"time\": \"11:48:52.632415\",\n    \"result\": \"The image features a wooden pier (...)\"\n}\n```\n\n## WebSocket Communication\nIn your local network, clients can access the WebSocket via `ws://[your-ip]:1995`. 
Specify your own port when calling the Python script via `--port [int]`.\n\nYou can probably also make the WebSocket accessible from outside of your LAN via port forwarding.\n\nThe WebSocket server waits for a JSON object:\n\n```json\n{\n    \"prompt\": \"Here goes your prompt (e.g., describe this image.)\",\n    \"image\": \"URL to an image file, or BASE64-encoded string of an image file\"\n}\n```\n\nand returns the written inference result via the socket connection.\n\n### Demo\nhttps://github.com/mapluisch/LLaVA-WebSocket-Server/assets/31780571/db3639e8-1aa8-4be3-8fb5-3ab9ad0e27d0\n\nThe input image is LLaVA's test image: \n\n\u003cimg src=\"https://llava-vl.github.io/static/images/view.jpg\" alt=\"LLaVA's test image\" style=\"width:50%;\"/\u003e\n(Source: https://llava-vl.github.io/static/images/view.jpg)\n\n## Disclaimer\nThis project is a prototype and serves as a basic example of using a WebSocket for LLaVA's CLI inference. Feel free to create a PR.\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmapluisch%2Fllava-websocket-server","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmapluisch%2Fllava-websocket-server","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmapluisch%2Fllava-websocket-server/lists"}