https://github.com/nullxjx/vllm-docker-compose
docker-compose for vllm, support sticky-sessions using traefik to enable the prefix-caching feature of vllm
Last synced: 2 months ago
- Host: GitHub
- URL: https://github.com/nullxjx/vllm-docker-compose
- Owner: nullxjx
- Created: 2024-11-13T08:40:59.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-12-21T09:02:34.000Z (over 1 year ago)
- Last Synced: 2024-12-21T10:18:14.655Z (over 1 year ago)
- Topics: docker-compose, traefik, vllm
- Language: Shell
- Homepage:
- Size: 7.81 KB
- Stars: 0
- Watchers: 1
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# vllm-docker-compose
A docker-compose setup for [vllm](https://github.com/vllm-project/vllm/). It uses traefik [sticky sessions](https://doc.traefik.io/traefik/routing/services/#sticky-sessions) to route requests from the same client to the same vllm instance, so that the [prefix-caching](https://docs.vllm.ai/en/v0.5.5/automatic_prefix_caching/apc.html) feature of vllm can take effect.
## Usage
Start the services:
```bash
docker-compose up -d
```
Stop the services:
```bash
docker-compose down
```
Test sticky sessions:
```bash
bash request.sh
```
As long as you keep the cookie, every request is routed to the same vllm instance.
If you delete the cookie, or send requests without cookies, requests are load balanced across the different instances as usual.
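
The cookie behaviour can also be checked by hand with curl. The following is a minimal sketch, not the contents of `request.sh`: the `BASE_URL` default, the `/v1/models` path, the cookie jar location, and all function names are assumptions for illustration and should be adjusted to match the traefik entrypoint in your compose file.

```bash
#!/usr/bin/env bash
# Sketch only: BASE_URL, the /v1/models path, and the cookie jar location
# are assumptions, not values taken from this repository.
BASE_URL="${BASE_URL:-http://localhost:8000}"
COOKIE_JAR="${COOKIE_JAR:-/tmp/vllm-sticky-cookies.txt}"

# Send one request, reading cookies from the jar and saving new ones to it.
# Keeping the jar between calls is what makes the session sticky.
request_with_cookie() {
  curl -s -o /dev/null -b "$COOKIE_JAR" -c "$COOKIE_JAR" "$BASE_URL/v1/models"
}

# Send one request without any cookie, so the load balancer is free to
# pick any instance.
request_without_cookie() {
  curl -s -o /dev/null "$BASE_URL/v1/models"
}

# Print every cookie stored in a curl (Netscape-format) cookie jar as
# name=value. Data lines have exactly 7 tab-separated fields; comment
# lines do not, so they are skipped.
list_cookies() {
  awk -F'\t' 'NF == 7 { print $6 "=" $7 }' "$1"
}
```

Calling `request_with_cookie` several times and then `list_cookies "$COOKIE_JAR"` should show the cookie set by traefik, with a value that stays the same across calls. Deleting the jar file between calls removes the cookie, so the next request can be assigned to a different instance.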