https://github.com/opokiu/modal-vllm-server
Run LLMs quickly and efficiently with the modal-vllm-server, which optimizes GPU settings for each model. This project simplifies deployment on Modal's serverless infrastructure, making it easy to manage and serve your models. 🐙✨
- Host: GitHub
- URL: https://github.com/opokiu/modal-vllm-server
- Owner: opokiu
- License: MIT
- Created: 2025-06-14T19:50:49.000Z (4 months ago)
- Default Branch: main
- Last Pushed: 2025-06-15T00:05:47.000Z (4 months ago)
- Last Synced: 2025-06-15T01:19:16.277Z (4 months ago)
- Topics: api, fastapi, first-repository, llama, llm, modal, modal-labs, openai, python, pytorch, transformer, vllm
- Language: Python
- Size: 39.1 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
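
The repository's description points at a common deployment pattern: running vLLM's OpenAI-compatible server inside a Modal serverless function. Below is a minimal sketch of that pattern using Modal's public API. It is not taken from this repository's code; the app name, model ID, GPU type, and timeouts are illustrative assumptions.

```python
import subprocess

import modal

# Illustrative placeholders (assumptions, not values from the repo).
MODEL_ID = "Qwen/Qwen2.5-0.5B-Instruct"
PORT = 8000

# Container image with vLLM installed.
image = modal.Image.debian_slim(python_version="3.12").pip_install("vllm")

app = modal.App("modal-vllm-server-sketch", image=image)


@app.function(gpu="A10G", timeout=60 * 20)
@modal.web_server(PORT, startup_timeout=60 * 5)
def serve() -> None:
    # Launch vLLM's OpenAI-compatible API server in the container;
    # Modal proxies PORT as a public HTTPS endpoint.
    subprocess.Popen(
        [
            "python", "-m", "vllm.entrypoints.openai.api_server",
            "--model", MODEL_ID,
            "--host", "0.0.0.0",
            "--port", str(PORT),
        ]
    )
```

Deployed with `modal deploy`, the printed URL exposes the usual OpenAI-style routes (e.g. `/v1/chat/completions`), so any OpenAI-compatible client can be pointed at it.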