https://github.com/raspoli/mlx-serve
Local inference server for Apple Silicon that hot-swaps MLX models (LLM, vision, embeddings, TTS, STT) behind an OpenAI-compatible API
- Host: GitHub
- URL: https://github.com/raspoli/mlx-serve
- Owner: raspoli
- License: MIT
- Created: 2026-03-31T13:53:26.000Z (3 months ago)
- Default Branch: main
- Last Pushed: 2026-03-31T14:22:40.000Z (3 months ago)
- Last Synced: 2026-03-31T15:28:30.211Z (3 months ago)
- Topics: apple-silicon, embeddings, fastapi, inference-server, llm, local-inference, local-llm, machine-learning, macos, mlx, mlx-lm, model-serving, openai-api, openai-compatible, python, speech-to-text, text-to-speech, unified-memory, vision-language-model
- Language: Python
- Size: 282 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0