https://github.com/arechen/mlx_inference_openai
MLX inference service compatible with the OpenAI API, built on MLX-LM and MLX-VLM.
- Host: GitHub
- URL: https://github.com/arechen/mlx_inference_openai
- Owner: AreChen
- Created: 2025-04-22T03:30:38.000Z (17 days ago)
- Default Branch: master
- Last Pushed: 2025-04-28T07:33:47.000Z (10 days ago)
- Last Synced: 2025-04-28T08:24:44.097Z (10 days ago)
- Topics: mlx, mlx-lm, mlx-vlm, openai, openai-api, self-hosted
- Language: Python
- Size: 17.6 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md

# MLX INFERENCE

[License](LICENSE) | [Chinese README](README_ZH.md)

### Project Introduction
MLX INFERENCE is an OpenAI API compatible inference service based on MLX-LM and MLX-VLM, providing the following endpoints:
- `/v1/chat/completions` - Chat completion interface
- `/v1/responses` - Response interface
- `/v1/models` - List available models

### Installation
```bash
# Install Python dependencies
pip install -r requirements.txt
# Copy the environment template
cp .env.example .env
```
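For a completely fresh machine, the steps above fit into a setup flow like the sketch below. The clone URL is the repository above; the virtual-environment steps are a common convention, not something this README prescribes:

```bash
# Grab the code and enter the project directory
git clone https://github.com/arechen/mlx_inference_openai.git
cd mlx_inference_openai

# Optional: keep dependencies isolated in a virtual environment
python3 -m venv .venv
source .venv/bin/activate

# Install dependencies and create a local config from the template
pip install -r requirements.txt
cp .env.example .env
```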
### Start Service

Execute in the project root directory:
```bash
# Start the API server on port 8002 with a single worker
uvicorn mlx_Inference:app --workers 1 --port 8002
```

Parameters:
- `--workers`: Number of worker processes
- `--port`: Service port number
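Once the service is running, the endpoints can be exercised with plain `curl`. A quick sketch, assuming the default host and the port used above; the model name is only an illustrative placeholder, so substitute one returned by `/v1/models`:

```bash
# List the models the service exposes
curl http://localhost:8002/v1/models

# Send a chat completion request (model name is a placeholder)
curl http://localhost:8002/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mlx-community/Qwen2.5-7B-Instruct-4bit",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```

Because the endpoints follow the OpenAI API specification, existing OpenAI-compatible clients should also work by pointing their base URL at `http://localhost:8002/v1`.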
### Features

- Compatible with the OpenAI API specification
- Backend inference uses [MLX-LM](https://github.com/ml-explore/mlx-lm) and [MLX-VLM](https://github.com/Blaizzy/mlx-vlm), and supports [mlx-community](https://huggingface.co/mlx-community) models
- Easy to deploy and use