{"id":23952203,"url":"https://github.com/madroidmaq/mlx-omni-server","last_synced_at":"2025-10-09T00:12:05.383Z","repository":{"id":265936438,"uuid":"883682853","full_name":"madroidmaq/mlx-omni-server","owner":"madroidmaq","description":"MLX Omni Server is a local inference server powered by Apple's MLX framework, specifically designed for Apple Silicon (M-series) chips. It implements OpenAI-compatible API endpoints, enabling seamless integration with existing OpenAI SDK clients while leveraging the power of local ML inference.","archived":false,"fork":false,"pushed_at":"2025-09-02T14:50:19.000Z","size":5250,"stargazers_count":540,"open_issues_count":12,"forks_count":48,"subscribers_count":9,"default_branch":"main","last_synced_at":"2025-09-02T16:30:22.732Z","etag":null,"topics":["function-calling","genai","mlx","openai","openai-api","structured-output","stt","tools","tts"],"latest_commit_sha":null,"homepage":"https://deepwiki.com/madroidmaq/mlx-omni-server/1-overview","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/madroidmaq.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2024-11-05T11:52:00.000Z","updated_at":"2025-09-02T14:50:14.000Z","dependencies_parsed_at":"2025-01-01T18:22:26.606Z","dependency_job_id":"91af7cd3-80c9-4845-bc9c-0fa1fb285008","html_url":"https://github.com/madroidmaq/mlx-omni-server","commit_stats":null,"previous_names":["madroidmaq/mlx-omni-server"],"tags_count":21,"template":false,"template_full_name":null,"purl":"pkg:github/madroidmaq/mlx-omni-server","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/madroidmaq%2Fmlx-omni-server","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/madroidmaq%2Fmlx-omni-server/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/madroidmaq%2Fmlx-omni-server/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/madroidmaq%2Fmlx-omni-server/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/madroidmaq","download_url":"https://codeload.github.com/madroidmaq/mlx-omni-server/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/madroidmaq%2Fmlx-omni-server/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":274797638,"owners_count":25351776,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-09-12T02:00:09.324Z","response_time":60,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["function-calling","genai","mlx","openai","openai-api","structured-output","stt","tools","tts"],"created_at":"2025-01-06T13:01:38.868Z","updated_at":"2025-10-09T00:12:00.344Z","avatar_url":"https://github.com/madroidmaq.png","language":"Python","funding_links":[],"categories":["HarmonyOS","Libraries and Tools","Python","CLIs","LLM \u0026 Inference"],"sub_categories":["Windows Manager","2024"],"readme":"\u003cdiv align=\"center\"\u003e\n\n# MLX Omni Server\n\n*Local AI inference server optimized for Apple Silicon*\n\n[![PyPI version](https://img.shields.io/pypi/v/mlx-omni-server.svg)](https://pypi.python.org/pypi/mlx-omni-server)\n[![Python 3.11+](https://img.shields.io/badge/python-3.11+-blue.svg)](https://python.org)\n[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)\n[![Ask DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/madroidmaq/mlx-omni-server)\n\n![MLX Omni Server Banner](docs/banner.png)\n\n**MLX Omni Server** provides dual API compatibility with both **OpenAI** and **Anthropic APIs**, enabling seamless local inference on Apple Silicon using the MLX framework.\n\n[Installation](#-installation) • [Quick Start](#-quick-start) • [Documentation](#-documentation) • [Contributing](#-contributing)\n\n\u003c/div\u003e\n\n## ✨ Features\n\n- 🚀 **Apple Silicon Optimized** - Built on MLX framework for M1/M2/M3/M4 chips\n- 🔌 **Dual API Support** - Compatible with both OpenAI and Anthropic APIs\n- 🎯 **Complete AI Suite** - Chat, audio processing, image generation, embeddings\n- ⚡ **High Performance** - Local inference with hardware acceleration\n- 🔐 **Privacy-First** - All processing happens locally on your machine\n- 🛠 **Drop-in Replacement** - Works with existing OpenAI and Anthropic SDKs\n\n## 🚀 Installation\n\n```bash\npip install mlx-omni-server\n```\n\n## ⚡ Quick Start\n\n1. **Start the server:**\n   ```bash\n   mlx-omni-server\n   ```\n\n2. **Choose your preferred API:**\n\n   \u003cdetails\u003e\n   \u003csummary\u003e\u003cb\u003eOpenAI API\u003c/b\u003e (Click to expand)\u003c/summary\u003e\n\n   ```python\n   from openai import OpenAI\n\n   client = OpenAI(\n       base_url=\"http://localhost:10240/v1\",\n       api_key=\"not-needed\"\n   )\n\n   response = client.chat.completions.create(\n       model=\"mlx-community/gemma-3-1b-it-4bit-DWQ\",\n       messages=[{\"role\": \"user\", \"content\": \"Hello!\"}]\n   )\n   print(response.choices[0].message.content)\n   ```\n   \u003c/details\u003e\n\n   \u003cdetails\u003e\n   \u003csummary\u003e\u003cb\u003eAnthropic API\u003c/b\u003e (Click to expand)\u003c/summary\u003e\n\n   ```python\n   import anthropic\n\n   client = anthropic.Anthropic(\n       base_url=\"http://localhost:10240/anthropic\",\n       api_key=\"not-needed\"\n   )\n\n   message = client.messages.create(\n       model=\"mlx-community/gemma-3-1b-it-4bit-DWQ\",\n       max_tokens=1000,\n       messages=[{\"role\": \"user\", \"content\": \"Hello!\"}]\n   )\n   print(message.content[0].text)\n   ```\n   \u003c/details\u003e\n\n🎉 **That's it!** You're now running AI locally on your Mac.\n\n## 📋 API Support\n\n### OpenAI Compatible Endpoints (`/v1/*`)\n\n| Endpoint | Feature | Status |\n|----------|---------|--------|\n| `/v1/chat/completions` | Chat with tools, streaming, structured output | ✅ |\n| `/v1/audio/speech` | Text-to-Speech | ✅ |\n| `/v1/audio/transcriptions` | Speech-to-Text | ✅ |\n| `/v1/images/generations` | Image Generation | ✅ |\n| `/v1/embeddings` | Text Embeddings | ✅ |\n| `/v1/models` | Model Management | ✅ |\n\n### Anthropic Compatible Endpoints (`/anthropic/v1/*`)\n\n| Endpoint | Feature | Status |\n|----------|---------|--------|\n| `/anthropic/v1/messages` | Messages with tools, streaming, thinking mode | ✅ |\n| `/anthropic/v1/models` | Model listing with pagination | ✅ |\n\n\n## ⚙️ Configuration\n\n```bash\n# Default (port 10240)\nmlx-omni-server\n\n# Custom options\nmlx-omni-server --port 8000\nMLX_OMNI_LOG_LEVEL=debug mlx-omni-server\n\n# View all options\nmlx-omni-server --help\n```\n\n## 🛠 Development\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003eDevelopment Setup\u003c/b\u003e\u003c/summary\u003e\n\n```bash\ngit clone https://github.com/madroidmaq/mlx-omni-server.git\ncd mlx-omni-server\nuv sync\n\n# Start with hot-reload\nuv run uvicorn mlx_omni_server.main:app --reload --host 0.0.0.0 --port 10240\n```\n\n**Testing:**\n```bash\nuv run pytest                    # All tests\nuv run pytest tests/chat/openai/ # OpenAI tests\nuv run pytest tests/chat/anthropic/ # Anthropic tests\n```\n\n**Code Quality:**\n```bash\nuv run black . \u0026\u0026 uv run isort . # Format code\nuv run pre-commit run --all-files # Run hooks\n```\n\u003c/details\u003e\n\n## 🎯 Key Features\n\n**Model Management**\n- Auto-discovery of MLX models in HuggingFace cache\n- On-demand loading and intelligent caching\n- Automatic model downloading when needed\n\n**Advanced Capabilities**\n- Function calling with model-specific parsers\n- Real-time streaming for both APIs\n- JSON schema validation and structured output\n- Extended reasoning (thinking mode) for supported models\n\n## 📚 Documentation\n\n| Resource | Description |\n|----------|-------------|\n| [OpenAI API Guide](docs/openai-api.md) | Complete OpenAI API reference |\n| [Anthropic API Guide](docs/anthropic-api.md) | Complete Anthropic API reference |\n| [Examples](examples/) | Practical usage examples |\n\n## 🔍 Troubleshooting\n\n\u003cdetails\u003e\n\u003csummary\u003e\u003cb\u003eCommon Issues\u003c/b\u003e\u003c/summary\u003e\n\n**Requirements:**\n- Python 3.11+\n- Apple Silicon Mac (M1/M2/M3/M4)\n- MLX framework installed\n\n**Quick fixes:**\n```bash\n# Check requirements\npython --version  # Should be 3.11+\npython -c \"import mlx; print(mlx.__version__)\"\n\n# Pre-download models (if needed)\nhuggingface-cli download mlx-community/gemma-3-1b-it-4bit-DWQ\n\n# Enable debug logging\nMLX_OMNI_LOG_LEVEL=debug mlx-omni-server\n```\n\u003c/details\u003e\n\n## 🤝 Contributing\n\n**Quick contributor setup:**\n```bash\ngit clone https://github.com/madroidmaq/mlx-omni-server.git\ncd mlx-omni-server\nuv sync \u0026\u0026 uv run pytest\n```\n\n\u003cdiv align=\"center\"\u003e\n\n---\n\n## 🙏 Acknowledgments\n\nBuilt with [MLX](https://github.com/ml-explore/mlx) by Apple • [FastAPI](https://fastapi.tiangolo.com/) • [MLX-LM](https://github.com/ml-explore/mlx-lm)\n\n## 📄 License\n\n[MIT License](LICENSE) • Not affiliated with OpenAI, Anthropic, or Apple\n\n## 🌟 Star History\n\n[![Star History Chart](https://api.star-history.com/svg?repos=madroidmaq/mlx-omni-server\u0026type=Date)](https://star-history.com/#madroidmaq/mlx-omni-server\u0026Date)\n\n\u003c/div\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmadroidmaq%2Fmlx-omni-server","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmadroidmaq%2Fmlx-omni-server","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmadroidmaq%2Fmlx-omni-server/lists"}