{"id":30130604,"url":"https://github.com/scouzi1966/maclocal-api","last_synced_at":"2026-04-08T02:03:57.544Z","repository":{"id":308952896,"uuid":"1034175804","full_name":"scouzi1966/maclocal-api","owner":"scouzi1966","description":"'afm' command cli: macOS server and single prompt mode that exposes Apple's Foundation and MLX Models and other APIs running on your Mac through a single aggregated OpenAI-compatible API endpoint. Supports Apple Vision and single command (non-server) inference with piping as well . Now with Web Browser and  local AI API aggregator","archived":false,"fork":false,"pushed_at":"2026-03-01T16:19:34.000Z","size":55437,"stargazers_count":154,"open_issues_count":2,"forks_count":8,"subscribers_count":3,"default_branch":"main","last_synced_at":"2026-03-01T18:57:47.030Z","etag":null,"topics":["ai","apple-foundation-models","apple-intelligence","apple-llm","apple-llm-integration","apple-silicon","finetuning-llms","local-llm","localai","lora","macos-app","macos-swift","mlx","mlx-swift","openai-api","openclaw","opencode","opencode-ai","oss"],"latest_commit_sha":null,"homepage":"","language":"Swift","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/scouzi1966.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-08-08T01:20:14.000Z","updated_at":"2026-03-01T16:20:26.000Z","dependencies_parsed_at":null,"dependency_job_id":"b78b5913-64ec-4984-9740-820083738def","html_url":"https://github.com/scouzi1966/maclocal-api","commit_
stats":null,"previous_names":["scouzi1966/maclocal-api"],"tags_count":28,"template":false,"template_full_name":null,"purl":"pkg:github/scouzi1966/maclocal-api","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/scouzi1966%2Fmaclocal-api","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/scouzi1966%2Fmaclocal-api/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/scouzi1966%2Fmaclocal-api/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/scouzi1966%2Fmaclocal-api/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/scouzi1966","download_url":"https://codeload.github.com/scouzi1966/maclocal-api/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/scouzi1966%2Fmaclocal-api/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30107646,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-05T01:39:18.192Z","status":"online","status_checked_at":"2026-03-05T02:00:06.710Z","response_time":93,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai","apple-foundation-models","apple-intelligence","apple-llm","apple-llm-integration","apple-silicon","finetuning-llms","local-llm","localai","lora","macos-app","macos-swift","mlx","mlx-swift","openai-api","openclaw","opencode","opencode-ai","oss"],"created_at":"2025-08-10T18:03:4
1.612Z","updated_at":"2026-04-08T02:03:57.517Z","avatar_url":"https://github.com/scouzi1966.png","language":"Swift","funding_links":[],"categories":["LLM \u0026 Inference"],"sub_categories":[],"readme":"If you find this useful, please ⭐ the repo! \u0026nbsp; Also check out [Vesta AI Explorer](https://kruks.ai/)! — my full-featured native macOS AI app.\n\n\u003e [!NOTE]\n\u003e\n\u003e 31 Mar, 2026. AFM was pinned to an older version of https://github.com/huggingface/swift-huggingface. I have now pinned to the latest which uses hub for model cache. The older version downloaded models to the ~/Documents/Huggingface folder which was causing some pain with iCloud sync. They are now stored under ~/.cache which is not in iCloud scope. the TLDR is that models will be re-downloaded again. You can manually delete the older models located in ~/Documents/Huggingface to regain some valuable space available (spring cleaning!). Please report any issues.\n\u003e \n\u003e **Attention M-series Mac AI enthusiasts!** You don't need to be a Swift developer to explore. Vibe coding really allows anyone to participate in this project. A lot of the hype is real! It does work.\n\u003e\n\u003e [Fork this repo](https://github.com/scouzi1966/maclocal-api/fork) first, then clone your fork to submit PRs:\n\u003e\n\u003e ```bash\n\u003e git clone https://github.com/\u003cyour-username\u003e/maclocal-api.git   \n\u003e cd maclocal-api\n\u003e claude\n\u003e /build-afm\n\u003e ```\n\u003e\n\u003e To just experiment locally\n\u003e \n\u003e ```bash\n\u003e git clone https://github.com/scouzi1966/maclocal-api.git   \n\u003e cd maclocal-api\n\u003e claude\n\u003e /build-afm\n\u003e ```\n\u003e\n\u003e /build-afm is an AI skill that builds for the first time so that you can start coding\n\u003e\n\u003e Start vibe coding! I will add support for skills with more coding agents in the future.\n\n# afm — Run Any MLX LLM on Your Mac, 100% Local\n\nExtensive testing of Qwen3.5-35B-A3B with afm. 
Uses an experimental technique with Claude and Codex as judges for evaluation scoring. Click the link below to view test results.\n\n### [afm-next Nightly Test Report — Qwen3.5-35B-A3B Focus](https://kruks.ai/macafm/)\n\nRun open-source MLX models **or** Apple's on-device Foundation Model through an OpenAI-compatible API. Built entirely in Swift for maximum Metal GPU performance. No Python runtime, no cloud, no API keys.\n\n## Install\n\n|  | Stable (v0.9.9) | Nightly (afm-next) |\n|---|---|---|\n| **Homebrew** | `brew install scouzi1966/afm/afm` | `brew install scouzi1966/afm/afm-next` |\n| **pip** | `pip install macafm` | `pip install --extra-index-url https://kruks.ai/afm/wheels/simple/ macafm-next` |\n| **Release notes** | [v0.9.9](https://github.com/scouzi1966/maclocal-api/releases/tag/v0.9.9) | [v0.9.10-next](https://github.com/scouzi1966/maclocal-api/releases/tag/nightly-20260408-628c2bb) |\n\n\u003e [!NOTE]\n\u003e The stable release (v0.9.9) and the latest nightly are currently at the same level. Either one will give you the same experience.\n\n\u003e [!TIP]\n\u003e **Switching between stable and nightly:**\n\u003e ```bash\n\u003e brew unlink afm \u0026\u0026 brew install scouzi1966/afm/afm-next   # switch to nightly\n\u003e brew unlink afm-next \u0026\u0026 brew link afm                      # switch back to stable\n\u003e # Assumes you previously ran: brew install scouzi1966/afm/afm\n\u003e ```\n\n## What's new in afm-next\n\n\u003e [!IMPORTANT]\n\u003e The nightly build is the future stable release. 
It includes everything in v0.9.8 plus:\n\u003e - No new features yet — nightly is currently in sync with the stable release\n\n## Quick Start\n\n```bash\n# Run any MLX model with WebUI\nafm mlx -m mlx-community/Qwen3.5-35B-A3B-4bit -w\n\n# Or any smaller model\nafm mlx -m mlx-community/gemma-3-4b-it-8bit -w\n\n# Chat from the terminal (auto-downloads from Hugging Face)\nafm mlx -m Qwen3-0.6B-4bit -s \"Explain quantum computing\"\n\n# Interactive model picker (lists your downloaded models)\nMACAFM_MLX_MODEL_CACHE=/path/to/models afm mlx -w\n\n# Apple's on-device Foundation Model with WebUI\nafm -w\n```\n\n## Use with OpenCode\n\n[OpenCode](https://opencode.ai/) is a terminal-based AI coding assistant. Connect it to afm for a fully local coding experience — no cloud, no API keys. No Internet connection is required (other than for the initial model download, of course!).\n\n**1. Configure OpenCode** (`~/.config/opencode/opencode.json`):\n\n```json\n{\n  \"$schema\": \"https://opencode.ai/config.json\",\n  \"provider\": {\n    \"ollama\": {\n      \"npm\": \"@ai-sdk/openai-compatible\",\n      \"name\": \"macafm (local)\",\n      \"options\": {\n        \"baseURL\": \"http://localhost:9999/v1\"\n      },\n      \"models\": {\n        \"mlx-community/Qwen3-Coder-Next-4bit\": {\n          \"name\": \"mlx-community/Qwen3-Coder-Next-4bit\"\n        }\n      }\n    }\n  }\n}\n```\n\n**2. Start afm with a coding model:**\n```bash\nafm mlx -m mlx-community/Qwen3-Coder-Next-4bit -t 1.0 --top-p 0.95 --max-tokens 8192\n```\n\n**3. Launch OpenCode** and type `/connect`. Scroll down to the very bottom of the provider list — `macafm (local)` will likely be the last entry. Select it, and when prompted for an API key, enter any value (e.g. `x`) — tokenized access is not yet implemented in afm so the key is ignored. 
All inference runs locally on your Mac's GPU.\n\n---\n\n## 28+ MLX Models Tested\n\n![MLX Models](test-reports/MLX-Models.png)\n\n28 models tested and verified including Qwen3, Gemma 3/3n, GLM-4/5, DeepSeek V3, LFM2, SmolLM3, Llama 3.2, MiniMax M2.5, Nemotron, and more. See [test reports](test-reports/).\n\n---\n\n[![Swift](https://img.shields.io/badge/Swift-6.2+-orange.svg)](https://swift.org)\n[![macOS](https://img.shields.io/badge/macOS-26+-blue.svg)](https://developer.apple.com/macos/)\n[![License](https://img.shields.io/badge/License-MIT-green.svg)](LICENSE)\n\n## ⭐ Star History\n\n[![Star History Chart](https://api.star-history.com/svg?repos=scouzi1966/maclocal-api\u0026type=Date)](https://star-history.com/#scouzi1966/maclocal-api\u0026Date)\n\n## Related Projects\n\n- [Vesta AI Explorer](https://kruks.ai/) — full-featured native macOS AI chat app\n- [AFMTrainer](https://github.com/scouzi1966/AFMTrainer) — LoRA fine-tuning wrapper for Apple's toolkit (Mac M-series \u0026 Linux CUDA)\n- [Apple Foundation Model Adapters](https://developer.apple.com/apple-intelligence/foundation-models-adapter/) — Apple's adapter training toolkit\n\n## 🌟 Features\n\n- **🔗 OpenAI API Compatible** - Works with existing OpenAI client libraries and applications\n- **🧠 MLX Local Models** - Run any Hugging Face MLX model locally (Qwen, Gemma, Llama, DeepSeek, GLM, and 28+ tested models)\n- **🌐 API Gateway** - Auto-discovers and proxies Ollama, LM Studio, Jan, and other local backends into a single API\n- **⚡ LoRA adapter support** - Supports fine-tuning with LoRA adapters using Apple's tuning Toolkit\n- **📱 Apple Foundation Models** - Uses Apple's on-device 3B parameter language model\n- **👁️ Vision OCR** - Extract text from images and PDFs using Apple Vision (`afm vision`)\n- **🖥️ Built-in WebUI** - Chat interface with model selection (`afm -w`)\n- **🔒 Privacy-First** - All processing happens locally on your device\n- **⚡ Fast \u0026 Lightweight** - No network calls, no API keys 
required\n- **🛠️ Easy Integration** - Drop-in replacement for OpenAI API endpoints\n- **📊 Token Usage Tracking** - Provides accurate token consumption metrics\n\n## 📋 Requirements\n\n- **macOS 26 (Tahoe)** or later\n- **Apple Silicon Mac** (M1/M2/M3/M4 series)\n- **Apple Intelligence enabled** in System Settings\n- **Xcode 26** (for building from source)\n\n## 🚀 Quick Start\n\n### Installation\n\n#### Option 1: Homebrew (Recommended)\n\n```bash\n# Add the tap\nbrew tap scouzi1966/afm\n\n# Install AFM\nbrew install afm\n\n# Verify installation\nafm --version\n```\n\n#### Option 2: pip (PyPI)\n\n```bash\n# Install from PyPI\npip install macafm\n\n# Verify installation\nafm --version\n```\n\n#### Option 3: Build from Source\n\n```bash\n# Clone the repository with submodules\ngit clone --recurse-submodules https://github.com/scouzi1966/maclocal-api.git\ncd maclocal-api\n\n# Build everything from scratch (patches + webui + release build)\n./Scripts/build-from-scratch.sh\n\n# Or skip webui if you don't have Node.js\n./Scripts/build-from-scratch.sh --skip-webui\n\n# Or use make (patches + release build, no webui)\nmake\n\n# Run\n./.build/release/afm --version\n```\n\n### Running\n\n```bash\n# API server only (Apple Foundation Model on port 9999)\nafm\n\n# API server with WebUI chat interface\nafm -w\n\n# WebUI + API gateway (auto-discovers Ollama, LM Studio, Jan, etc.)\nafm -w -g\n\n# Custom port with verbose logging\nafm -p 8080 -v\n\n# Show help\nafm -h\n```\n\n### MLX Local Models\n\nRun open-source models locally on Apple Silicon using MLX:\n\n```bash\n# Run a model with single prompt\nafm mlx -m mlx-community/Qwen2.5-0.5B-Instruct-4bit -s \"Explain gravity\"\n\n# Start MLX model with WebUI\nafm mlx -m mlx-community/gemma-3-4b-it-8bit -w\n\n# Interactive model picker (lists downloaded models)\nafm mlx -w\n\n# MLX model as API server\nafm mlx -m mlx-community/Llama-3.2-1B-Instruct-4bit -p 8080\n\n# Pipe mode\ncat essay.txt | afm mlx -m mlx-community/Qwen3-0.6B-4bit -i 
\"Summarize this\"\n\n# MLX help\nafm mlx --help\n```\n\nModels are downloaded from Hugging Face on first use and cached locally. Any model from the [mlx-community](https://huggingface.co/mlx-community) collection is supported.\n\n## 📡 API Endpoints\n\n### Chat Completions\n**POST** `/v1/chat/completions`\n\nCompatible with OpenAI's chat completions API.\n\n```bash\ncurl -X POST http://localhost:9999/v1/chat/completions \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\n    \"model\": \"foundation\",\n    \"messages\": [\n      {\"role\": \"user\", \"content\": \"Hello, how are you?\"}\n    ]\n  }'\n```\n\n### List Models\n**GET** `/v1/models`\n\nReturns available Foundation Models.\n\n```bash\ncurl http://localhost:9999/v1/models\n```\n\n### Health Check\n**GET** `/health`\n\nServer health status endpoint.\n\n```bash\ncurl http://localhost:9999/health\n```\n\n## 💻 Usage Examples\n\n### Python with OpenAI Library\n\n```python\nfrom openai import OpenAI\n\n# Point to your local MacLocalAPI server\nclient = OpenAI(\n    api_key=\"not-needed-for-local\",\n    base_url=\"http://localhost:9999/v1\"\n)\n\nresponse = client.chat.completions.create(\n    model=\"foundation\",\n    messages=[\n        {\"role\": \"user\", \"content\": \"Explain quantum computing in simple terms\"}\n    ]\n)\n\nprint(response.choices[0].message.content)\n```\n\n### JavaScript/Node.js\n\n```javascript\nimport OpenAI from 'openai';\n\nconst openai = new OpenAI({\n  apiKey: 'not-needed-for-local',\n  baseURL: 'http://localhost:9999/v1',\n});\n\nconst completion = await openai.chat.completions.create({\n  messages: [{ role: 'user', content: 'Write a haiku about programming' }],\n  model: 'foundation',\n});\n\nconsole.log(completion.choices[0].message.content);\n```\n\n### curl Examples\n\n```bash\n# Basic chat completion\ncurl -X POST http://localhost:9999/v1/chat/completions \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\n    \"model\": \"foundation\",\n    \"messages\": [\n      
{\"role\": \"system\", \"content\": \"You are a helpful assistant.\"},\n      {\"role\": \"user\", \"content\": \"What is the capital of France?\"}\n    ]\n  }'\n\n# With temperature control\ncurl -X POST http://localhost:9999/v1/chat/completions \\\n  -H \"Content-Type: application/json\" \\\n  -d '{\n    \"model\": \"foundation\",\n    \"messages\": [{\"role\": \"user\", \"content\": \"Be creative!\"}],\n    \"temperature\": 0.8\n  }'\n```\n\n### Single Prompt \u0026 Pipe Examples\n\n```bash\n# Single prompt mode\nafm -s \"Explain quantum computing\"\n\n# Piped input from other commands\necho \"What is the meaning of life?\" | afm\ncat file.txt | afm\ngit log --oneline | head -5 | afm\n\n# Custom instructions with pipe\necho \"Review this code\" | afm -i \"You are a senior software engineer\"\n```\n\n## 🏗️ Architecture\n\n```\nMacLocalAPI/\n├── Package.swift                    # Swift Package Manager config\n├── Sources/MacLocalAPI/\n│   ├── main.swift                   # CLI entry point \u0026 ArgumentParser\n│   ├── Server.swift                 # Vapor web server configuration\n│   ├── Controllers/\n│   │   └── ChatCompletionsController.swift  # OpenAI API endpoints\n│   └── Models/\n│       ├── FoundationModelService.swift     # Apple Foundation Models wrapper\n│       ├── OpenAIRequest.swift              # Request data models\n│       └── OpenAIResponse.swift             # Response data models\n└── README.md\n```\n\n## 🔧 Configuration\n\n### Command Line Options\n\n```\nOVERVIEW: macOS server that exposes Apple's Foundation Models through\nOpenAI-compatible API\n\nUse -w to enable the WebUI, -g to enable API gateway mode (auto-discovers and\nproxies to Ollama, LM Studio, Jan, and other local LLM backends).\n\nUSAGE: afm \u003coptions\u003e\n       afm mlx [\u003coptions\u003e]      Run local MLX models from Hugging Face\n       afm vision \u003cimage\u003e       OCR text extraction from images/PDFs\n\nOPTIONS:\n  -s, --single-prompt 
\u003csingle-prompt\u003e\n                          Run a single prompt without starting the server\n  -i, --instructions \u003cinstructions\u003e\n                          Custom instructions for the AI assistant (default:\n                          You are a helpful assistant)\n  -v, --verbose           Enable verbose logging\n  --no-streaming          Disable streaming responses (streaming is enabled by\n                          default)\n  -a, --adapter \u003cadapter\u003e Path to a .fmadapter file for LoRA adapter fine-tuning\n  -p, --port \u003cport\u003e       Port to run the server on (default: 9999)\n  -H, --hostname \u003chostname\u003e\n                          Hostname to bind server to (default: 127.0.0.1)\n  -t, --temperature \u003ctemperature\u003e\n                          Temperature for response generation (0.0-1.0)\n  -r, --randomness \u003crandomness\u003e\n                          Sampling mode: 'greedy', 'random',\n                          'random:top-p=\u003c0.0-1.0\u003e', 'random:top-k=\u003cint\u003e', with\n                          optional ':seed=\u003cint\u003e'\n  -P, --permissive-guardrails\n                          Permissive guardrails for unsafe or inappropriate\n                          responses\n  -w, --webui             Enable webui and open in default browser\n  -g, --gateway           Enable API gateway mode: discover and proxy to local\n                          LLM backends (Ollama, LM Studio, Jan, etc.)\n  --prewarm \u003cprewarm\u003e     Pre-warm the model on server startup for faster first\n                          response (y/n, default: y)\n  --version               Show the version.\n  -h, --help              Show help information.\n\nNote: afm also accepts piped input from other commands, equivalent to using -s\nwith the piped content as the prompt.\n```\n\n### Environment Variables\n\nThe server respects standard logging environment variables:\n- `LOG_LEVEL` - Set logging level (trace, debug, info, 
notice, warning, error, critical)\n\n## ⚠️ Limitations \u0026 Notes\n\n- **Model Scope**: Apple Foundation Model is a 3B parameter model (optimized for on-device performance)\n- **macOS 26+ Only**: Requires the latest macOS with Foundation Models framework\n- **Apple Intelligence Required**: Must be enabled in System Settings\n- **Token Estimation**: Uses word-based approximation for token counting (Foundation model only; proxied backends report real counts)\n\n## 🔍 Troubleshooting\n\n### \"Foundation Models framework is not available\"\n1. Ensure you're running **macOS 26** or later\n2. Enable **Apple Intelligence** in System Settings → Apple Intelligence \u0026 Siri\n3. Verify you're on an **Apple Silicon Mac**\n4. Restart the application after enabling Apple Intelligence\n\n### Server Won't Start\n1. Check if the port is already in use: `lsof -i :9999`\n2. Try a different port: `afm -p 8080`\n3. Enable verbose logging: `afm -v`\n\n### Build Issues\n1. Ensure you have **Xcode 26** installed\n2. Update Swift toolchain: `xcode-select --install`\n3. Clean and rebuild: `swift package clean \u0026\u0026 swift build -c release`\n\n## 🤝 Contributing\n\nContributions are welcome! Please feel free to submit a Pull Request. 
For major changes, please open an issue first to discuss what you would like to change.\n\n### Development Setup\n\n```bash\n# Clone the repo with submodules\ngit clone --recurse-submodules https://github.com/scouzi1966/maclocal-api.git\ncd maclocal-api\n\n# Full build from scratch (submodules + patches + webui + release)\n./Scripts/build-from-scratch.sh\n\n# Or for debug builds during development\n./Scripts/build-from-scratch.sh --debug --skip-webui\n\n# Run with verbose logging\n./.build/debug/afm -w -g -v\n```\n\n## 📄 License\n\nThis project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.\n\n## 🙏 Acknowledgments\n\n- Apple for the Foundation Models framework\n- The Vapor Swift web framework team\n- OpenAI for the API specification standard\n- The Swift community for excellent tooling\n\n## 📞 Support\n\nIf you encounter any issues or have questions:\n\n1. Check the [Troubleshooting](#-troubleshooting) section\n2. Search existing [GitHub Issues](https://github.com/scouzi1966/maclocal-api/issues)\n3. 
Create a new issue with detailed information about your problem\n\n## 🗺️ Roadmap\n\n- [x] Streaming response support\n- [x] MLX local model support (28+ models tested)\n- [x] Multiple model support (API gateway mode)\n- [x] Web UI for testing (llama.cpp WebUI integration)\n- [x] Vision OCR subcommand\n- [x] Function/tool calling (OpenAI-compatible, multiple formats)\n- [ ] Performance optimizations\n- [ ] [BFCL](https://github.com/ShishirPatil/gorilla/tree/main/berkeley-function-call-leaderboard) integration for automated tool calling validation\n- [ ] Docker containerization (when supported)\n\n---\n\n**Made with ❤️ for the Apple Silicon community**\n\n*Bringing the power of local AI to your fingertips.*\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fscouzi1966%2Fmaclocal-api","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fscouzi1966%2Fmaclocal-api","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fscouzi1966%2Fmaclocal-api/lists"}