{"id":13493049,"url":"https://github.com/exo-explore/exo","last_synced_at":"2026-02-25T22:06:54.927Z","repository":{"id":248429984,"uuid":"819554665","full_name":"exo-explore/exo","owner":"exo-explore","description":"Run your own AI cluster at home with everyday devices 📱💻 🖥️⌚","archived":false,"fork":false,"pushed_at":"2025-03-21T22:23:32.000Z","size":12294,"stargazers_count":28005,"open_issues_count":397,"forks_count":1745,"subscribers_count":237,"default_branch":"main","last_synced_at":"2025-05-07T20:34:28.241Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/exo-explore.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2024-06-24T18:36:22.000Z","updated_at":"2025-05-07T19:27:16.000Z","dependencies_parsed_at":"2024-10-26T02:07:43.724Z","dependency_job_id":"d1e91be4-bcee-44c6-8828-1ac4f14b71f8","html_url":"https://github.com/exo-explore/exo","commit_stats":null,"previous_names":["exo-explore/exo"],"tags_count":16,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/exo-explore%2Fexo","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/exo-explore%2Fexo/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/exo-explore%2Fexo/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/exo-explore%2Fexo/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/exo-explore","download_url":"https://codeload.github.com/exo-explore/exo/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":254227603,"owners_count":22035667,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-07-31T19:01:11.752Z","updated_at":"2026-02-25T22:06:54.920Z","avatar_url":"https://github.com/exo-explore.png","language":"Python","readme":"\u003cdiv align=\"center\"\u003e\n\n\u003cpicture\u003e\n  \u003csource media=\"(prefers-color-scheme: light)\" srcset=\"/docs/imgs/exo-logo-black-bg.jpg\"\u003e\n  \u003cimg alt=\"exo logo\" src=\"/docs/imgs/exo-logo-transparent.png\" width=\"50%\" height=\"50%\"\u003e\n\u003c/picture\u003e\n\nexo: Run frontier AI locally. 
Maintained by [exo labs](https://x.com/exolabs).

<p align="center">
  <a href="https://discord.gg/TJ4P57arEm" target="_blank" rel="noopener noreferrer"><img src="https://img.shields.io/badge/Discord-Join%20Server-5865F2?logo=discord&logoColor=white" alt="Discord"></a>
  <a href="https://x.com/exolabs" target="_blank" rel="noopener noreferrer"><img src="https://img.shields.io/twitter/follow/exolabs?style=social" alt="X"></a>
  <a href="https://www.apache.org/licenses/LICENSE-2.0.html" target="_blank" rel="noopener noreferrer"><img src="https://img.shields.io/badge/License-Apache2.0-blue.svg" alt="License: Apache-2.0"></a>
</p>

</div>

---

exo connects all your devices into an AI cluster. Not only does exo enable running models larger than would fit on a single device, but with [day-0 support for RDMA over Thunderbolt](https://x.com/exolabs/status/2001817749744476256?s=20), it also makes models run faster as you add more devices.

## Features

- **Automatic Device Discovery**: Devices running exo automatically discover each other - no manual configuration.
- **RDMA over Thunderbolt**: exo ships with [day-0 support for RDMA over Thunderbolt 5](https://x.com/exolabs/status/2001817749744476256?s=20), enabling a 99% reduction in latency between devices.
- **Topology-Aware Auto Parallel**: exo figures out the best way to split your model across all available devices based on a realtime view of your device topology. It takes into account device resources and the network latency/bandwidth of each link.
- **Tensor Parallelism**: exo supports sharding models across devices, for up to a 1.8x speedup on 2 devices and a 3.2x speedup on 4 devices.
- **MLX Support**: exo uses [MLX](https://github.com/ml-explore/mlx) as an inference backend and [MLX distributed](https://ml-explore.github.io/mlx/build/html/usage/distributed.html) for distributed communication.

## Dashboard

exo includes a built-in dashboard for managing your cluster and chatting with models.

<p align="center">
  <img src="docs/imgs/dashboard-cluster-view.png" alt="exo dashboard - cluster view showing 4 x M3 Ultra Mac Studio with DeepSeek v3.1 and Kimi-K2-Thinking loaded" width="80%" />
</p>
<p align="center"><em>4 × 512GB M3 Ultra Mac Studio running DeepSeek v3.1 (8-bit) and Kimi-K2-Thinking (4-bit)</em></p>

## Benchmarks

<details>
  <summary>Qwen3-235B (8-bit) on 4 × M3 Ultra Mac Studio with Tensor Parallel RDMA</summary>
  <img src="docs/benchmarks/jeffgeerling/mac-studio-cluster-ai-full-1-qwen3-235b.jpeg" alt="Benchmark - Qwen3-235B (8-bit) on 4 × M3 Ultra Mac Studio with Tensor Parallel RDMA" width="80%" />
  <p>
    <strong>Source:</strong> <a href="https://www.jeffgeerling.com/blog/2025/15-tb-vram-on-mac-studio-rdma-over-thunderbolt-5">Jeff Geerling: 15 TB VRAM on Mac Studio – RDMA over Thunderbolt 5</a>
  </p>
</details>

<details>
  <summary>DeepSeek v3.1 671B (8-bit) on 4 × M3 Ultra Mac Studio with Tensor Parallel RDMA</summary>
  <img src="docs/benchmarks/jeffgeerling/mac-studio-cluster-ai-full-2-deepseek-3.1-671b.jpeg" alt="Benchmark - DeepSeek v3.1 671B (8-bit) on 4 × M3 Ultra Mac Studio with Tensor Parallel RDMA" width="80%" />
width=\"80%\" /\u003e\n  \u003cp\u003e\n    \u003cstrong\u003eSource:\u003c/strong\u003e \u003ca href=\"https://www.jeffgeerling.com/blog/2025/15-tb-vram-on-mac-studio-rdma-over-thunderbolt-5\"\u003eJeff Geerling: 15 TB VRAM on Mac Studio – RDMA over Thunderbolt 5\u003c/a\u003e\n  \u003c/p\u003e\n\u003c/details\u003e\n\n\u003cdetails\u003e\n  \u003csummary\u003eKimi K2 Thinking (native 4-bit) on 4 × M3 Ultra Mac Studio with Tensor Parallel RDMA\u003c/summary\u003e\n  \u003cimg src=\"docs/benchmarks/jeffgeerling/mac-studio-cluster-ai-full-3-kimi-k2-thinking.jpeg\" alt=\"Benchmark - Kimi K2 Thinking (native 4-bit) on 4 × M3 Ultra Mac Studio with Tensor Parallel RDMA\" width=\"80%\" /\u003e\n  \u003cp\u003e\n    \u003cstrong\u003eSource:\u003c/strong\u003e \u003ca href=\"https://www.jeffgeerling.com/blog/2025/15-tb-vram-on-mac-studio-rdma-over-thunderbolt-5\"\u003eJeff Geerling: 15 TB VRAM on Mac Studio – RDMA over Thunderbolt 5\u003c/a\u003e\n  \u003c/p\u003e\n\u003c/details\u003e\n\n---\n\n## Quick Start\n\nDevices running exo automatically discover each other, without needing any manual configuration. Each device provides an API and a dashboard for interacting with your cluster (runs at `http://localhost:52415`).\n\nThere are two ways to run exo:\n\n### Run from Source (macOS)\n\nIf you have [Nix](https://nixos.org/) installed, you can skip most of the steps below and run exo directly:\n\n```bash\nnix run .#exo\n```\n\n**Note:** To accept the Cachix binary cache (and avoid the Xcode Metal ToolChain), add to `/etc/nix/nix.conf`:\n```\ntrusted-users = root    (or your username)\nexperimental-features = nix-command flakes\n```\nThen restart the Nix daemon: `sudo launchctl kickstart -k system/org.nixos.nix-daemon`\n\n**Prerequisites:**\n- [Xcode](https://developer.apple.com/xcode/) (provides the Metal ToolChain required for MLX compilation)\n- [brew](https://github.com/Homebrew/brew) (for simple package management on macOS)\n\n  ```bash\n  /bin/bash -c \"$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)\"\n  ```\n- [uv](https://github.com/astral-sh/uv) (for Python dependency management)\n- [macmon](https://github.com/vladkens/macmon) (for hardware monitoring on Apple Silicon)\n- [node](https://github.com/nodejs/node) (for building the dashboard)\n\n  ```bash\n  brew install uv macmon node\n  ```\n- [rust](https://github.com/rust-lang/rustup) (to build Rust bindings, nightly for now)\n\n  ```bash\n  curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh\n  rustup toolchain install nightly\n  ```\n\nClone the repo, build the dashboard, and run exo:\n\n```bash\n# Clone exo\ngit clone https://github.com/exo-explore/exo\n\n# Build dashboard\ncd exo/dashboard \u0026\u0026 npm install \u0026\u0026 npm run build \u0026\u0026 cd ..\n\n# Run exo\nuv run exo\n```\n\nThis starts the exo dashboard and API at http://localhost:52415/\n\n\n*Please view the section on RDMA to enable this feature on MacOS \u003e=26.2!*\n\n\n### Run from Source (Linux)\n\n**Prerequisites:**\n\n- [uv](https://github.com/astral-sh/uv) (for Python dependency management)\n- [node](https://github.com/nodejs/node) (for building the dashboard) - version 18 or higher\n- [rust](https://github.com/rust-lang/rustup) (to build Rust bindings, nightly for now)\n\n**Installation methods:**\n\n**Option 1: Using system package manager (Ubuntu/Debian example):**\n```bash\n# Install Node.js and npm\nsudo apt update\nsudo apt install nodejs npm\n\n# Install uv\ncurl -LsSf https://astral.sh/uv/install.sh | 
### Run from Source (Linux)

**Prerequisites:**

- [uv](https://github.com/astral-sh/uv) (for Python dependency management)
- [node](https://github.com/nodejs/node) (for building the dashboard) - version 18 or higher
- [rust](https://github.com/rust-lang/rustup) (to build Rust bindings; nightly for now)

**Installation methods:**

**Option 1: Using system package manager (Ubuntu/Debian example):**
```bash
# Install Node.js and npm
sudo apt update
sudo apt install nodejs npm

# Install uv
curl -LsSf https://astral.sh/uv/install.sh | sh

# Install Rust (using rustup)
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
rustup toolchain install nightly
```

**Option 2: Using Homebrew on Linux (if preferred):**
```bash
# Install Homebrew on Linux
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"

# Install dependencies
brew install uv node

# Install Rust (using rustup)
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
rustup toolchain install nightly
```

**Note:** The `macmon` package is macOS-only and not required on Linux.

Clone the repo, build the dashboard, and run exo:

```bash
# Clone exo
git clone https://github.com/exo-explore/exo

# Build dashboard
cd exo/dashboard && npm install && npm run build && cd ..

# Run exo
uv run exo
```

This starts the exo dashboard and API at http://localhost:52415/

**Important note for Linux users:** Currently, exo runs on the CPU on Linux. GPU support for Linux platforms is under development. If you'd like to see support for your specific Linux hardware, please [search for existing feature requests](https://github.com/exo-explore/exo/issues) or create a new one.

**Configuration Options:**

- `--no-worker`: Run exo without the worker component. Useful for coordinator-only nodes that handle networking and orchestration but don't execute inference tasks. This is helpful for machines without sufficient GPU resources but with good network connectivity.

  ```bash
  uv run exo --no-worker
  ```

**File Locations (Linux):**

exo follows the [XDG Base Directory Specification](https://specifications.freedesktop.org/basedir-spec/basedir-spec-latest.html) on Linux:

- **Configuration files**: `~/.config/exo/` (or `$XDG_CONFIG_HOME/exo/`)
- **Data files**: `~/.local/share/exo/` (or `$XDG_DATA_HOME/exo/`)
- **Cache files**: `~/.cache/exo/` (or `$XDG_CACHE_HOME/exo/`)

You can override these locations by setting the corresponding XDG environment variables.

### macOS App

exo ships a macOS app that runs in the background on your Mac.

<img src="docs/imgs/macos-app-one-macbook.png" alt="exo macOS App - running on a MacBook" width="35%" />

The macOS app requires macOS Tahoe 26.2 or later.

Download the latest build here: [EXO-latest.dmg](https://assets.exolabs.net/EXO-latest.dmg).

The app will ask for permission to modify system settings and install a new Network profile. Improvements to this flow are being worked on.

**Custom Namespace for Cluster Isolation:**

The macOS app includes a custom namespace feature that allows you to isolate your exo cluster from others on the same network. This is configured through the `EXO_LIBP2P_NAMESPACE` setting:

- **Use cases**:
  - Running multiple separate exo clusters on the same network
  - Isolating development/testing clusters from production clusters
  - Preventing accidental cluster joining

- **Configuration**: Access this setting in the app's Advanced settings (or set the `EXO_LIBP2P_NAMESPACE` environment variable when running from source); see the example below.

The namespace is logged on startup for debugging purposes.
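For example, when running from source, start every node with the same namespace value so they form one isolated cluster (the value `lab-a` here is illustrative, not a default):

```bash
# Nodes started with the same (illustrative) namespace form one isolated cluster;
# nodes using a different namespace on the same network should not join it
EXO_LIBP2P_NAMESPACE=lab-a uv run exo
```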
#### Uninstalling the macOS App

The recommended way to uninstall is through the app itself: click the menu bar icon → Advanced → Uninstall. This cleanly removes all system components.

If you've already deleted the app, you can run the standalone uninstaller script:

```bash
sudo ./app/EXO/uninstall-exo.sh
```

This removes:
- Network setup LaunchDaemon
- Network configuration script
- Log files
- The "exo" network location

**Note:** You'll need to manually remove EXO from Login Items in System Settings → General → Login Items.

---

### Enabling RDMA on macOS

RDMA is a new capability added in macOS 26.2. It works on any Mac with Thunderbolt 5 (M4 Pro Mac Mini, M4 Max Mac Studio, M4 Max MacBook Pro, M3 Ultra Mac Studio).

Please refer to the caveats below for troubleshooting.

To enable RDMA on macOS, follow these steps:

1. Shut down your Mac.
2. Hold down the power button for 10 seconds until the boot menu appears.
3. Select "Options" to enter Recovery mode.
4. When the Recovery UI appears, open the Terminal from the Utilities menu.
5. In the Terminal, type:
   ```
   rdma_ctl enable
   ```
   and press Enter.
6. Reboot your Mac.

After that, RDMA will be enabled in macOS and exo will take care of the rest.

**Important Caveats**

1. Devices that wish to be part of an RDMA cluster must be connected to all other devices in the cluster.
2. The cables must support Thunderbolt 5.
3. On a Mac Studio, you cannot use the Thunderbolt 5 port next to the Ethernet port.
4. If running from source, please use the script found at `tmp/set_rdma_network_config.sh`, which disables Thunderbolt Bridge and sets DHCP on each RDMA port.
5. RDMA ports may be unable to discover each other on different versions of macOS. Please ensure that OS versions match exactly (down to beta version numbers) on all devices.

---

### Using the API

If you prefer to interact with exo via the API, here is an example that creates an instance of a small model (`mlx-community/Llama-3.2-1B-Instruct-4bit`), sends a chat completions request, and deletes the instance.

---

**1. Preview instance placements**

The `/instance/previews` endpoint will preview all valid placements for your model.

```bash
curl "http://localhost:52415/instance/previews?model_id=llama-3.2-1b"
```

Sample response:

```json
{
  "previews": [
    {
      "model_id": "mlx-community/Llama-3.2-1B-Instruct-4bit",
      "sharding": "Pipeline",
      "instance_meta": "MlxRing",
      "instance": {...},
      "memory_delta_by_node": {"local": 729808896},
      "error": null
    }
    // ...possibly more placements...
  ]
}
```

This returns all valid placements for this model. Pick a placement that you like.
To pick the first one, pipe into `jq`:

```bash
curl "http://localhost:52415/instance/previews?model_id=llama-3.2-1b" | jq -c '.previews[] | select(.error == null) | .instance' | head -n1
```

---

**2. Create a model instance**

Send a POST to `/instance` with your desired placement in the `instance` field (the full payload must match the types in `CreateInstanceParams`), which you can copy from step 1:

```bash
curl -X POST http://localhost:52415/instance \
  -H 'Content-Type: application/json' \
  -d '{
    "instance": {...}
  }'
```

Sample response:

```json
{
  "message": "Command received.",
  "command_id": "e9d1a8ab-...."
}
```
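Steps 1 and 2 can also be chained: select the first valid placement with `jq` and POST it straight back. A minimal sketch, assuming the response shapes shown above:

```bash
# Pick the first placement with no error and create an instance from it
PLACEMENT=$(curl -s "http://localhost:52415/instance/previews?model_id=llama-3.2-1b" \
  | jq -c '[.previews[] | select(.error == null)][0].instance')

curl -X POST http://localhost:52415/instance \
  -H 'Content-Type: application/json' \
  -d "{\"instance\": $PLACEMENT}"
```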
---

**3. Send a chat completion**

Now, make a POST to `/v1/chat/completions` (the same format as OpenAI's API):

```bash
curl -N -X POST http://localhost:52415/v1/chat/completions \
  -H 'Content-Type: application/json' \
  -d '{
    "model": "mlx-community/Llama-3.2-1B-Instruct-4bit",
    "messages": [
      {"role": "user", "content": "What is Llama 3.2 1B?"}
    ],
    "stream": true
  }'
```

---

**4. Delete the instance**

When you're done, delete the instance by its ID (find it via the `/state` or `/instance` endpoints):

```bash
curl -X DELETE http://localhost:52415/instance/YOUR_INSTANCE_ID
```

**Other useful API endpoints:**

- List all models: `curl http://localhost:52415/models`
- Inspect instance IDs and deployment state: `curl http://localhost:52415/state`

For further details, see:

- API basic documentation in [docs/api.md](docs/api.md).
- API types and endpoints in [src/exo/master/api.py](src/exo/master/api.py).

---

## Benchmarking

The `exo-bench` tool measures model prefill and token generation speed across different placement configurations. This helps you optimize model performance and validate improvements.

**Prerequisites:**
- Nodes should be running with `uv run exo` before benchmarking
- The tool uses the `/bench/chat/completions` endpoint

**Basic usage:**

```bash
uv run bench/exo_bench.py \
  --model Llama-3.2-1B-Instruct-4bit \
  --pp 128,256,512 \
  --tg 128,256
```

**Key parameters:**

- `--model`: Model to benchmark (short ID or HuggingFace ID)
- `--pp`: Prompt size hints (comma-separated integers)
- `--tg`: Generation lengths (comma-separated integers)
- `--max-nodes`: Limit placements to N nodes (default: 4)
- `--instance-meta`: Filter by `ring`, `jaccl`, or `both` (default: both)
- `--sharding`: Filter by `pipeline`, `tensor`, or `both` (default: both)
- `--repeat`: Number of repetitions per configuration (default: 1)
- `--warmup`: Warmup runs per placement (default: 0)
- `--json-out`: Output file for results (default: bench/results.json)

**Example with filters:**

```bash
uv run bench/exo_bench.py \
  --model Llama-3.2-1B-Instruct-4bit \
  --pp 128,512 \
  --tg 128 \
  --max-nodes 2 \
  --sharding tensor \
  --repeat 3 \
  --json-out my-results.json
```

The tool outputs performance metrics including prompt tokens per second (prompt_tps), generation tokens per second (generation_tps), and peak memory usage for each configuration.
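To skim a results file afterwards, `jq` works well. The exact schema of the JSON output isn't documented here, so treat the field paths below as an assumption to verify against your own `results.json`:

```bash
# Sketch: print throughput per run, assuming a top-level array whose entries
# carry the prompt_tps and generation_tps metrics named above (verify the paths)
jq -r '.[] | "\(.prompt_tps // "n/a") prompt tps, \(.generation_tps // "n/a") generation tps"' bench/results.json
```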
---

## Hardware Accelerator Support

On macOS, exo uses the GPU. On Linux, exo currently runs on the CPU. We are working on extending hardware accelerator support. If you'd like support for a new hardware platform, please [search for an existing feature request](https://github.com/exo-explore/exo/issues) and add a thumbs up so we know what hardware is important to the community.

---

## Contributing

See [CONTRIBUTING.md](CONTRIBUTING.md) for guidelines on how to contribute to exo.