{"id":15136486,"url":"https://github.com/nexaai/nexa-sdk","last_synced_at":"2026-01-16T15:29:23.024Z","repository":{"id":254317731,"uuid":"843570824","full_name":"NexaAI/nexa-sdk","owner":"NexaAI","description":"Nexa SDK is a comprehensive toolkit for supporting GGML and ONNX models. It supports text generation, image generation, vision-language models (VLM), Audio Language Model, auto-speech-recognition (ASR), and text-to-speech (TTS) capabilities.","archived":false,"fork":false,"pushed_at":"2025-03-06T23:59:34.000Z","size":204432,"stargazers_count":4533,"open_issues_count":83,"forks_count":628,"subscribers_count":424,"default_branch":"main","last_synced_at":"2025-05-11T05:46:50.177Z","etag":null,"topics":["asr","audio","edge-computing","language-model","llm","on-device-ai","on-device-ml","sdk","sdk-python","stable-diffusion","transformers","tts","vlm","whisper"],"latest_commit_sha":null,"homepage":"https://docs.nexa.ai/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/NexaAI.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":".github/CODEOWNERS","security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-08-16T20:13:07.000Z","updated_at":"2025-05-09T23:16:44.000Z","dependencies_parsed_at":"2024-08-26T19:35:59.933Z","dependency_job_id":"baccbc93-9ad7-45df-8c0c-ef75713646b8","html_url":"https://github.com/NexaAI/nexa-sdk","commit_stats":{"total_commits":500,"total_committers":30,"mean_commits":"16.666666666666668","dds":0.804,"last_synced_commit":"e98f3cdd243ecc82e515203381e7f65f2de80bf9"},"previous_names":["nexaai/nexa-sdk","nexaai/nexaai-sdk-cpp"],"tags_count":130,
"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NexaAI%2Fnexa-sdk","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NexaAI%2Fnexa-sdk/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NexaAI%2Fnexa-sdk/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/NexaAI%2Fnexa-sdk/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/NexaAI","download_url":"https://codeload.github.com/NexaAI/nexa-sdk/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253523720,"owners_count":21921818,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["asr","audio","edge-computing","language-model","llm","on-device-ai","on-device-ml","sdk","sdk-python","stable-diffusion","transformers","tts","vlm","whisper"],"created_at":"2024-09-26T06:22:09.242Z","updated_at":"2026-01-16T15:29:23.006Z","avatar_url":"https://github.com/NexaAI.png","language":"Python","funding_links":[],"categories":["Langchain"],"sub_categories":[],"readme":"\u003cdiv align=\"center\" style=\"text-decoration: none;\"\u003e\n  \u003cimg width=\"100%\" src=\"assets/banner1.png\" alt=\"Nexa AI Banner\"\u003e\n  \u003cp style=\"font-size: 1.3em; font-weight: 600; margin-bottom: 20px;\"\u003e \n    \u003ca href=\"README_zh.md\"\u003e 简体中文 \u003c/a\u003e\n    |\n    \u003ca href=\"README.md\"\u003e English \u003c/a\u003e\n  \u003c/p\u003e\n  \u003cp style=\"font-size: 1.3em; font-weight: 600; 
margin-bottom: 20px;\"\u003e🤝 Supported chipmakers \u003c/p\u003e\n    \u003cpicture\u003e\n      \u003csource srcset=\"assets/chipmakers-dark.png\" media=\"(prefers-color-scheme: dark)\"\u003e\n      \u003csource srcset=\"assets/chipmakers.png\" media=\"(prefers-color-scheme: light)\"\u003e\n      \u003cimg src=\"assets/chipmakers.png\" style=\"max-height:30px; height:auto; width:auto;\"\u003e\n    \u003c/picture\u003e\n  \u003c/p\u003e\n  \u003cp\u003e\n    \u003ca href=\"https://www.producthunt.com/products/nexasdk-for-mobile?embed=true\u0026utm_source=badge-top-post-badge\u0026utm_medium=badge\u0026utm_campaign=badge-nexasdk-for-mobile\" target=\"_blank\" rel=\"noopener noreferrer\"\u003e\n        \u003cimg alt=\"NexaSDK for Mobile - #1 Product of the Day\" width=\"180\" height=\"39\" src=\"https://api.producthunt.com/widgets/embed-image/v1/top-post-badge.svg?post_id=1049998\u0026theme=dark\u0026period=daily\u0026t=1765991451976\"\u003e\n    \u003c/a\u003e\n    \u003ca href=\"https://trendshift.io/repositories/12239\" target=\"_blank\" rel=\"noopener noreferrer\"\u003e\n        \u003cimg alt=\"NexaAI/nexa-sdk - #1 Repository of the Day\" height=\"39\" src=\"https://trendshift.io/api/badge/repositories/12239\"\u003e\n    \u003c/a\u003e\n  \u003c/p\u003e\n  \u003cp\u003e\n    \u003ca href=\"https://docs.nexa.ai\"\u003e\n        \u003cimg src=\"https://img.shields.io/badge/docs-website-brightgreen?logo=readthedocs\" alt=\"Documentation\"\u003e\n    \u003c/a\u003e\n    \u003ca href=\"https://sdk.nexa.ai/wishlist\"\u003e\n        \u003cimg src=\"https://img.shields.io/badge/🎯_Vote_for-Next_Models-ff69b4?style=flat-square\" alt=\"Vote for Next Models\"\u003e\n    \u003c/a\u003e\n    \u003ca href=\"https://x.com/nexa_ai\"\u003e\u003cimg alt=\"X account\" src=\"https://img.shields.io/twitter/url/https/twitter.com/diffuserslib.svg?style=social\u0026label=Follow%20%40Nexa_AI\"\u003e\u003c/a\u003e\n    \u003ca href=\"https://discord.com/invite/nexa-ai\"\u003e\n        
\u003cimg src=\"https://img.shields.io/discord/1192186167391682711?color=5865F2\u0026logo=discord\u0026logoColor=white\u0026style=flat-square\" alt=\"Join us on Discord\"\u003e\n    \u003c/a\u003e\n    \u003ca href=\"https://join.slack.com/t/nexa-ai-community/shared_invite/zt-3837k9xpe-LEty0disTTUnTUQ4O3uuNw\"\u003e\n        \u003cimg src=\"https://img.shields.io/badge/slack-join%20chat-4A154B?logo=slack\u0026logoColor=white\" alt=\"Join us on Slack\"\u003e\n    \u003c/a\u003e\n  \u003c/p\u003e\n\u003c/div\u003e\n\n# NexaSDK\n\n**NexaSDK lets you build the smartest and fastest on-device AI with minimum energy.** It is a highly performant local inference framework that runs the latest multimodal AI models locally on NPU, GPU, and CPU - across Android, Windows, Linux, macOS, and iOS devices with a few lines of code.\n\nNexaSDK supports latest models **weeks or months before anyone else** — Qwen3-VL, DeepSeek-OCR, Gemma3n (Vision), and more.\n\n\u003e ⭐ **Star this repo** to keep up with exciting updates and new releases about latest on-device AI capabilities.\n\n## 🏆 Recognized Milestones\n\n- **Qualcomm** featured us **3 times** in official blogs.\n  - [Innovating Multimodal AI on Qualcomm Hexagon NPU](https://www.qualcomm.com/developer/blog/2025/09/omnineural-4b-nexaml-qualcomm-hexagon-npu).\n  - [First-ever Day-0 model support on Qualcomm Hexagon NPU for compute and mobile platforms, Auto and IoT](https://www.qualcomm.com/developer/blog/2025/10/granite-4-0-to-the-edge-on-device-ai-for-real-world-performance).\n  - [A simple way to bring on-device AI to smartphones with Snapdragon](https://www.qualcomm.com/developer/blog/2025/11/nexa-ai-for-android-simple-way-to-bring-on-device-ai-to-smartphones-with-snapdragon)\n- **Qwen** featured us for [Day-0 Qwen3-VL support on NPU, GPU, and CPU](https://x.com/Alibaba_Qwen/status/1978154384098754943). 
We were 3 weeks ahead of Ollama and llama.cpp on GGUF support, and no one else supports it on NPU to date.\n- **IBM** featured our NexaML inference engine alongside vLLM, llama.cpp, and MLX in [official IBM blog](https://www.ibm.com/new/announcements/ibm-granite-4-0-hyper-efficient-high-performance-hybrid-models) and also for Day-0 Granite 4.0 support.\n- **Google** featured us for [EmbeddingGemma Day-0 NPU support](https://x.com/googleaidevs/status/1969188152049889511).\n- **AMD** featured us for [enabling SDXL-turbo image generation on AMD NPU](https://www.amd.com/en/developer/resources/technical-articles/2025/advancing-ai-with-nexa-ai--image-generation-on-amd-npu-with-sdxl.html).\n- **NVIDIA** featured Hyperlink, a viral local AI app powered by NexaSDK, in their [official blog](https://blogs.nvidia.com/blog/rtx-ai-garage-nexa-hyperlink-local-agent/).\n- **Microsoft** presented us on stage at Microsoft Ignite 2025 as [official partner](https://www.linkedin.com/posts/mixen_excited-to-celebrate-our-developer-partnerships-activity-7396601602327007232-AmCR?utm_source=share\u0026utm_medium=member_desktop\u0026rcm=ACoAAChXnS8B4gqbBLUlWfwt-ck0XAv472NzT4k).\n- **Intel** featured us for [Intel NPU support in NexaSDK](https://www.linkedin.com/posts/intel-software_ai-ondeviceai-nexasdk-activity-7376337062087667712-xw7i?utm_source=share\u0026utm_medium=member_desktop\u0026rcm=ACoAAChXnS8B4gqbBLUlWfwt-ck0XAv472NzT4k).\n\n## 🚀 Quick Start\n\n| Platform        | Links                                                                                     |\n| --------------- | ----------------------------------------------------------------------------------------- |\n| 🖥️ CLI          | [Quick Start](#-cli) ｜ [Docs](https://docs.nexa.ai/en/nexa-sdk-go/NexaCLI)               |\n| 🐍 Python       | [Quick Start](#-python-sdk) ｜ [Docs](https://docs.nexa.ai/en/nexa-sdk-python/overview)   |\n| 🤖 Android      | [Quick Start](#-android-sdk) ｜ 
[Docs](https://docs.nexa.ai/en/nexa-sdk-android/overview) |\n| 🐳 Linux Docker | [Quick Start](#-linux-docker) ｜ [Docs](https://docs.nexa.ai/en/nexa-sdk-docker/overview) |\n| 🍎 iOS          | [Quick Start](#-ios-sdk) ｜ [Docs](https://docs.nexa.ai/en/nexa-sdk-ios/overview)         |\n\n---\n\n### 🖥️ CLI\n\n**Download:**\n\n| Windows                                                                                                  | macOS                                                                                                   | Linux                                                                                        |\n| -------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------- |\n| [arm64 (Qualcomm NPU)](https://public-storage.nexa4ai.com/nexa_sdk/downloads/nexa-cli_windows_arm64.exe) | [arm64 (Apple Silicon)](https://public-storage.nexa4ai.com/nexa_sdk/downloads/nexa-cli_macos_arm64.pkg) | [arm64](https://github.com/NexaAI/nexa-sdk/releases/latest/download/nexa-cli_linux_arm64.sh) |\n| [x64 (Intel/AMD NPU)](https://public-storage.nexa4ai.com/nexa_sdk/downloads/nexa-cli_windows_x86_64.exe) | [x64](https://public-storage.nexa4ai.com/nexa_sdk/downloads/nexa-cli_macos_x86_64.pkg)                  | [x64](https://github.com/NexaAI/nexa-sdk/releases/latest/download/nexa-cli_linux_x86_64.sh)  |\n\n**Run your first model:**\n\n```bash\n# Chat with Qwen3\nnexa infer ggml-org/Qwen3-1.7B-GGUF\n\n# Multimodal: drag images into the CLI\nnexa infer NexaAI/Qwen3-VL-4B-Instruct-GGUF\n\n# NPU (Windows arm64 with Snapdragon X Elite)\nnexa infer NexaAI/OmniNeural-4B\n```\n\n- **Models:** LLM, Multimodal, ASR, OCR, Rerank, Object Detection, Image Generation, Embedding\n- **Formats:** GGUF, MLX, NEXA\n- **NPU Models:** [Model 
Hub](https://sdk.nexa.ai/model)\n- 📖 [CLI Reference Docs](https://docs.nexa.ai/en/nexa-sdk-go/NexaCLI)\n\n---\n\n### 🐍 Python SDK\n\n```bash\npip install nexaai\n```\n\n```python\nfrom nexaai import LLM, GenerationConfig, ModelConfig, LlmChatMessage\n\nllm = LLM.from_(model=\"NexaAI/Qwen3-0.6B-GGUF\", config=ModelConfig())\n\nconversation = [\n    LlmChatMessage(role=\"user\", content=\"Hello, tell me a joke\")\n]\nprompt = llm.apply_chat_template(conversation)\nfor token in llm.generate_stream(prompt, GenerationConfig(max_tokens=100)):\n    print(token, end=\"\", flush=True)\n```\n\n- **Models:** LLM, Multimodal, ASR, OCR, Rerank, Object Detection, Image Generation, Embedding\n- **Formats:** GGUF, MLX, NEXA\n- **NPU Models:** [Model Hub](https://sdk.nexa.ai/model)\n- 📖 [Python SDK Docs](https://docs.nexa.ai/en/nexa-sdk-python/quickstart)\n\n---\n\n### 🤖 Android SDK\n\nAdd to your `app/AndroidManifest.xml`\n\n```xml\n\u003capplication android:extractNativeLibs=\"true\"\u003e\n```\n\nAdd to your `build.gradle.kts`:\n\n```kotlin\ndependencies {\n    implementation(\"ai.nexa:core:0.0.15\")\n}\n```\n\n```kotlin\n// Initialize SDK\nNexaSdk.getInstance().init(this)\n\n// Load and run model\nVlmWrapper.builder()\n    .vlmCreateInput(VlmCreateInput(\n        model_name = \"omni-neural\",\n        model_path = \"/data/data/your.app/files/models/OmniNeural-4B/files-1-1.nexa\",\n        plugin_id = \"npu\",\n        config = ModelConfig()\n    ))\n    .build()\n    .onSuccess { vlm -\u003e\n        vlm.generateStreamFlow(\"Hello!\", GenerationConfig()).collect { print(it) }\n    }\n```\n\n- **Requirements:** Android minSdk 27, Qualcomm Snapdragon 8 Gen 4 Chip\n- **Models:** LLM, Multimodal, ASR, OCR, Rerank, Embedding\n- **NPU Models:** [Supported Models](https://docs.nexa.ai/en/nexa-sdk-android/overview#supported-models)\n- 📖 [Android SDK Docs](https://docs.nexa.ai/en/nexa-sdk-android/quickstart)\n\n---\n\n### 🐳 Linux Docker\n\n```bash\ndocker pull 
nexa4ai/nexasdk:latest\n\nexport NEXA_TOKEN=\"your_token_here\"\ndocker run --rm -it --privileged \\\n  -e NEXA_TOKEN \\\n  nexa4ai/nexasdk:latest infer NexaAI/Granite-4.0-h-350M-NPU\n```\n\n- **Requirements:** Qualcomm Dragonwing IQ9, ARM64 systems\n- **Models:** LLM, VLM, ASR, CV, Rerank, Embedding\n- **NPU Models:** [Supported Models](https://docs.nexa.ai/en/nexa-sdk-docker/overview#supported-models)\n- 📖 [Linux Docker Docs](https://docs.nexa.ai/en/nexa-sdk-docker/quickstart)\n\n---\n\n### 🍎 iOS SDK\n\nDownload [NexaSdk.xcframework](https://nexa-model-hub-bucket.s3.us-west-1.amazonaws.com/public/ios/latest/NexaSdk.xcframework.zip) and add to your Xcode project.\n\n```swift\nimport NexaSdk\n\n// Example: Speech Recognition\nlet asr = try Asr(plugin: .ane)\ntry await asr.load(from: modelURL)\n\nlet result = try await asr.transcribe(options: .init(audioPath: \"audio.wav\"))\nprint(result.asrResult.transcript)\n```\n\n- **Requirements:** iOS 17.0+ / macOS 15.0+, Swift 5.9+\n- **Models:** LLM, ASR, OCR, Rerank, Embedding\n- **ANE Models:** [Apple Neural Engine Models](https://huggingface.co/collections/NexaAI/apple-neural-engine)\n- 📖 [iOS SDK Docs](https://docs.nexa.ai/en/nexa-sdk-ios/quickstart)\n\n## ⚙️ Features \u0026 Comparisons\n\n\u003cdiv align=\"center\"\u003e\n\n| Features                                 | **NexaSDK**                                                | **Ollama** | **llama.cpp** | **LM Studio** |\n| ---------------------------------------- | ---------------------------------------------------------- | ---------- | ------------- | ------------- |\n| NPU support                              | ✅ NPU-first                                               | ❌         | ❌            | ❌            |\n| Android/iOS SDK support                  | ✅ NPU/GPU/CPU support                                     | ⚠️         | ⚠️            | ❌            |\n| Linux support (Docker image)             | ✅                                                         | ✅ 
        | ✅            | ❌            |\n| Day-0 model support in GGUF, MLX, NEXA   | ✅                                                         | ❌         | ⚠️            | ❌            |\n| Full multimodality support               | ✅ Image, Audio, Text, Embedding, Rerank, ASR, TTS         | ⚠️         | ⚠️            | ⚠️            |\n| Cross-platform support                   | ✅ Desktop, Mobile (Android, iOS), Automotive, IoT (Linux) | ⚠️         | ⚠️            | ⚠️            |\n| One line of code to run                  | ✅                                                         | ✅         | ⚠️            | ✅            |\n| OpenAI-compatible API + Function calling | ✅                                                         | ✅         | ✅            | ✅            |\n\n\u003cp align=\"center\" style=\"margin-top:14px\"\u003e\n  \u003ci\u003e\n      \u003cb\u003eLegend:\u003c/b\u003e\n      \u003cspan title=\"Full support\"\u003e✅ Supported\u003c/span\u003e \u0026nbsp; | \u0026nbsp;\n      \u003cspan title=\"Partial or limited support\"\u003e⚠️ Partial or limited support \u003c/span\u003e \u0026nbsp; | \u0026nbsp;\n      \u003cspan title=\"Not Supported\"\u003e❌ No\u003c/span\u003e\n  \u003c/i\u003e\n\u003c/p\u003e\n\u003c/div\u003e\n\n## 🙏 Acknowledgements\n\nWe would like to thank the following projects:\n\n- [ggml](https://github.com/ggml-org/ggml)\n- [mlx-lm](https://github.com/ml-explore/mlx-lm)\n- [mlx-vlm](https://github.com/Blaizzy/mlx-vlm)\n- [mlx-audio](https://github.com/Blaizzy/mlx-audio)\n\n## 📄 License\n\nNexaSDK uses a dual licensing model:\n\n### CPU/GPU Components\n\nLicensed under [Apache License 2.0](LICENSE).\n\n### NPU Components\n\n- **Personal Use**: Free license key available from [Nexa AI Model Hub](https://sdk.nexa.ai/model). 
Each key activates 1 device for NPU usage.\n- **Commercial Use**: Contact [hello@nexa.ai](mailto:hello@nexa.ai) for licensing.\n\n## 🤝 Contact \u0026 Community Support\n\n### Business Inquiries\n\nFor model launch partnerships, business inquiries, or any other questions, please schedule a call with us [here](https://nexa.ai/book-a-call).\n\n### Community \u0026 Support\n\nWant more model support, backend support, device support, or other features? We'd love to hear from you!\n\nFeel free to [submit an issue](https://github.com/NexaAI/nexa-sdk/issues) on our GitHub repository with your requests, suggestions, or feedback. Your input helps us prioritize what to build next.\n\nJoin our community:\n\n- [Discord](https://discord.gg/thRu2HaK4D)\n- [Slack](https://join.slack.com/t/nexaai/shared_invite/zt-30a8yfv8k-1JqAXv~OjKJKLqvbKqHJxA)\n- **[Nexa Wishlist](https://sdk.nexa.ai/wishlist)** — Request and vote for the models you want to run on-device.\n\n## 🏆 Nexa × Qualcomm On-Device Bounty Program\n\nRound 1: Build a working Android AI app that runs fully on-device on Qualcomm Hexagon NPU with NexaSDK.\n\nTimeline (PT): Jan 15 → Feb 15\n\nPrizes: $6,500 cash prize, Qualcomm official spotlight, flagship Snapdragon device, expert mentorship, and more\n\n👉 Join \u0026 details: [https://sdk.nexa.ai/bounty](https://sdk.nexa.ai/bounty)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnexaai%2Fnexa-sdk","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnexaai%2Fnexa-sdk","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnexaai%2Fnexa-sdk/lists"}