{"id":50897793,"url":"https://github.com/autonomi-ai/nos","last_synced_at":"2026-07-03T16:01:28.146Z","repository":{"id":204577187,"uuid":"628749964","full_name":"autonomi-ai/nos","owner":"autonomi-ai","description":"⚡️ A fast and flexible PyTorch inference server that runs locally, on any cloud or AI HW. ","archived":false,"fork":false,"pushed_at":"2024-06-08T19:22:47.000Z","size":17256,"stargazers_count":147,"open_issues_count":60,"forks_count":12,"subscribers_count":1,"default_branch":"main","last_synced_at":"2026-06-08T14:04:06.542Z","etag":null,"topics":["computer-vision","generative-ai","inference","inference-acceleration","llm-inference","machine-learning"],"latest_commit_sha":null,"homepage":"https://docs.nos.run/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/autonomi-ai.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"docs/CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":"docs/support.md","governance":null,"roadmap":"docs/roadmap.md","authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-04-16T22:20:05.000Z","updated_at":"2026-01-19T10:02:32.000Z","dependencies_parsed_at":"2023-11-13T16:47:33.884Z","dependency_job_id":"221609ca-957d-4578-abb6-746b104760ef","html_url":"https://github.com/autonomi-ai/nos","commit_stats":null,"previous_names":["autonomi-ai/nos"],"tags_count":23,"template":false,"template_full_name":null,"purl":"pkg:github/autonomi-ai/nos","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/autonomi-ai%2Fnos","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/autonomi-ai%2Fnos/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/au
tonomi-ai%2Fnos/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/autonomi-ai%2Fnos/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/autonomi-ai","download_url":"https://codeload.github.com/autonomi-ai/nos/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/autonomi-ai%2Fnos/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":35092185,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-07-03T02:00:05.635Z","response_time":110,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["computer-vision","generative-ai","inference","inference-acceleration","llm-inference","machine-learning"],"created_at":"2026-06-16T01:31:30.079Z","updated_at":"2026-07-03T16:01:28.128Z","avatar_url":"https://github.com/autonomi-ai.png","language":"Python","funding_links":[],"categories":["computer-vision"],"sub_categories":[],"readme":"\u003ccenter\u003e\u003cimg src=\"./docs/assets/nos-header.svg\" alt=\"Nitro Boost for your AI Infrastructure\"\u003e\u003c/center\u003e\n\u003cp\u003e\u003c/p\u003e\n\u003cp align=\"center\"\u003e\n\u003ca href=\"https://docs.nos.run/\"\u003e\u003cb\u003eWebsite\u003c/b\u003e\u003c/a\u003e | \u003ca href=\"https://docs.nos.run/\"\u003e\u003cb\u003eDocs\u003c/b\u003e\u003c/a\u003e | \u003ca 
href=\"https://github.com/autonomi-ai/nos/tree/main/examples/tutorials\"\u003e\u003cb\u003eTutorials\u003c/b\u003e\u003c/a\u003e | \u003ca href=\"https://github.com/autonomi-ai/nos-playground\"\u003e\u003cb\u003ePlayground\u003c/b\u003e\u003c/a\u003e | \u003ca href=\"https://docs.nos.run/docs/blog\"\u003e\u003cb\u003eBlog\u003c/b\u003e\u003c/a\u003e | \u003ca href=\"https://discord.gg/QAGgvTuvgg\"\u003e\u003cb\u003eDiscord\u003c/b\u003e\u003c/a\u003e\n\u003c/p\u003e\n\u003cp align=\"center\"\u003e\n\u003ca href=\"https://pypi.org/project/torch-nos/\"\u003e\u003cimg alt=\"PyPI Version\" src=\"https://badge.fury.io/py/torch-nos.svg\"\u003e\u003c/a\u003e\n\u003ca href=\"https://pypi.org/project/torch-nos/\"\u003e\u003cimg alt=\"Python Versions\" src=\"https://img.shields.io/pypi/pyversions/torch-nos\"\u003e\u003c/a\u003e\n\u003ca href=\"https://www.pepy.tech/projects/torch-nos\"\u003e\u003cimg alt=\"PyPI Downloads\" src=\"https://img.shields.io/pypi/dm/torch-nos\"\u003e\u003c/a\u003e\n\u003ca href=\"https://hub.docker.com/repository/docker/autonomi/nos/general\"\u003e\u003cimg alt=\"Docker Pulls\" src=\"https://img.shields.io/docker/pulls/autonomi/nos.svg\"\u003e\u003c/a\u003e\u003cbr\u003e\n\u003ca href=\"https://github.com/autonomi-ai/nos/blob/main/LICENSE\"\u003e\u003cimg alt=\"License\" src=\"https://img.shields.io/github/license/autonomi-ai/nos.svg\"\u003e\u003c/a\u003e\n\u003ca href=\"https://discord.gg/QAGgvTuvgg\"\u003e\u003cimg alt=\"Discord\" src=\"https://img.shields.io/badge/discord-chat-purple?color=%235765F2\u0026label=discord\u0026logo=discord\"\u003e\u003c/a\u003e\n\u003ca href=\"https://twitter.com/autonomi_ai\"\u003e\u003cimg alt=\"Twitter Follow\" src=\"https://img.shields.io/twitter/follow/autonomi_ai.svg?style=social\u0026logo=twitter\"\u003e\u003c/a\u003e\n\u003c/p\u003e\n\n**NOS** is a fast and flexible PyTorch inference server that runs on any cloud or AI HW.\n\n## 🛠️ Key Features\n\n- 👩‍💻 **Easy-to-use**: Built for 
[PyTorch](https://pytorch.org/) and designed to optimize, serve and auto-scale PyTorch models in production without compromising on developer experience.\n- 🥷 **Multi-modal \u0026 Multi-model**: Serve multiple foundational AI models ([LLMs](https://github.com/autonomi-ai/nos/blob/main/nos/models/llm.py), [Diffusion](https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0), [Embeddings](https://github.com/autonomi-ai/nos/blob/main/nos/models/clip.py), [Speech-to-Text](https://github.com/autonomi-ai/nos/blob/main/nos/models/clip.py) and [Object Detection](https://github.com/autonomi-ai/nos/blob/main/nos/models/yolox.py)) simultaneously, in a single server.\n- ⚙️ **HW-aware Runtime:** Deploy PyTorch models effortlessly on modern AI accelerators (NVIDIA GPUs, AWS Inferentia2, AMD - coming soon, and even CPUs).\n- ☁️ **Cloud-agnostic Containers:** Run on any cloud (AWS, GCP, Azure, Lambda Labs, On-Prem) with our ready-to-use inference server containers.\n\n## 🔥 What's New\n\n* **[Feb 2024]** ✍️ [blog] [Introducing the NOS Inferentia2 (`inf2`) runtime](https://docs.nos.run/docs/blog/introducing-the-nos-inferentia2-runtime.html).\n* **[Jan 2024]** ✍️ [blog] [Serving LLMs on a budget](https://docs.nos.run/docs/blog/serving-llms-on-a-budget.html) with [SkyServe](https://skypilot.readthedocs.io/en/latest/serving/sky-serve.html).\n* **[Jan 2024]** 📚 [docs] [NOS x SkyPilot Integration](https://docs.nos.run/docs/integrations/skypilot.html) page!\n* **[Jan 2024]** ✍️ [blog] [Getting started with NOS tutorials](https://docs.nos.run/docs/blog/-getting-started-with-nos-tutorials.html) is available [here](./examples/tutorials/)!\n* **[Dec 2023]** 🛝 [repo] We open-sourced the [NOS playground](https://github.com/autonomi-ai/nos-playground) to help you get started with more examples built on NOS!\n\n## 🚀 Quickstart\n\nWe highly recommend that you go to our [quickstart guide](https://docs.nos.run/docs/quickstart.html) to get started. 
To install the NOS client, you can run the following command:\n\n```bash\nconda create -n nos python=3.8 -y\nconda activate nos\npip install torch-nos\n```\n\nOnce the client is installed, you can start the NOS server via the NOS `serve` CLI. This will automatically detect your local environment, download the docker runtime image and spin up the NOS server:\n\n```bash\nnos serve up --http --logging-level INFO\n```\n\nYou are now ready to run your [first inference request](#👩‍💻-what-can-nos-do) with NOS! You can run any of the following commands to try things out. You can set the logging level to `DEBUG` if you want more detailed information from the server.\n\n## 👩‍💻 **What can NOS do?**\n\n### 💬 Chat / LLM Agents (ChatGPT-as-a-Service)\n---\nNOS provides an OpenAI-compatible server with streaming support so that you can connect your favorite OpenAI-compatible LLM client to talk to NOS.\n\n\u003cimg src=\"docs/assets/llama_nos.gif\" width=\"400\"\u003e\n\n\u003cbr\u003e\n\u003cdetails\u003e\n\u003csummary\u003e API / Usage\u003c/summary\u003e\n\u003cbr\u003e\n\n\u003cb\u003egRPC API ⚡\u003c/b\u003e\n```python\nfrom nos.client import Client\n\nclient = Client()\n\nmodel = client.Module(\"TinyLlama/TinyLlama-1.1B-Chat-v1.0\")\nresponse = model.chat(message=\"Tell me a story of 1000 words with emojis\", _stream=True)\n```\n\n\u003cb\u003eREST API\u003c/b\u003e\n```bash\ncurl \\\n-X POST http://localhost:8000/v1/chat/completions \\\n-H \"Content-Type: application/json\" \\\n-d '{\n    \"model\": \"TinyLlama/TinyLlama-1.1B-Chat-v1.0\",\n    \"messages\": [{\n        \"role\": \"user\",\n        \"content\": \"Tell me a story of 1000 words with emojis\"\n    }],\n    \"temperature\": 0.7,\n    \"stream\": true\n  }'\n```\n\n\u003c/details\u003e\n\n### 🏞️ Image Generation (Stable-Diffusion-as-a-Service)\n---\nBuild MidJourney discord bots in seconds.\n\n\u003cimg src=\"docs/assets/hippo_with_glasses_sdxl.jpg\" 
width=\"400\"\u003e\n\n\u003cbr\u003e\n\u003cdetails\u003e\n\u003csummary\u003e API / Usage\u003c/summary\u003e\n\u003cbr\u003e\n\n\u003cb\u003egRPC API ⚡\u003c/b\u003e\n\n```python\nfrom nos.client import Client\n\nclient = Client()\n\nsdxl = client.Module(\"stabilityai/stable-diffusion-xl-base-1-0\")\nimage, = sdxl(prompts=[\"hippo with glasses in a library, cartoon styling\"],\n              width=1024, height=1024, num_images=1)\n```\n\n\u003cb\u003eREST API\u003c/b\u003e\n\n```bash\ncurl \\\n-X POST http://localhost:8000/v1/infer \\\n-H 'Content-Type: application/json' \\\n-d '{\n    \"model_id\": \"stabilityai/stable-diffusion-xl-base-1-0\",\n    \"inputs\": {\n        \"prompts\": [\"hippo with glasses in a library, cartoon styling\"],\n        \"width\": 1024, \"height\": 1024,\n        \"num_images\": 1\n    }\n}'\n```\n\n\u003c/details\u003e\n\n### 🧠 Text \u0026 Image Embedding (CLIP-as-a-Service)\n---\nBuild [scalable semantic search of images/videos](https://docs.nos.run/docs/demos/video-search.html) in minutes.\n\n\u003cimg src=\"docs/assets/embedding.png\" width=\"400\"\u003e\n\n\u003cbr\u003e\n\u003cdetails\u003e\n\u003csummary\u003e API / Usage\u003c/summary\u003e\n\u003cbr\u003e\n\n\u003cb\u003egRPC API ⚡\u003c/b\u003e\n\n```python\nfrom nos.client import Client\n\nclient = Client()\n\nclip = client.Module(\"openai/clip-vit-base-patch32\")\ntxt_vec = clip.encode_text(texts=[\"fox jumped over the moon\"])\n```\n\n\u003cb\u003eREST API\u003c/b\u003e\n\n```bash\ncurl \\\n-X POST http://localhost:8000/v1/infer \\\n-H 'Content-Type: application/json' \\\n-d '{\n    \"model_id\": \"openai/clip-vit-base-patch32\",\n    \"method\": \"encode_text\",\n    \"inputs\": {\n        \"texts\": [\"fox jumped over the moon\"]\n    }\n}'\n```\n\n\u003c/details\u003e\n\n\n### 🎙️ Audio Transcription (Whisper-as-a-Service)\n---\nPerform [real-time audio transcription](./examples/tutorials/04-serving-multiple-models/) using Whisper.\n\n\u003cimg 
src=\"docs/assets/transcription.png\" width=\"400\"\u003e\n\n\u003cbr\u003e\n\u003cdetails\u003e\n\u003csummary\u003e API / Usage\u003c/summary\u003e\n\u003cbr\u003e\n\n\u003cb\u003egRPC API ⚡\u003c/b\u003e\n\n```python\nfrom pathlib import Path\nfrom nos.client import Client\n\nclient = Client()\n\nmodel = client.Module(\"openai/whisper-small.en\")\nwith client.UploadFile(Path(\"audio.wav\")) as remote_path:\n  response = model(path=remote_path)\n# {\"chunks\": ...}\n```\n\n\u003cb\u003eREST API\u003c/b\u003e\n\n```bash\ncurl \\\n-X POST http://localhost:8000/v1/infer/file \\\n-H 'accept: application/json' \\\n-H 'Content-Type: multipart/form-data' \\\n-F 'model_id=openai/whisper-small.en' \\\n-F 'file=@audio.wav'\n```\n\n\u003c/details\u003e\n\n### 🧐 Object Detection (YOLOX-as-a-Service)\n---\nRun classical computer-vision tasks in 2 lines of code.\n\n\u003cimg src=\"docs/assets/bench_park_detections.png\" width=\"400\"\u003e\n\n\u003cbr\u003e\n\u003cdetails\u003e\n\u003csummary\u003e API / Usage\u003c/summary\u003e\n\u003cbr\u003e\n\n\u003cb\u003egRPC API ⚡\u003c/b\u003e\n\n```python\nfrom PIL import Image\nfrom nos.client import Client\n\nclient = Client()\n\nmodel = client.Module(\"yolox/medium\")\nresponse = model(images=[Image.open(\"image.jpg\")])\n```\n\n\u003cb\u003eREST API\u003c/b\u003e\n\n```bash\ncurl \\\n-X POST http://localhost:8000/v1/infer/file \\\n-H 'accept: application/json' \\\n-H 'Content-Type: multipart/form-data' \\\n-F 'model_id=yolox/medium' \\\n-F 'file=@image.jpg'\n```\n\n\u003c/details\u003e\n\n### ⚒️ Custom models\n---\nWant to run models not supported by NOS? You can easily add your own models following the examples in the [NOS Playground](https://github.com/autonomi-ai/nos-playground/tree/main/examples).\n\n## 📄 License\n\nThis project is licensed under the [Apache-2.0 License](LICENSE).\n\n## 📡 Telemetry\n\nNOS collects anonymous usage data using [Sentry](https://sentry.io/). 
This is used to help us understand how the community is using NOS and to help us prioritize features. You can opt out of telemetry by setting `NOS_TELEMETRY_ENABLED=0`.\n\n## 🤝 Contributing\nWe welcome contributions! Please see our [contributing guide](CONTRIBUTING.md) for more information.\n\n## 🔗  Quick Links\n\n* 💬 Send us an email at [support@autonomi.ai](mailto:support@autonomi.ai) or join our [Discord](https://discord.gg/QAGgvTuvgg) for help.\n* 📣 Follow us on [Twitter](https://twitter.com/autonomi\\_ai) and [LinkedIn](https://www.linkedin.com/company/autonomi-ai) to keep up-to-date on our products.\n\n\u003cbr\u003e\n\u003cstyle\u003e .md-typeset h1, .md-content__button { display: none; } \u003c/style\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fautonomi-ai%2Fnos","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fautonomi-ai%2Fnos","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fautonomi-ai%2Fnos/lists"}