{"id":31642444,"url":"https://github.com/tameronline/repo-fastapi","last_synced_at":"2026-04-20T06:02:14.565Z","repository":{"id":314506607,"uuid":"1055776934","full_name":"TamerOnLine/repo-fastapi","owner":"TamerOnLine","description":"GPU-Ready FastAPI AI Inference Server with plugin system (CUDA/CPU/MPS/ROCm)","archived":false,"fork":false,"pushed_at":"2025-09-15T02:31:03.000Z","size":140,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-10-21T14:50:27.734Z","etag":null,"topics":["ai-server","cuda","deep-learning","fastapi","inference","mps","nlp","plugins","pytorch","rocm"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/TamerOnLine.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-09-12T19:37:21.000Z","updated_at":"2025-09-15T02:31:06.000Z","dependencies_parsed_at":"2025-09-12T22:42:28.328Z","dependency_job_id":null,"html_url":"https://github.com/TamerOnLine/repo-fastapi","commit_stats":null,"previous_names":["tameronline/repo-fastapi"],"tags_count":1,"template":true,"template_full_name":null,"purl":"pkg:github/TamerOnLine/repo-fastapi","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TamerOnLine%2Frepo-fastapi","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TamerOnLine%2Frepo-fastapi/tags","releases_url":"https://repos.ecosyste.ms/api/v1/ho
sts/GitHub/repositories/TamerOnLine%2Frepo-fastapi/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TamerOnLine%2Frepo-fastapi/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/TamerOnLine","download_url":"https://codeload.github.com/TamerOnLine/repo-fastapi/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/TamerOnLine%2Frepo-fastapi/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32035276,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-20T00:18:06.643Z","status":"online","status_checked_at":"2026-04-20T02:00:06.527Z","response_time":94,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["ai-server","cuda","deep-learning","fastapi","inference","mps","nlp","plugins","pytorch","rocm"],"created_at":"2025-10-07T03:59:28.324Z","updated_at":"2026-04-20T06:02:14.532Z","avatar_url":"https://github.com/TamerOnLine.png","language":"Python","funding_links":["https://paypal.me/tameronline"],"categories":[],"sub_categories":[],"readme":"# 🚀 NeuroServe — GPU-Ready FastAPI AI Server\n\n## 📊 Project Status\n\n| Category      | Badges |\n|---------------|--------|\n| **Languages** | ![Python](https://img.shields.io/badge/Python-3.12%2B-blue) ![HTML](https://img.shields.io/badge/HTML-5-orange) ![CSS](https://img.shields.io/badge/CSS-3-blueviolet) |\n| **Framework** | 
![FastAPI](https://img.shields.io/badge/FastAPI-0.116.x-009688) |\n| **ML / GPU**  | ![PyTorch](https://img.shields.io/badge/PyTorch-2.6.x-ee4c2c) ![CUDA Ready](https://img.shields.io/badge/CUDA-Ready-76B900?logo=nvidia\u0026logoColor=white) |\n| **CI**        | [![Ubuntu CI](https://github.com/TamerOnLine/repo-fastapi/actions/workflows/ci-ubuntu.yml/badge.svg)](https://github.com/TamerOnLine/repo-fastapi/actions/workflows/ci-ubuntu.yml) [![Windows CI](https://github.com/TamerOnLine/repo-fastapi/actions/workflows/ci-windows.yml/badge.svg)](https://github.com/TamerOnLine/repo-fastapi/actions/workflows/ci-windows.yml) [![Windows GPU CI](https://github.com/TamerOnLine/repo-fastapi/actions/workflows/ci-gpu.yml/badge.svg)](https://github.com/TamerOnLine/repo-fastapi/actions/workflows/ci-gpu.yml) [![macOS CI](https://github.com/TamerOnLine/repo-fastapi/actions/workflows/ci-macos.yml/badge.svg)](https://github.com/TamerOnLine/repo-fastapi/actions/workflows/ci-macos.yml) |\n| **License**   | ![License](https://img.shields.io/badge/License-MIT-green) |\n| **Support**   | [![Sponsor](https://img.shields.io/badge/Sponsor-💖-pink)](https://paypal.me/tameronline) |\n\n\n---\n\n## 📖 Overview\n\n**NeuroServe** is an **AI Inference Server** built on **FastAPI**, designed to run seamlessly on **GPU (CUDA/ROCm)**, **CPU**, and **macOS MPS**.  
\nIt provides ready-to-use REST APIs, a modular **plugin system**, runtime utilities, and a unified response format. Together these give AI-powered services a ready base to build on.\n\n---\n\n## ✨ Key Features\n\n- 🌐 **REST APIs out-of-the-box** with Swagger UI (`/docs`) \u0026 ReDoc (`/redoc`).\n- ⚡ **PyTorch integration** with automatic device selection (`cuda`, `cpu`, `mps`, `rocm`).\n- 🔌 **Plugin system** to extend functionality with custom AI models or services.\n- 📊 **Runtime tools** for GPU info, warm-up routines, and environment inspection.\n- 🧠 **Built-in utilities** like a toy model and model size calculator.\n- 🧱 **Unified JSON responses** for predictable API behavior.\n- 🧪 **Cross-platform CI/CD** (Ubuntu, Windows, macOS, self-hosted GPU).\n\n---\n\n## 📂 Project Structure\n\n```text\ngpu-server/\n├── app/                 # Main application code\n│   ├── core/            # Config, logging, error handling\n│   ├── routes/          # API routes (auth, inference, plugins, uploads)\n│   ├── plugins/         # Plugin system (dummy, neu_server, base, loader)\n│   ├── utils/           # Unified responses\n│   ├── static/          # Static assets (CSS, favicon)\n│   ├── templates/       # HTML templates\n│   ├── main.py          # FastAPI entrypoint\n│   ├── runtime.py       # Device/GPU management\n│   └── toy_model.py     # Example PyTorch model\n├── scripts/             # Install torch, prefetch models, test API\n├── tests/               # Unit \u0026 integration tests\n├── models_cache/        # Model cache (Hugging Face / Torch)\n├── docs/                # Documentation \u0026 model licenses\n├── logs/                # Errors \u0026 plugin logs\n└── ...\n```\n\n---\n\n## ⚙️ Installation\n\n### 1. Clone the repository\n```bash\ngit clone https://github.com/USERNAME/gpu-server.git\ncd gpu-server\n```\n\n### 2. 
Create a virtual environment\n```bash\npython -m venv .venv\n# Linux/macOS\nsource .venv/bin/activate\n# Windows\n.venv\\Scripts\\activate\n```\n\n### 3. Install dependencies\n```bash\npip install -r requirements.txt\n```\n\n### 4. (Optional) Auto-install PyTorch\n```bash\npython -m scripts.install_torch --gpu    # or --cpu / --rocm\n```\n\n---\n\n## 🚀 Running the Server\n\n```bash\nuvicorn app.main:app --reload --host 0.0.0.0 --port 8000\n```\n\nAvailable endpoints:\n- 🏠 **Home** → [http://localhost:8000/](http://localhost:8000/)  \n- ❤️ **Health** → [http://localhost:8000/health](http://localhost:8000/health)  \n- 📚 **Swagger UI** → [http://localhost:8000/docs](http://localhost:8000/docs)  \n- 📘 **ReDoc** → [http://localhost:8000/redoc](http://localhost:8000/redoc)  \n- 🧭 **Env Summary** → [http://localhost:8000/env](http://localhost:8000/env)  \n- 🔌 **Plugins** → [http://localhost:8000/plugins](http://localhost:8000/plugins)  \n\nQuick test:\n```bash\ncurl http://localhost:8000/health\n# {\"status\": \"ok\"}\n```\n\n---\n\n## 🔌 Plugin System\n\nEach plugin lives in `app/plugins/\u003cname\u003e/` and typically includes:\n\n```\nmanifest.json\nplugin.py        # Defines Plugin class inheriting AIPlugin\nREADME.md        # Documentation\n```\n\nAPI Endpoints:\n- `GET /plugins` — list all plugins with metadata.  \n- `POST /plugins/{name}/{task}` — execute a task inside a plugin.  
\n\nExample:\n```python\nfrom app.plugins.base import AIPlugin\n\nclass Plugin(AIPlugin):\n    name = \"my_plugin\"\n    tasks = [\"infer\"]\n\n    def load(self):\n        # Load models/resources once\n        ...\n\n    def infer(self, payload: dict) -\u003e dict:\n        return {\"message\": \"ok\", \"payload\": payload}\n```\n\n---\n\n## 🧪 Development\n\nInstall dev dependencies:\n```bash\npip install -r requirements-dev.txt\npre-commit install\n```\n\nRun tests:\n```bash\npytest\n```\n\nRuff (lint + format check) runs automatically via pre-commit hooks.\n\n---\n\n## 📦 Model Management\n\nDownload models in advance:\n```bash\npython -m scripts.prefetch_models\n```\n\nModels are cached in `models_cache/` (see `docs/LICENSES.md` for licenses).\n\n---\n\n## 🏭 Deployment Notes\n\n- Use `uvicorn`/`hypercorn` behind a reverse proxy (e.g., Nginx).  \n- Configure environment with `APP_*` variables instead of hardcoding.  \n- Enable HTTPS and configure CORS carefully in production.  \n\n---\n\n## 🗺️ Roadmap\n\n- [ ] Add `/cuda` endpoint → return detailed CUDA info.  \n- [ ] Add `/warmup` endpoint for GPU readiness.  \n- [ ] Provide a **plugin generator CLI**.  \n- [ ] Implement API Key / JWT authentication.  \n- [ ] Example plugins: translation, summarization, image classification.  \n- [ ] Docker support for one-click deployment.  \n- [ ] Benchmark suite for model inference speed.  \n\n---\n\n## 🤝 Contributing\n\nContributions are welcome!  \n- Open **Issues** for bugs or ideas.  \n- Submit **Pull Requests** for improvements.  \n- Follow style guidelines (Ruff + pre-commit).  \n\n---\n\n## 📜 License\n\nLicensed under the **MIT License** — see [LICENSE](./LICENSE).  
\n⚠️ AI models may have their own licenses (see `docs/LICENSES.md`).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftameronline%2Frepo-fastapi","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftameronline%2Frepo-fastapi","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftameronline%2Frepo-fastapi/lists"}