{"id":45277742,"url":"https://github.com/runpod/flash-examples","last_synced_at":"2026-03-11T00:11:13.570Z","repository":{"id":336365481,"uuid":"1096729116","full_name":"runpod/flash-examples","owner":"runpod","description":null,"archived":false,"fork":false,"pushed_at":"2026-02-15T04:04:26.000Z","size":1358,"stargazers_count":2,"open_issues_count":5,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-02-15T11:18:19.899Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/runpod.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-11-14T21:28:48.000Z","updated_at":"2026-02-14T07:10:40.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/runpod/flash-examples","commit_stats":null,"previous_names":["runpod/flash-examples"],"tags_count":3,"template":false,"template_full_name":null,"purl":"pkg:github/runpod/flash-examples","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/runpod%2Fflash-examples","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/runpod%2Fflash-examples/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/runpod%2Fflash-examples/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/runpod%2Fflash-examples/manifests","owner_url":"https://repos.ecosyste.
ms/api/v1/hosts/GitHub/owners/runpod","download_url":"https://codeload.github.com/runpod/flash-examples/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/runpod%2Fflash-examples/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":29671513,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-02-21T00:11:43.526Z","status":"online","status_checked_at":"2026-02-21T02:00:07.432Z","response_time":107,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-02-21T02:00:41.352Z","updated_at":"2026-03-11T00:11:13.557Z","avatar_url":"https://github.com/runpod.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Runpod Flash Examples\n\nA collection of example applications showcasing Runpod Flash - a framework for building production-ready AI applications with distributed GPU and CPU computing.\n\n## What is Flash?\n\nFlash is a Python framework that lets you run functions on Runpod's Serverless infrastructure with a single decorator. 
Write code locally, deploy globally—Flash handles provisioning, scaling, and routing automatically.\n\n```python\nfrom runpod_flash import Endpoint, GpuType\n\n@Endpoint(name=\"image-gen\", gpu=GpuType.NVIDIA_GEFORCE_RTX_4090, dependencies=[\"torch\", \"diffusers\"])\nasync def generate_image(prompt: str) -\u003e bytes:\n    # This runs on a cloud GPU, not your laptop\n    ...\n```\n\n**Key features:**\n- **`@Endpoint` decorator**: Mark any async function to run on serverless infrastructure\n- **Auto-scaling**: Scale to zero when idle, scale up under load\n- **Local development**: `flash run` starts a local server with hot reload\n- **One-command deploy**: `flash deploy` packages and ships your code\n\n## Prerequisites\n\n- **Python 3.10+**\n- **uv**: Install with `curl -LsSf https://astral.sh/uv/install.sh | sh`\n- **Runpod account**: [Sign up here](https://runpod.io/console/signup)\n\n### Python version in deployed workers\n\nYour local Python version does not affect what runs in the cloud. `flash build` downloads wheels for the container's Python version automatically.\n\n- **GPU workers**: Python 3.12 only. The GPU base image ships multiple interpreters (3.9-3.14) for interactive pod use, but torch and CUDA libraries are installed only for 3.12.\n- **CPU workers**: Python 3.10, 3.11, or 3.12. 
Configurable via `PYTHON_VERSION` build arg.\n\n## Quick Start\n\n```bash\n# Clone and install\ngit clone https://github.com/runpod/flash-examples.git\ncd flash-examples\nuv sync \u0026\u0026 uv pip install -e .\n\n# Authenticate with Runpod\nuv run flash login\n\n# Run all examples locally\nuv run flash run\n```\n\nOpen **http://localhost:8888/docs** to explore all endpoints.\n\n\u003e **Using pip, poetry, or conda?** See [DEVELOPMENT.md](./DEVELOPMENT.md) for alternative setups.\n\n## Examples\n\n| Category | Example | Description |\n|----------|---------|-------------|\n| **Getting Started** | [01_hello_world](./01_getting_started/01_hello_world/) | Basic GPU worker |\n| | [02_cpu_worker](./01_getting_started/02_cpu_worker/) | CPU-only worker |\n| | [03_mixed_workers](./01_getting_started/03_mixed_workers/) | GPU + CPU pipeline |\n| | [04_dependencies](./01_getting_started/04_dependencies/) | Dependency management |\n| **ML Inference** | [01_text_to_speech](./02_ml_inference/01_text_to_speech/) | Qwen3-TTS model serving |\n| **Advanced** | [05_load_balancer](./03_advanced_workers/05_load_balancer/) | HTTP routing with load balancer |\n| **Scaling** | [01_autoscaling](./04_scaling_performance/01_autoscaling/) | Worker autoscaling configuration |\n| **Data** | [01_network_volumes](./05_data_workflows/01_network_volumes/) | Persistent storage with network volumes |\n\nMore examples coming soon in each category.\n\n## CLI Commands\n\n```bash\nflash login              # Authenticate with Runpod (opens browser)\nflash run                # Run development server (localhost:8888)\nflash build              # Build deployment package\nflash deploy --env \u003cname\u003e # Build and deploy to environment\nflash undeploy \u003cname\u003e    # Delete deployed endpoint\n```\n\nSee **[CLI-REFERENCE.md](./CLI-REFERENCE.md)** for complete documentation.\n\n## Key Concepts\n\n### Endpoint\n\nThe `Endpoint` class configures functions for execution on Runpod's serverless 
infrastructure:\n\n**Queue-based (one function = one endpoint):**\n\n```python\nfrom runpod_flash import Endpoint, GpuType\n\n@Endpoint(name=\"my-worker\", gpu=GpuType.NVIDIA_GEFORCE_RTX_4090, workers=(0, 3), dependencies=[\"torch\"])\nasync def process(data: dict) -\u003e dict:\n    import torch\n    # This code runs on Runpod GPUs\n    return {\"result\": \"processed\"}\n```\n\n**Load-balanced (multiple routes, shared workers):**\n\n```python\nfrom runpod_flash import Endpoint\n\napi = Endpoint(name=\"my-api\", cpu=\"cpu3c-1-2\", workers=(1, 3))\n\n@api.get(\"/health\")\nasync def health():\n    return {\"status\": \"ok\"}\n\n@api.post(\"/compute\")\nasync def compute(data: dict) -\u003e dict:\n    return {\"result\": data}\n```\n\n**Client mode (connect to an existing endpoint):**\n\n```python\nfrom runpod_flash import Endpoint\n\n# Top-level await is shown for brevity; run this inside an async function\nep = Endpoint(id=\"ep-abc123\")\njob = await ep.run({\"prompt\": \"hello\"})\nawait job.wait()\nprint(job.output)\n```\n\n### Resource Types\n\n**GPU Workers** (`gpu=`):\n\n| Type | GPU (VRAM) |\n|------|------------|\n| `GpuType.NVIDIA_GEFORCE_RTX_4090` | RTX 4090 (24GB) |\n| `GpuType.NVIDIA_RTX_6000_ADA_GENERATION` | RTX 6000 Ada (48GB) |\n| `GpuType.NVIDIA_A100_80GB_PCIe` | A100 (80GB) |\n\n**CPU Workers** (`cpu=`):\n\n| Type | Specs |\n|------|-------|\n| `cpu3g-2-8` | 2 vCPU, 8GB RAM |\n| `cpu3c-4-8` | 4 vCPU, 8GB RAM (Compute) |\n| `cpu5c-4-16` | 4 vCPU, 16GB RAM (Latest) |\n\n### Auto-Scaling\n\nWorkers automatically scale based on demand:\n- `workers=(0, 3)` - Scale from 0 to 3 workers (cost-efficient)\n- `workers=(1, 5)` - Keep 1 warm, scale up to 5\n- `idle_timeout=5` - Seconds a worker stays idle before scaling down\n\n## Resources\n\n- [Flash documentation](https://docs.runpod.io/flash/overview)\n- [Community Discord](https://discord.gg/runpod)\n\n## Contributing\n\nSee [CONTRIBUTING.md](./CONTRIBUTING.md) for contribution guidelines and [DEVELOPMENT.md](./DEVELOPMENT.md) for development setup.\n\n## License\n\nMIT License - see [LICENSE](./LICENSE) 
for details.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frunpod%2Fflash-examples","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frunpod%2Fflash-examples","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frunpod%2Fflash-examples/lists"}