https://github.com/nebari-dev/nebari-rayserve-pack
- Host: GitHub
- URL: https://github.com/nebari-dev/nebari-rayserve-pack
- Owner: nebari-dev
- Created: 2026-03-17T13:11:06.000Z (3 months ago)
- Default Branch: main
- Last Pushed: 2026-04-14T20:50:22.000Z (3 months ago)
- Last Synced: 2026-04-14T21:10:12.807Z (3 months ago)
- Language: Makefile
- Size: 27.3 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 3
Metadata Files:
- Readme: README.md
# Nebari Ray Serve Software Pack
A [Nebari Software Pack](https://github.com/nebari-dev/nebari-software-pack-template) that deploys [Ray Serve](https://docs.ray.io/en/latest/serve/index.html) on Kubernetes using the [RayService CRD](https://docs.ray.io/en/latest/serve/production-guide/kubernetes.html), with optional routing, TLS, and OIDC authentication via the [nebari-operator](https://github.com/nebari-dev/nebari-operator).
## Overview
This pack deploys a production-ready Ray Serve instance using the RayService CRD (the [recommended approach](https://docs.ray.io/en/latest/serve/production-guide/kubernetes.html) for running Ray Serve on Kubernetes).
**What gets deployed:**
- KubeRay operator (manages Ray cluster and Serve lifecycle)
- RayService (Ray cluster + Serve proxy, pre-initialized with `host: 0.0.0.0`)
- Stable Kubernetes Services for the dashboard and serve endpoint
- NebariApp resources for external access via Envoy Gateway (optional)
**Two access patterns:**
| Access from | Path | Auth required? |
|-------------|------|----------------|
| Jupyter notebook (in-cluster) | Direct to K8s service | No |
| Browser / external client | Envoy Gateway via NebariApp | Yes (if enabled) |
## Prerequisites
- [kubectl](https://kubernetes.io/docs/tasks/tools/)
- [Helm 3](https://helm.sh/docs/intro/install/)
- A Kubernetes cluster (or [kind](https://kind.sigs.k8s.io/) for local dev)
## Quick Start
### Standalone (no Nebari)
```bash
cd chart
helm dependency update .
helm install rayserve . --create-namespace -n rayserve --wait --timeout 5m
```
Access via port-forward:
```bash
# Ray Dashboard
kubectl port-forward svc/rayserve-nebari-rayserve-head-svc 8265:8265 -n rayserve
# Ray Serve endpoint
kubectl port-forward svc/rayserve-nebari-rayserve-serve-svc 8000:8000 -n rayserve
```
### On a Nebari cluster (via ArgoCD)
Add this to your GitOps repo as `apps/rayserve-pack.yaml`:
```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: rayserve-pack
  namespace: argocd
  annotations:
    argocd.argoproj.io/sync-wave: "7"
  finalizers:
    - resources-finalizer.argocd.argoproj.io
spec:
  project: default
  source:
    repoURL: https://github.com/nebari-dev/nebari-rayserve-pack.git
    targetRevision: main
    path: chart
    helm:
      releaseName: rayserve
      values: |
        nebariapp:
          enabled: true
          serve:
            enabled: false  # Keep serve endpoint internal-only
          dashboard:
            enabled: true
            hostname: ray-dashboard.example.com
          auth:
            enabled: true
            provider: keycloak
            provisionClient: true
            redirectURI: /oauth2/callback
  destination:
    server: https://kubernetes.default.svc
    namespace: rayserve
  syncPolicy:
    automated:
      prune: true
      selfHeal: true
    managedNamespaceMetadata:
      labels:
        nebari.dev/managed: "true"
    syncOptions:
      - CreateNamespace=true
      - ServerSideApply=true
      - SkipDryRunOnMissingResource=true
      - RespectIgnoreDifferences=true
    retry:
      limit: 5
      backoff:
        duration: 5s
        factor: 2
        maxDuration: 3m
  # The KubeRay controller modifies RayService and Service resources at
  # runtime (adding selectors, status fields, etc.), causing a permanent
  # OutOfSync state without these ignore rules.
  ignoreDifferences:
    - group: ""
      kind: Service
      jsonPointers:
        - /spec/selector
        - /spec/clusterIP
        - /spec/clusterIPs
    - group: ray.io
      kind: RayService
      jsonPointers:
        - /spec/rayClusterConfig
        - /status
```
**Important:**
- `managedNamespaceMetadata` with `nebari.dev/managed: "true"` is required for the nebari-operator to manage NebariApp resources
- `redirectURI` must be `/oauth2/callback` (Envoy Gateway rejects `/`)
- Set `nebariapp.serve.enabled: false` to keep the serve endpoint internal-only (recommended, since notebooks reach it via cluster DNS)
## Connecting from Jupyter
From a notebook running in the same cluster (e.g., via the [nebari-data-science-pack](https://github.com/nebari-dev/nebari-data-science-pack)):
```python
import ray
from ray import serve
import requests

# Connect to the Ray cluster
ray.init("ray://rayserve-nebari-rayserve-head-svc.rayserve.svc.cluster.local:10001")

# Deploy a model
@serve.deployment
class Hello:
    async def __call__(self, request):
        return "Hello from Ray Serve!"

serve.run(Hello.bind(), name="hello", route_prefix="/hello")

# Run inference
resp = requests.get("http://rayserve-nebari-rayserve-serve-svc.rayserve.svc.cluster.local:8000/hello")
print(resp.text)
# Hello from Ray Serve!
```
No manual Serve initialization is needed — the RayService CRD starts the Serve proxy with `host: 0.0.0.0` automatically.
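The two in-cluster addresses in the example follow the standard Kubernetes `<service>.<namespace>.svc.cluster.local` DNS form. As a minimal sketch, assuming the chart names its stable Services `<release>-nebari-rayserve-head-svc` and `<release>-nebari-rayserve-serve-svc` (a pattern inferred from the `rayserve` release in the Quick Start, not a documented API), a hypothetical helper can build the URLs for another release or namespace:

```python
def ray_endpoints(release: str = "rayserve", namespace: str = "rayserve") -> dict:
    """Build in-cluster URLs for the Ray client, dashboard, and Serve endpoint.

    Hypothetical helper: the service-name pattern is an assumption inferred
    from the Quick Start commands.
    """
    suffix = f"{namespace}.svc.cluster.local"
    head = f"{release}-nebari-rayserve-head-svc.{suffix}"
    serve = f"{release}-nebari-rayserve-serve-svc.{suffix}"
    return {
        "client": f"ray://{head}:10001",     # Ray Client port
        "dashboard": f"http://{head}:8265",  # Ray Dashboard port
        "serve": f"http://{serve}:8000",     # Serve HTTP port
    }


endpoints = ray_endpoints()
print(endpoints["client"])
```

The `client` value can be passed to `ray.init`, and the `serve` value is the base URL for HTTP requests such as `/hello`.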
**Note:** The Ray and Python versions in your Jupyter environment must match the Ray cluster. This chart deploys Ray 2.43.0 with Python 3.9 by default. If using [Nebi](https://github.com/nebari-dev/nebari-nebi-pack) for environment management, create a workspace with:
```toml
[workspace]
name = "ray-serve"
channels = ["conda-forge"]
platforms = ["linux-64"]
[dependencies]
python = "3.9.*"
ray-serve = "2.43.*"
ipykernel = ">=6.0"
```
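To catch a mismatch before calling `ray.init`, a notebook can compare its own versions with the chart defaults. The sketch below is a hypothetical check and not part of this pack. It assumes the Ray version must match exactly and the Python version must match on major.minor, with the expected values taken from the defaults stated above.

```python
import sys


def version_problems(local_ray: str, local_python: tuple,
                     cluster_ray: str = "2.43.0",
                     cluster_python: tuple = (3, 9)) -> list:
    """Return human-readable mismatches between the notebook and the cluster."""
    problems = []
    if local_ray != cluster_ray:
        problems.append(f"Ray {local_ray} != cluster Ray {cluster_ray}")
    # Compare only (major, minor); the patch level is assumed not to matter.
    if tuple(local_python[:2]) != tuple(cluster_python):
        local = ".".join(str(part) for part in local_python[:2])
        cluster = ".".join(str(part) for part in cluster_python)
        problems.append(f"Python {local} != cluster Python {cluster}")
    return problems


# In a notebook, pass ray.__version__ as the first argument.
print(version_problems("2.43.0", sys.version_info))
```

An empty list means both versions line up with the defaults. If the chart's `image.tag` has been changed, pass the matching `cluster_ray` and `cluster_python` values instead.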
## Deploying Models (Production)
For production, bake your model code into a custom Docker image and declare applications in `values.yaml`:
```yaml
image:
  repository: your-registry/your-ray-image
  tag: "2.43.0-custom"
serveApplications:
  - name: my-model
    route_prefix: /predict
    import_path: myapp.model:app
    deployments:
      - name: MyModel
        num_replicas: 2
The RayService controller handles deployment, health monitoring, and zero-downtime upgrades automatically.
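Ray Serve expects each application in a config to have a unique name and a unique route prefix. The following is a minimal sketch of a pre-deployment sanity check over the `serveApplications` list, assuming the list has already been loaded into Python dicts. The function name is hypothetical and the chart does not run anything like it.

```python
def check_serve_applications(apps: list) -> list:
    """Return a list of problems found in a serveApplications list."""
    problems = []
    seen_names, seen_prefixes = set(), set()
    for index, app in enumerate(apps):
        name = app.get("name")
        label = name or f"application #{index}"
        if not name:
            problems.append(f"{label}: missing name")
        elif name in seen_names:
            problems.append(f"duplicate application name: {name}")
        seen_names.add(name)

        prefix = app.get("route_prefix", "/")  # assumed default when omitted
        if prefix is not None:  # None is treated here as "no HTTP route"
            if not prefix.startswith("/"):
                problems.append(f"{label}: route_prefix must start with '/'")
            elif prefix in seen_prefixes:
                problems.append(f"duplicate route_prefix: {prefix}")
            seen_prefixes.add(prefix)

        if not app.get("import_path"):
            problems.append(f"{label}: missing import_path")
    return problems


apps = [{"name": "my-model", "route_prefix": "/predict",
         "import_path": "myapp.model:app"}]
print(check_serve_applications(apps))
```

Running a check like this in CI before committing `values.yaml` can surface naming collisions earlier than waiting for the RayService to report a failed application.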
## Chart Configuration
Key values in `chart/values.yaml`:
### Nebari Integration
| Value | Default | Description |
|-------|---------|-------------|
| `nebariapp.enabled` | `false` | Create NebariApp resources for routing/TLS/auth |
| `nebariapp.serve.enabled` | `true` | Expose the serve endpoint externally (set `false` to keep internal-only) |
| `nebariapp.hostname` | - | Hostname for the Ray Serve endpoint (required when `nebariapp.serve.enabled` is `true`) |
| `nebariapp.dashboard.enabled` | `true` | Create a separate NebariApp for the Ray Dashboard |
| `nebariapp.dashboard.hostname` | - | Hostname for the Ray Dashboard (required when `nebariapp.dashboard.enabled` is `true`) |
| `nebariapp.auth.enabled` | `false` | Enable OIDC authentication via Keycloak |
| `nebariapp.auth.redirectURI` | `/oauth2/callback` | OAuth callback path (Envoy Gateway rejects `/`) |
| `nebariapp.gateway` | `public` | Gateway to use (`public` or `internal`) |
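
The table implies a few cross-field constraints: each enabled endpoint needs a hostname, and `redirectURI` cannot be `/`. As a rough sketch, assuming the `nebariapp` values have been loaded into a Python dict and that the defaults are those listed in the table, a hypothetical check might look like this:

```python
def check_nebariapp(values: dict) -> list:
    """Return problems in a `nebariapp` values dict (defaults from the table)."""
    if not values.get("enabled", False):
        return []  # No NebariApp resources are created, so nothing to check.
    problems = []
    serve = values.get("serve", {})
    dashboard = values.get("dashboard", {})
    auth = values.get("auth", {})
    if serve.get("enabled", True) and not values.get("hostname"):
        problems.append("nebariapp.hostname is required when serve is enabled")
    if dashboard.get("enabled", True) and not dashboard.get("hostname"):
        problems.append(
            "nebariapp.dashboard.hostname is required when dashboard is enabled"
        )
    redirect = auth.get("redirectURI", "/oauth2/callback")
    if auth.get("enabled", False) and redirect == "/":
        problems.append("nebariapp.auth.redirectURI must not be '/'")
    return problems


# Values mirroring the ArgoCD example: dashboard exposed, serve kept internal.
argocd_values = {
    "enabled": True,
    "serve": {"enabled": False},
    "dashboard": {"enabled": True, "hostname": "ray-dashboard.example.com"},
    "auth": {"enabled": True, "redirectURI": "/oauth2/callback"},
}
print(check_nebariapp(argocd_values))
```
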
### Ray Cluster
| Value | Default | Description |
|-------|---------|-------------|
| `image.repository` | `rayproject/ray` | Ray container image |
| `image.tag` | `2.43.0` | Ray version |
| `head.resources.requests.cpu` | `1` | Head node CPU request |
| `head.resources.requests.memory` | `2Gi` | Head node memory request |
| `head.runtimeClassName` | - | Runtime class for head pod (e.g., `nvidia` for GPU) |
| `worker.replicas` | `1` | Number of worker nodes |
| `worker.minReplicas` | `1` | Min workers (for autoscaling) |
| `worker.maxReplicas` | `1` | Max workers (for autoscaling) |
| `worker.resources.requests.cpu` | `1` | Worker CPU request |
| `worker.resources.requests.memory` | `2Gi` | Worker memory request |
| `worker.runtimeClassName` | - | Runtime class for worker pods (e.g., `nvidia` for GPU) |
### Serve Applications
| Value | Default | Description |
|-------|---------|-------------|
| `serveApplications` | `[]` | Declarative Serve applications (see [Ray Serve config](https://docs.ray.io/en/latest/serve/production-guide/config.html)) |
## Architecture
```mermaid
flowchart TD
    subgraph KO["KubeRay Operator"]
        op["Manages RayService lifecycle"]
    end
    subgraph RS["RayService CRD"]
        subgraph RC["RayCluster"]
            head["Head Pod\n:8265 dashboard\n:8000 serve\n:10001 client"]
            workers["Worker Pod(s)\nRay Workers"]
        end
    end
    subgraph SVC["Kubernetes Services"]
        headsvc["-head-svc\n:8265 :10001 :6379"]
        servesvc["-serve-svc\n:8000"]
    end
    subgraph NB["NebariApp (optional)"]
        route["HTTPRoute + OIDC auth\nvia Envoy Gateway"]
    end
    jupyter["Jupyter Notebook\n(in-cluster)"]
    browser["Browser\n(external)"]
    KO --> RS
    head --- workers
    head --> headsvc
    head --> servesvc
    servesvc --> route
    jupyter -->|"ray:// :10001"| headsvc
    jupyter -->|"HTTP :8000"| servesvc
    browser -->|"HTTPS"| route
    style KO fill:#fef0db,stroke:#e8952c,color:#7c4a03
    style RS fill:#eeeef3,stroke:#4a4a6a,color:#1a1a2e
    style RC fill:#e8faf8,stroke:#20aaa1,color:#0d5d57
    style SVC fill:#d4f5f2,stroke:#20aaa1,color:#0d5d57
    style NB fill:#f3e8fc,stroke:#c840e9,color:#6b21a8
```
## Repository Structure
```
nebari-rayserve-pack/
  chart/
    Chart.yaml           # Depends on kuberay-operator
    values.yaml          # RayService + NebariApp configuration
    templates/
      _helpers.tpl       # Name, label, and service name helpers
      rayservice.yaml    # RayService CRD
      services.yaml      # Stable head and serve K8s Services
      nebariapp.yaml     # NebariApp CRDs (conditional)
      NOTES.txt          # Post-install usage instructions
  dev/
    Makefile             # Local dev with full Nebari stack on kind
## Troubleshooting
### Ray Dashboard returns 500 via NebariApp
The NebariApp may be pointing at a service that doesn't exist. Check the actual service name:
```bash
kubectl get svc -n rayserve
```
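The stable services are named after the Helm release, with the suffixes `-nebari-rayserve-head-svc` and `-nebari-rayserve-serve-svc`. For a release named `rayserve`, that gives `rayserve-nebari-rayserve-head-svc` and `rayserve-nebari-rayserve-serve-svc`.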
### Version mismatch connecting from Jupyter
The Ray and Python versions in your notebook environment must match the cluster:
```bash
kubectl exec -n rayserve $(kubectl get pod -n rayserve -l ray.io/node-type=head -o name) -- ray --version
kubectl exec -n rayserve $(kubectl get pod -n rayserve -l ray.io/node-type=head -o name) -- python --version
```
### NebariApp not reaching Ready
Check that the namespace has the managed label:
```bash
kubectl get namespace rayserve --show-labels | grep nebari.dev/managed
```
If missing, add it (or use `managedNamespaceMetadata` in the ArgoCD app):
```bash
kubectl label namespace rayserve nebari.dev/managed=true
```
### JupyterHub notebooks can't reach Ray
The default JupyterHub singleuser network policy blocks egress to private IPs. Add this to your data science pack values:
```yaml
jupyterhub:
  singleuser:
    networkPolicy:
      egressAllowRules:
        privateIPs: true
```
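To confirm whether egress is the problem, a notebook can attempt a plain TCP connection to the Ray head service before and after the change. This is a minimal sketch using only the Python standard library. The commented host assumes the `rayserve` release and namespace from the Quick Start.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


# From a notebook inside the cluster (uncomment to run):
# can_connect("rayserve-nebari-rayserve-head-svc.rayserve.svc.cluster.local", 10001)
```

A `False` result that turns `True` after enabling `privateIPs` points to the network policy. A result that stays `False` suggests a different cause, such as a wrong service name or namespace.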