https://github.com/temporalio/temporal-auto-scaled-workers
- Host: GitHub
- URL: https://github.com/temporalio/temporal-auto-scaled-workers
- Owner: temporalio
- License: mit
- Created: 2026-02-02T19:42:41.000Z (5 months ago)
- Default Branch: main
- Last Pushed: 2026-04-30T21:05:46.000Z (2 months ago)
- Last Synced: 2026-04-30T21:22:26.239Z (2 months ago)
- Language: Go
- Size: 277 KB
- Stars: 2
- Watchers: 0
- Forks: 1
- Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE
- Codeowners: CODEOWNERS
# Temporal Auto-Scaled Workers
Automatically scale Temporal workers in response to workload. This project implements a **Worker Controller Instance (WCI)** — a long-running Temporal workflow that monitors task queue metrics and scales workers across cloud compute providers.
## Overview
Each WCI manages a single deployment version (deployment name + build ID). It:
1. Receives task-add signals from the Temporal Matching Service
2. Periodically polls task queue backlog and dispatch metrics
3. Applies a configurable scaling algorithm to decide when to act
4. Invokes workers on the configured compute provider
Multiple **scaling groups** can be defined per WCI, each mapping a set of task queue types (workflow, activity, nexus) to a compute provider and scaling algorithm. One group can act as a catch-all for task types not claimed by other groups.
## Supported Compute Providers
| Provider | Type string | Launch strategy |
|---|---|---|
| AWS Lambda | `aws-lambda` | Invoke (one-off) |
| AWS ECS | `aws-ecs` | Worker set (managed scaling) |
| GCP Cloud Run | `gcp-cloud-run` | Worker set |
| Kubernetes | `k8s` | Worker set |
| Subprocess | `subprocess` | Invoke (dev/test only) |
**Invoke** providers are called once per scaling event to start a short-lived worker.
**Worker-set** providers manage a persistent pool whose size is adjusted up or down.
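The table and the two definitions above can be sketched as a lookup from provider type string to launch strategy. This is only an illustration: the `Strategy` type, the `launchStrategies` map, and `DescribeAction` are hypothetical names that do not come from this repository; only the type strings and their strategies are taken from the table.

```go
package main

import "fmt"

// Strategy names the two launch strategies from the provider table.
// (Hypothetical type, for illustration only.)
type Strategy string

const (
	Invoke    Strategy = "invoke"
	WorkerSet Strategy = "worker-set"
)

// launchStrategies mirrors the "Type string" and "Launch strategy" columns.
var launchStrategies = map[string]Strategy{
	"aws-lambda":    Invoke,
	"aws-ecs":       WorkerSet,
	"gcp-cloud-run": WorkerSet,
	"k8s":           WorkerSet,
	"subprocess":    Invoke,
}

// DescribeAction reports what a scaling event means for a provider type.
func DescribeAction(providerType string) (string, error) {
	s, ok := launchStrategies[providerType]
	if !ok {
		return "", fmt.Errorf("unknown provider type %q", providerType)
	}
	switch s {
	case Invoke:
		return "start one short-lived worker", nil
	default:
		return "adjust the size of the persistent pool", nil
	}
}

func main() {
	for _, p := range []string{"aws-lambda", "k8s"} {
		action, err := DescribeAction(p)
		fmt.Println(p, action, err)
	}
}
```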
## Supported Scaling Algorithms
| Algorithm | Type string | Description |
|---|---|---|
| No-sync | `no-sync` | Scales up when backlog or arrival rate exceeds thresholds; cools down between invocations |
## Configuration
### Dynamic config
| Setting | Default | Description |
|---|---|---|
| `WorkerControllerEnabled` | `false` | Enable WCI per namespace |
| `WorkerControllerMaxInstances` | `100` | Max WCIs per namespace |
| `WorkerControllerEnabledComputeProviders` | all | Allowed compute provider types |
| `WorkerControllerEnabledScalingAlgorithms` | all | Allowed scaling algorithm types |
| `WorkerControllerAWSIntermediaryRoles` | `[]` | IAM role chain for AWS STS |
| `WorkerControllerGCPIntermediaryServiceAccounts` | `[]` | Service account chain for GCP |
| `WorkerControllerAWSRequireRoleAndExternalID` | `true` | Enforce role + external ID on AWS configs |
## Spec Format
A WCI spec is a map of named scaling groups:
```json
{
  "scaling_group_specs": {
    "workflows": {
      "task_types": ["WORKFLOW"],
      "compute": {
        "provider_type": "aws-lambda",
        "config": {
          "arn": "arn:aws:lambda:us-east-1:123456789012:function:my-worker",
          "role": "arn:aws:iam::123456789012:role/temporal-wci",
          "role_external_id": "my-external-id"
        }
      },
      "scaling": {
        "scaling_algorithm": "no-sync",
        "config": {
          "scale_up_backlog_threshold": "5",
          "scale_up_cooloff_ms": "500",
          "max_worker_lifetime_ms": "300000"
        }
      }
    },
    "activities": {
      "task_types": ["ACTIVITY", "NEXUS"],
      "compute": {
        "provider_type": "aws-ecs",
        "config": {
          "cluster": "my-cluster",
          "service": "my-worker-service",
          "region": "us-east-1",
          "role": "arn:aws:iam::123456789012:role/temporal-wci"
        }
      }
    }
  }
}
```
A group with no `task_types` acts as a catch-all for any task type not claimed by another group. At most one catch-all group is allowed. The `scaling` block is optional; omitting it leaves the group with the default scaling configuration for the given compute provider.
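The catch-all rule can be sketched as a small resolver that maps each task type to the group that handles it. This is a minimal sketch, not the project's validation code: `ScalingGroup` and `ResolveGroups` are hypothetical names, and rejecting a task type claimed by two groups is an assumption that the text above does not state.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// ScalingGroup is a hypothetical, simplified stand-in for one entry of
// "scaling_group_specs"; only the task_types field matters here.
type ScalingGroup struct {
	TaskTypes []string
}

// ResolveGroups maps each task type to the name of the group that handles it.
// A group with no task types is treated as the catch-all.
func ResolveGroups(groups map[string]ScalingGroup) (map[string]string, error) {
	allTypes := []string{"WORKFLOW", "ACTIVITY", "NEXUS"}
	owner := make(map[string]string)
	catchAll := ""

	// Visit groups in sorted order so error messages are deterministic.
	names := make([]string, 0, len(groups))
	for name := range groups {
		names = append(names, name)
	}
	sort.Strings(names)

	for _, name := range names {
		g := groups[name]
		if len(g.TaskTypes) == 0 {
			if catchAll != "" {
				return nil, errors.New("at most one catch-all group is allowed")
			}
			catchAll = name
			continue
		}
		for _, t := range g.TaskTypes {
			// Assumption: a task type may be claimed by only one group.
			if prev, ok := owner[t]; ok {
				return nil, fmt.Errorf("task type %s claimed by both %s and %s", t, prev, name)
			}
			owner[t] = name
		}
	}

	// Unclaimed task types fall through to the catch-all, if there is one.
	if catchAll != "" {
		for _, t := range allTypes {
			if _, ok := owner[t]; !ok {
				owner[t] = catchAll
			}
		}
	}
	return owner, nil
}

func main() {
	owner, err := ResolveGroups(map[string]ScalingGroup{
		"workflows": {TaskTypes: []string{"WORKFLOW"}},
		"rest":      {}, // no task_types: catch-all
	})
	fmt.Println(owner, err)
}
```

With no catch-all group, task types that no group claims are simply left out of the returned map.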
### `no-sync` algorithm config
| Key | Default | Description |
|---|---|---|
| `scale_up_backlog_threshold` | `0` | Scale up when backlog exceeds this value |
| `scale_up_cooloff_ms` | `100` | Minimum milliseconds between scale-up actions |
| `max_worker_lifetime_ms` | `600000` | Re-invoke workers at least this often (10 min) |
| `scale_up_dispatch_rate_epsilon` | `0` | Suppress scale-up if dispatch rate is stable within this margin |
| `metrics_poll_interval_ms` | `60000` | How often to poll task queue metrics |
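To make the interplay of these settings concrete, here is a rough sketch of how a scale-up decision might combine three of them. It is one reading of the table, not the project's algorithm: `NoSyncConfig`, `ParseNoSyncConfig`, and `ShouldScaleUp` are hypothetical names, the dispatch-rate and polling settings are left out, and treating `max_worker_lifetime_ms` as a forced re-invocation once that much time has passed is an assumption. Values are parsed from strings because the spec example passes them as strings.

```go
package main

import (
	"fmt"
	"strconv"
	"time"
)

// NoSyncConfig holds a subset of the documented keys (hypothetical type).
type NoSyncConfig struct {
	BacklogThreshold  int64
	Cooloff           time.Duration
	MaxWorkerLifetime time.Duration
}

// ParseNoSyncConfig reads string-valued settings and falls back to the
// documented defaults for keys that are missing.
func ParseNoSyncConfig(raw map[string]string) (NoSyncConfig, error) {
	cfg := NoSyncConfig{
		BacklogThreshold:  0,
		Cooloff:           100 * time.Millisecond,
		MaxWorkerLifetime: 600000 * time.Millisecond,
	}
	// readInt overwrites *dst only when the key is present.
	readInt := func(key string, dst *int64) error {
		v, ok := raw[key]
		if !ok {
			return nil
		}
		n, err := strconv.ParseInt(v, 10, 64)
		if err != nil {
			return fmt.Errorf("%s: %w", key, err)
		}
		*dst = n
		return nil
	}
	cooloffMs := cfg.Cooloff.Milliseconds()
	lifetimeMs := cfg.MaxWorkerLifetime.Milliseconds()
	if err := readInt("scale_up_backlog_threshold", &cfg.BacklogThreshold); err != nil {
		return cfg, err
	}
	if err := readInt("scale_up_cooloff_ms", &cooloffMs); err != nil {
		return cfg, err
	}
	if err := readInt("max_worker_lifetime_ms", &lifetimeMs); err != nil {
		return cfg, err
	}
	cfg.Cooloff = time.Duration(cooloffMs) * time.Millisecond
	cfg.MaxWorkerLifetime = time.Duration(lifetimeMs) * time.Millisecond
	return cfg, nil
}

// ShouldScaleUp is an assumed decision rule: never act inside the cool-off
// window; otherwise act when the backlog exceeds the threshold, or when the
// last invocation is at least one worker lifetime old.
func ShouldScaleUp(cfg NoSyncConfig, backlog int64, sinceLast time.Duration) bool {
	if sinceLast < cfg.Cooloff {
		return false
	}
	if backlog > cfg.BacklogThreshold {
		return true
	}
	return sinceLast >= cfg.MaxWorkerLifetime
}

func main() {
	cfg, err := ParseNoSyncConfig(map[string]string{
		"scale_up_backlog_threshold": "5",
		"scale_up_cooloff_ms":        "500",
	})
	if err != nil {
		fmt.Println("bad config:", err)
		return
	}
	fmt.Println(ShouldScaleUp(cfg, 10, time.Second))
}
```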
## Client API
```go
import (
	"go.temporal.io/auto-scaled-workers/wci/client"
	"go.uber.org/fx"
)

// Register the Fx module in your server
fx.Provide(client.ClientProvider)

// Use the Client interface
type MyComponent struct {
	wciClient client.Client
}

// Create or update a WCI
err := wciClient.UpdateWorkerControllerInstance(ctx, ns, deploymentVersion, &client.UpdateWorkerControllerInstanceRequest{
	Spec: &client.Spec{
		ScalingGroupSpecs: map[string]client.ScalingGroupSpec{
			"default": {
				Compute: client.ComputeProviderSpec{
					ProviderType: "aws-lambda",
					Config:       map[string]string{"arn": "..."},
				},
			},
		},
	},
})

// List all WCIs
resp, err := wciClient.ListWorkerControllerInstances(ctx, ns, pageSize, nextPageToken)

// Delete a WCI (err is already declared above, so assign with =)
err = wciClient.DeleteWorkerControllerInstance(ctx, ns, deploymentVersion, conflictToken)
```
All mutating operations accept a `ConflictToken` for optimistic concurrency control. Obtain it from `DescribeWorkerControllerInstance` and pass it with updates to detect concurrent modifications.
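The optimistic concurrency pattern described above can be illustrated with a small in-memory model. This is a generic sketch of the technique, not the WCI client: `Store`, `Describe`, `Update`, and `ErrConflict` are hypothetical, and the integer token is a stand-in for whatever form the real `ConflictToken` takes.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// ErrConflict is returned when the caller's token is stale (hypothetical).
var ErrConflict = errors.New("conflict token mismatch")

// Store is a hypothetical in-memory stand-in for one WCI's stored spec.
type Store struct {
	mu    sync.Mutex
	spec  string
	token int64
}

// Describe returns the current spec together with its conflict token.
func (s *Store) Describe() (string, int64) {
	s.mu.Lock()
	defer s.mu.Unlock()
	return s.spec, s.token
}

// Update applies the new spec only if the token still matches, then
// advances the token so that concurrent writers holding the old one fail.
func (s *Store) Update(spec string, token int64) error {
	s.mu.Lock()
	defer s.mu.Unlock()
	if token != s.token {
		return ErrConflict
	}
	s.spec = spec
	s.token++
	return nil
}

func main() {
	s := &Store{}
	_, token := s.Describe()
	fmt.Println(s.Update("spec-a", token)) // first writer holds a fresh token
	fmt.Println(s.Update("spec-b", token)) // second writer reuses the stale token
}
```

A caller that receives a conflict would typically describe the instance again, reapply its change to the fresh state, and retry with the new token.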
## Integration with Temporal Server
The `TaskHookFactory` returned by `ClientProvider` must be registered with the Temporal Matching Service. It intercepts task-add events and signals the appropriate WCI workflow.
Enable per namespace via the `WorkerControllerEnabled` dynamic config setting.
## Building
```bash
# Build
make bins
# Run tests
make test
```
## Run Together with a Local Temporal Server
1. Check out the [Temporal Server](https://github.com/temporalio/temporal) alongside this repository.
2. Link the two repositories using either a [Go Workspace](https://go.dev/doc/tutorial/workspaces) or a [`replace` directive](https://go.dev/ref/mod#go-mod-file-replace) in the server's `go.mod`.
3. Compile the Temporal Server: `make bins` (or `make all`).
4. Start the server: `make start` — this uses the SQLite in-memory backend and runs `temporal-auto-scaled-workers` as part of the system workers.
## Scaling Algorithm Simulators
Interactive simulators for the scaling algorithms are available in [`docs/simulators/`](docs/simulators/).
## License
[MIT](LICENSE)