https://github.com/chrispsheehan/aws-serverless-github-deploy
Terraform + GitHub Actions for AWS serverless: Lambda + API Gateway with CodeDeploy blue/green (all-at-once/canary/linear) and provisioned concurrency (none/fixed/autoscaled), shipped via OIDC + Just recipes.
- Host: GitHub
- URL: https://github.com/chrispsheehan/aws-serverless-github-deploy
- Owner: chrispsheehan
- License: mit
- Created: 2025-09-03T09:12:36.000Z (8 months ago)
- Default Branch: main
- Last Pushed: 2026-04-10T12:08:54.000Z (about 1 month ago)
- Last Synced: 2026-04-10T12:17:08.445Z (about 1 month ago)
- Topics: apigateway, aws-serverless, cicd, codedeploy, lamda, terraform, terragrunt
- Language: HCL
- Homepage:
- Size: 449 KB
- Stars: 1
- Watchers: 0
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# aws-serverless-github-deploy
**Terraform + GitHub Actions for AWS serverless deployments.**
Lambda + ECS with CodeDeploy rollouts, plus provisioned concurrency controls for Lambda, driven by clean module variables and `just` recipes.
---
## setup roles for ci
```sh
just tg ci aws/oidc apply
just tg dev aws/oidc apply
just tg prod aws/oidc apply
```
The `ci` OIDC role is intentionally narrower than the `dev` and `prod` roles. In this repo it is limited to build-artifact management, including the shared code bucket, IAM interactions needed by the existing CI flow, and publishing container images to ECR. It is not the repo's broad deployment role.
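As a rough illustration of what a GitHub OIDC role looks like in Terraform, the sketch below builds a trust policy that only this repository's workflows can assume. It is not taken from the repo's `aws/oidc` module: the role name is hypothetical, and it assumes the GitHub OIDC provider already exists in the account. The narrower scope of the `ci` role would come from the permissions policies attached to it, which are omitted here.

```hcl
# Hypothetical sketch: assumes the GitHub OIDC provider is already registered.
data "aws_iam_openid_connect_provider" "github" {
  url = "https://token.actions.githubusercontent.com"
}

data "aws_iam_policy_document" "ci_assume" {
  statement {
    actions = ["sts:AssumeRoleWithWebIdentity"]

    principals {
      type        = "Federated"
      identifiers = [data.aws_iam_openid_connect_provider.github.arn]
    }

    # Only tokens issued for AWS STS are accepted.
    condition {
      test     = "StringEquals"
      variable = "token.actions.githubusercontent.com:aud"
      values   = ["sts.amazonaws.com"]
    }

    # Only workflows from this repository may assume the role.
    condition {
      test     = "StringLike"
      variable = "token.actions.githubusercontent.com:sub"
      values   = ["repo:chrispsheehan/aws-serverless-github-deploy:*"]
    }
  }
}

resource "aws_iam_role" "ci" {
  name               = "example-ci-oidc-role" # hypothetical name
  assume_role_policy = data.aws_iam_policy_document.ci_assume.json
}
```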
## prerequisite network
The AWS account must already have the landing-zone or StackSet network in place before deploying this repo.
- the Terraform in this repo reads the VPC and subnets with `data` sources rather than creating them
- the expected VPC and subnets must therefore already exist
- the private subnets must be tagged so the module lookups can find them, for example with names matching `*private*`
If those shared network resources do not exist yet, the infra applies in this repo will fail during data lookup.
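A minimal sketch of this kind of lookup, assuming the VPC is found by its `Name` tag (the `vpc_name` variable is hypothetical and the repo's actual filters may differ):

```hcl
# Hypothetical input: the Name tag of the pre-existing landing-zone VPC.
variable "vpc_name" {
  type = string
}

# Errors at plan time if no VPC matches.
data "aws_vpc" "this" {
  filter {
    name   = "tag:Name"
    values = [var.vpc_name]
  }
}

# Finds subnets in that VPC whose Name tag contains "private".
data "aws_subnets" "private" {
  filter {
    name   = "vpc-id"
    values = [data.aws_vpc.this.id]
  }

  filter {
    name   = "tag:Name"
    values = ["*private*"]
  }
}
```

Downstream resources would then consume `data.aws_subnets.private.ids`.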
The repo `network` module also owns the shared internal ALB and shared HTTP API Gateway surface used by ECS services:
- HTTP API
- default API stage
- VPC link
- internal ALB and target groups
The `api` module is Lambda-specific and plugs the Lambda integration and root routes into that shared API.
Terragrunt also provides a shared default ECR repository name to ECS task modules:
- shared artifact base: `dev -> ---dev`, otherwise `---ci`
- default ECR repository: `-ecs-worker`
- override it in `infra/live//environment_vars.hcl` only if the repository naming diverges from that convention
- the concrete ECS worker task wrapper defaults `local_tunnel = false` and `xray_enabled = false` unless you explicitly set them
The reusable deploy workflows follow the same split: `prod` deploy wrappers read shared artifact resources from `ci`, but still apply `prod` infrastructure stacks using the repo's directory-derived service and lambda matrices.
For code-only release deploys, pass explicit release versions for each runtime you want to roll out. In particular, ECS code-only deploys should provide an `ecs_version` rather than relying on a Lambda-version fallback.
## plan some infra locally
Given a Terragrunt file at `infra/live/dev/aws/api/terragrunt.hcl`:
```sh
just tg dev aws/api plan
```
## types of lambda provisioned concurrency
```hcl
module "lambda_example" {
  source = "../lambda"
  ...
  provisioned_config = var.your_provisioned_config
}
```
#### [default] No provisioned lambdas
- use case: background processes
- an initial lag while the lambda warms up/boots is acceptable
```hcl
provisioned_config = {
  fixed                = 0
  reserved_concurrency = 2 # hard cap of 2 concurrent executions; this capacity is also reserved for the function
}
```
#### X number of provisioned lambdas
- use case: high predictable usage
- we never want lag due to warm up and can predict traffic
```hcl
provisioned_config = {
  fixed                = 10
  reserved_concurrency = 50
}
```
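For orientation, a fixed setting like this plausibly maps onto the AWS provider's `aws_lambda_provisioned_concurrency_config` resource. The sketch below is an assumption about the shape, not the module's actual code; both variables are hypothetical.

```hcl
# Hypothetical inputs identifying the function and the alias that serves traffic.
variable "function_name" {
  type = string
}

variable "alias_name" {
  type = string
}

# Keeps 10 execution environments initialised on the alias.
resource "aws_lambda_provisioned_concurrency_config" "fixed" {
  function_name                     = var.function_name
  qualifier                         = var.alias_name
  provisioned_concurrent_executions = 10
}

# reserved_concurrency would correspond to reserved_concurrent_executions
# on the aws_lambda_function resource (not shown here).
```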
#### Scale provisioning when usage exceeds % tolerance
- use case: react to traffic, e.g. an api backend
- limit the cost with `auto_scale.max`
- ensure minimal concurrency (no cold starts) with `auto_scale.min`
- set `trigger_percent` to the share of provisioned concurrency in use that should trigger scaling. The example below triggers when 70% is in use and adds more to meet demand.
- set `cool_down_seconds` to a reasonable delay before you would like the system to react.
```hcl
provisioned_config = {
  auto_scale = {
    max               = 3
    min               = 1
    trigger_percent   = 70
    cool_down_seconds = 60
  }
}
```
- before scaling the lambda alias will match the minimum value

- when the trigger percent is exceeded the lambda moves into `In progress (1/2)` state as an additional provisioned lambda is added.

- after scaling the lambda alias will show an additional provisioned lambda

## types of lambda deploy
```hcl
module "lambda_example" {
  source = "../_shared/lambda"
  ...
  deployment_config = var.your_deployment_config
}
```
#### [default] All at once (fastest):
- use case: background processes
```hcl
deployment_config = {
  strategy = "all_at_once"
}
```
#### canary deployment:
- use case: api or service serving traffic
- shifts 10% of traffic to the new version and rolls back if errors are detected; otherwise shifts the remaining traffic to 100%
- waits 1 minute before making a decision on health
```hcl
deployment_config = {
  strategy         = "canary"
  percentage       = 10
  interval_minutes = 1
}
```
#### linear deployment:
- use case: api or service serving traffic
- shifts traffic to the new version in 10% increments and rolls back if errors are detected
- rolls out each increment at 1 minute intervals
```hcl
deployment_config = {
  strategy         = "linear"
  percentage       = 10
  interval_minutes = 1
}
```
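For reference, the canary and linear settings above correspond closely to the AWS provider's `aws_codedeploy_deployment_config` resource. The sketch below shows that mapping under the assumption that the module creates custom configs; the config names are hypothetical. An all-at-once rollout could instead use the built-in `CodeDeployDefault.LambdaAllAtOnce`.

```hcl
# Canary: 10% of traffic first, the remainder after 1 minute.
resource "aws_codedeploy_deployment_config" "canary" {
  deployment_config_name = "example-lambda-canary-10-percent-1-minute" # hypothetical name
  compute_platform       = "Lambda"

  traffic_routing_config {
    type = "TimeBasedCanary"

    time_based_canary {
      percentage = 10
      interval   = 1 # minutes
    }
  }
}

# Linear: a further 10% of traffic every minute until 100%.
resource "aws_codedeploy_deployment_config" "linear" {
  deployment_config_name = "example-lambda-linear-10-percent-1-minute" # hypothetical name
  compute_platform       = "Lambda"

  traffic_routing_config {
    type = "TimeBasedLinear"

    time_based_linear {
      percentage = 10
      interval   = 1 # minutes
    }
  }
}
```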
## types of ecs deploy
```hcl
module "service_example" {
  source = "../_shared/service"
  ...
  deployment_strategy = var.your_deployment_strategy
}
```
#### [default] All at once:
- use case: internal services, queue workers, low-risk changes
- for load-balanced ECS services this uses CodeDeploy and shifts traffic in one step
```hcl
deployment_strategy = "all_at_once"
```
#### canary deployment:
- use case: HTTP services behind the load balancer
- shifts 10% of traffic for 5 minutes before moving to 100%
```hcl
deployment_strategy = "canary"
```
#### linear deployment:
- use case: steady rollout with smaller blast radius
- shifts traffic 10% every minute until complete
```hcl
deployment_strategy = "linear"
```
#### blue/green deployment:
- use case: explicit blue/green semantics while still using the default ECS all-at-once traffic switch
- currently maps to the ECS CodeDeploy all-at-once config
```hcl
deployment_strategy = "blue_green"
```
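The four strategy names line up with CodeDeploy's predefined ECS deployment configurations. A sketch of how a module might translate them, assuming a simple lookup map (the local names and the validation block are hypothetical, not taken from `_shared/service`):

```hcl
variable "deployment_strategy" {
  type    = string
  default = "all_at_once"

  validation {
    condition     = contains(["all_at_once", "canary", "linear", "blue_green"], var.deployment_strategy)
    error_message = "deployment_strategy must be all_at_once, canary, linear, or blue_green."
  }
}

locals {
  # Predefined CodeDeploy configs for the ECS compute platform.
  ecs_deployment_config_names = {
    all_at_once = "CodeDeployDefault.ECSAllAtOnce"
    canary      = "CodeDeployDefault.ECSCanary10Percent5Minutes"
    linear      = "CodeDeployDefault.ECSLinear10PercentEvery1Minutes"
    blue_green  = "CodeDeployDefault.ECSAllAtOnce" # same traffic switch as all_at_once
  }

  ecs_deployment_config_name = local.ecs_deployment_config_names[var.deployment_strategy]
}
```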
- ECS CodeDeploy is only created for load-balanced ECS services in `_shared/service`
- internal ECS services without load balancer integration should use native ECS rolling updates instead
- the shared ECS service resource ignores `task_definition` drift so later infra applies do not revert the live task revision after either a rolling deploy or a CodeDeploy rollout
- the deployment workflow:
  - applies the new `task_*` revision
  - if the service has CodeDeploy resources:
    - reads `codedeploy_app_name` and `codedeploy_deployment_group_name` from `service_*`
    - renders [`appspec-ecs.yml`](appspec-ecs.yml)
    - uploads the AppSpec to the code bucket
    - runs `just ecs-deploy`
  - otherwise updates the ECS service to the new task definition with a native rolling deploy
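Ignoring `task_definition` drift is typically done with a `lifecycle` block on the service. The sketch below illustrates the idea only; the variables are hypothetical, and networking, load balancer, and deployment controller settings are omitted.

```hcl
# Hypothetical inputs for illustration.
variable "service_name" {
  type = string
}

variable "cluster_arn" {
  type = string
}

variable "bootstrap_task_definition_arn" {
  type = string
}

resource "aws_ecs_service" "this" {
  name            = var.service_name
  cluster         = var.cluster_arn
  task_definition = var.bootstrap_task_definition_arn
  desired_count   = 1

  lifecycle {
    # Deploys change the task revision outside Terraform, so later
    # applies must not reset it to the value recorded in state.
    ignore_changes = [task_definition]
  }
}
```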
## deployment roll-back
- use CloudWatch metrics and alarms to automatically roll back a deployment
- create a [cloudwatch_metric_alarm](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/cloudwatch_metric_alarm) resource and pass its name in as shown below
```hcl
module "lambda_example" {
  source = "../_shared/lambda"
  ...
  codedeploy_alarm_names = [
    local.api_5xx_alarm_name
  ]
}
```
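A sketch of an alarm that could back `local.api_5xx_alarm_name`, assuming an HTTP API whose `5xx` metric is published under `AWS/ApiGateway` with an `ApiId` dimension. The alarm name, threshold, and `api_id` variable are hypothetical and should be tuned to real traffic.

```hcl
# Hypothetical input: the ID of the HTTP API being monitored.
variable "api_id" {
  type = string
}

locals {
  api_5xx_alarm_name = "example-api-5xx" # hypothetical name
}

resource "aws_cloudwatch_metric_alarm" "api_5xx" {
  alarm_name          = local.api_5xx_alarm_name
  namespace           = "AWS/ApiGateway"
  metric_name         = "5xx"
  dimensions          = { ApiId = var.api_id }
  statistic           = "Sum"
  period              = 60
  evaluation_periods  = 1
  threshold           = 1
  comparison_operator = "GreaterThanOrEqualToThreshold"

  # No requests means no datapoints; treat that as healthy.
  treat_missing_data = "notBreaching"
}
```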
- the ECS shared service module accepts the same `codedeploy_alarm_names` input
- if the alarm triggers during a deployment you will see the below in the CI
```
Running: lambda-deploy
Deployment started: d-40UUQH3DF
Attempt 1: Deployment status is InProgress
Attempt 2: Deployment status is InProgress
Attempt 3: Deployment status is InProgress
Attempt 4: Deployment status is InProgress
Attempt 5: Deployment status is Stopped
Deployment d-40UUQH3DF failed or was stopped.
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
| GetDeployment |
+--------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| ErrorCode | ALARM_ACTIVE |
| ErrorMessage| One or more alarms have been activated according to the Amazon CloudWatch metrics you selected, and the affected deployments have been stopped. Activated alarms: |
| Status | Stopped |
+--------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
error: Recipe `lambda-deploy` failed with exit code 1
Error: Process completed with exit code 1.
```
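The `ALARM_ACTIVE` stop shown above is the behaviour CodeDeploy provides when alarms are attached to a deployment group. A hedged sketch of that wiring for a Lambda deployment group follows; the variables and group name are hypothetical and the shared module may configure it differently.

```hcl
# Hypothetical inputs for illustration.
variable "codedeploy_app_name" {
  type = string
}

variable "codedeploy_role_arn" {
  type = string
}

variable "codedeploy_alarm_names" {
  type    = list(string)
  default = []
}

resource "aws_codedeploy_deployment_group" "lambda" {
  app_name               = var.codedeploy_app_name
  deployment_group_name  = "example-lambda-dg" # hypothetical name
  service_role_arn       = var.codedeploy_role_arn
  deployment_config_name = "CodeDeployDefault.LambdaAllAtOnce"

  deployment_style {
    deployment_option = "WITH_TRAFFIC_CONTROL"
    deployment_type   = "BLUE_GREEN"
  }

  # Stop the deployment when any listed alarm enters the ALARM state.
  alarm_configuration {
    enabled = length(var.codedeploy_alarm_names) > 0
    alarms  = var.codedeploy_alarm_names
  }

  # Shift traffic back to the previous version on failure or alarm.
  auto_rollback_configuration {
    enabled = true
    events  = ["DEPLOYMENT_FAILURE", "DEPLOYMENT_STOP_ON_ALARM"]
  }
}
```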
## deployment strategies
- Infrastructure deployments and feature code deployments (via CodeDeploy) are completely decoupled.
- The initial infrastructure deployment deploys `infra/modules/aws/_shared/lambda/bootstrap/index.py`, which serves as a placeholder.
- Initial ECS infrastructure deployments can use a bootstrap task; the deploy workflow later registers a real `task_*` revision and promotes it via CodeDeploy.
- The CodeDeploy app and deployment group are also deployed; they are the mechanism used to deploy the real builds.
- Subsequent re-runs of the infrastructure deployments will not update the code.
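One common way to get this decoupling in Terraform is to package the placeholder once and tell Terraform to ignore later code changes. The sketch below is an assumption about how it could be done, not the module's actual code: the variables, runtime, and handler name are hypothetical.

```hcl
# Hypothetical inputs for illustration.
variable "function_name" {
  type = string
}

variable "execution_role_arn" {
  type = string
}

# Zips the placeholder handler that ships with the module.
data "archive_file" "bootstrap" {
  type        = "zip"
  source_file = "${path.module}/bootstrap/index.py"
  output_path = "${path.module}/bootstrap.zip"
}

resource "aws_lambda_function" "this" {
  function_name = var.function_name
  role          = var.execution_role_arn
  runtime       = "python3.12"    # assumed runtime
  handler       = "index.handler" # assumed handler name
  filename      = data.archive_file.bootstrap.output_path
  publish       = true

  lifecycle {
    # Real builds are published by the deploy workflow, so infra
    # re-runs must not push the placeholder over them.
    ignore_changes = [filename, source_code_hash]
  }
}
```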