{"id":48557417,"url":"https://github.com/rusets/docker-ecs-deployment","last_synced_at":"2026-04-08T11:33:15.102Z","repository":{"id":316136590,"uuid":"936731513","full_name":"rusets/docker-ecs-deployment","owner":"rusets","description":"A fully automated, scale-to-zero AWS ECS Fargate platform — wake-on-demand via API Gateway + Lambda, auto-sleep via EventBridge, Terraform IaC, and GitHub Actions OIDC CI/CD. Zero idle cost. Clean, modern, conference-ready architecture.","archived":false,"fork":false,"pushed_at":"2026-02-19T02:24:00.000Z","size":3517,"stargazers_count":2,"open_issues_count":0,"forks_count":1,"subscribers_count":1,"default_branch":"main","last_synced_at":"2026-02-19T07:43:23.734Z","etag":null,"topics":["api-gateway","autosleep","aws","aws-ecs","cloud-engineering","cost-optimization","devops","docker","ecs-deployment","ecs-fargate","fargate","github-actions","iac","infrastructure-as-code","lambda","oidc","scale-to-zero","serverless","terraform","wake-on-demand"],"latest_commit_sha":null,"homepage":"https://api.ecs-demo.online","language":"HCL","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/rusets.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-02-21T15:35:14.000Z","updated_at":"2026-02-19T02:24:04.000Z","dependencies_parsed_at":"2025-09-22T23:24:30.378Z","dependency_job_id":"4ca8e62c-629f-47c6-ab4c-f1fdb2b06b7d","html_url":"https://github.com/rusets/docker-ecs-deployment","commit_stats":null,"previous_names":["rusets/ecs-lab"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/rusets/docker-ecs-deployment","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rusets%2Fdocker-ecs-deployment","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rusets%2Fdocker-ecs-deployment/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rusets%2Fdocker-ecs-deployment/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rusets%2Fdocker-ecs-deployment/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/rusets","download_url":"https://codeload.github.com/rusets/docker-ecs-deployment/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rusets%2Fdocker-ecs-deployment/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31554093,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-08T10:21:54.569Z","status":"ssl_error","status_checked_at":"2026-04-08T10:21:38.171Z","response_time":54,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["api-gateway","autosleep","aws","aws-ecs","cloud-engineering","cost-optimization","devops","docker","ecs-deployment","ecs-fargate","fargate","github-actions","iac","infrastructure-as-code","lambda","oidc","scale-to-zero","serverless","terraform","wake-on-demand"],"created_at":"2026-04-08T11:33:14.516Z","updated_at":"2026-04-08T11:33:15.067Z","avatar_url":"https://github.com/rusets.png","language":"HCL","readme":"# Docker ECS Deployment — Fargate + On-Demand Provisioning\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"https://img.shields.io/badge/Terraform-IaC-5C4EE5?logo=terraform\" /\u003e\n  \u003cimg src=\"https://img.shields.io/badge/AWS-ECS%20Fargate-FF9900?logo=amazon-ecs\" /\u003e\n  \u003cimg src=\"https://img.shields.io/badge/AWS-Lambda%20%2B%20API_Gateway-FF9900?logo=awslambda\" /\u003e\n  \u003cimg src=\"https://img.shields.io/badge/GitHub_Actions-CI%2FCD-2088FF?logo=github-actions\" /\u003e\n  \u003cimg src=\"https://img.shields.io/badge/Security-tflint%20%7C%20tfsec%20%7C%20checkov-2D76FF\" /\u003e\n\u003c/p\u003e\n\n\n**Wait Page:** https://api.ecs-demo.online  \n\nI built this project as a fully automated, scale-to-zero ECS Fargate environment with on-demand provisioning and automatic shutdown.\n\nThe service runs at $0 by default (`desiredCount=0`).  \nWhen a request hits the Wait Page, API Gateway triggers the Wake Lambda, which scales the ECS service to 1 task and redirects the user to the task’s public IP.  \nAfter a defined idle period, the Auto-Sleep Lambda scales the service back to `0`.\n\nThere is no ALB, no project-created Route 53 hosted zone, and no persistent compute.  \nThe stack works directly on the API Gateway endpoint, with a custom domain as an optional layer.\n\nThe architecture is intentionally minimal: API Gateway + Lambda + ECS.  \nThe goal is deterministic on-demand startup, clean infrastructure design, and the lowest possible AWS cost without sacrificing clarity or control.\n\n---\n\n## **Architecture Overview**\n\n```mermaid\nflowchart LR\n  subgraph GH[GitHub]\n    CI[CI • Build \u0026 Push to ECR\u003cbr/\u003eci.yml]\n    CD[CD • Terraform Apply \u0026 Deploy\u003cbr/\u003ecd.yml]\n    OPS[OPS • Wake / Sleep helpers\u003cbr/\u003eops.yml]\n  end\n\n  CI --\u003e ECR[(ECR repo)]\n  CD --\u003e TF[(Terraform)]\n  TF --\u003e VPC[(VPC + Subnets + SG)]\n  TF --\u003e ECS[ECS Cluster + Fargate Service]\n  TF --\u003e CWL[CloudWatch Logs]\n  TF --\u003e LWA[Lambda • Wake]\n  TF --\u003e LAS[Lambda • Auto-sleep]\n  TF --\u003e APIGW[API Gateway HTTP API]\n  TF --\u003e EVB[EventBridge Rule]\n\n  APIGW --\u003e LWA\n  EVB --\u003e LAS\n  LWA --\u003e|desiredCount=1| ECS\n  LAS --\u003e|desiredCount=0| ECS\n\n  subgraph Runtime\n    ECS --\u003e|public IP| Internet\n  end\n```\n---\n\n## OpenAPI-Driven Wake API\n\nThe wake HTTP API is defined using an **OpenAPI 3** specification located in `infra/api/openapi-wake.yaml`.\n\nTerraform consumes this spec to configure the **API Gateway HTTP API**, including routes, methods, and Lambda integration.  \nThe OpenAPI file is version-controlled alongside the infrastructure code and validated in CI.\n\nBoth the Terraform configuration and the OpenAPI spec are scanned by **Checkov**, ensuring consistent policy enforcement across infrastructure and API definitions.\n\nThis approach keeps the API contract explicit, reviewable in pull requests, and reusable across different clients or environments.\n\n---\n\n## Prerequisites\n\n- AWS account (region `us-east-1` recommended)\n- S3 bucket and DynamoDB table for Terraform remote backend  \n  (or use the configuration in `infra/backend.tf`)\n- IAM role configured for GitHub OIDC with permissions for ECR, ECS, Lambda, and Logs\n- Terraform ≥ 1.6\n- AWS CLI configured locally\n- GitHub repository with Actions enabled\n\n---\n\n## **Quick Start**\n\n### **Local Terraform Deployment**\n```bash\ncd infra\n\nterraform init\nterraform plan -out=tfplan\nterraform apply -auto-approve tfplan\n```\n\n### CI/CD Deployment (Recommended)\n\nDeployment is fully automated through GitHub Actions.\n\nWhen changes are pushed to `main`:\n\n- CI builds the Docker image from `./app`\n- The image is tagged with the **commit SHA** (immutable tag strategy)\n- The image is pushed to **Amazon ECR**\n\nThe CD workflow then:\n\n- Runs `terraform apply`\n- Registers a new ECS Task Definition referencing the SHA image\n- Updates the ECS service to the exact image version produced by CI\n- Waits until the ECS service reaches a **stable** state\n\nThis guarantees deterministic deployments and removes any dependency on mutable tags like `latest`.\n\n---\n\n## Key AWS Services Used\n\n| Service            | Role in the Architecture |\n|--------------------|--------------------------|\n| **API Gateway**    | Public HTTP endpoint defined via OpenAPI, invokes the Wake Lambda |\n| **AWS Lambda**     | Implements wake and auto-sleep logic (scales ECS service up and down) |\n| **Amazon ECS**     | Runs the containerized application as a Fargate service |\n| **AWS Fargate**    | Serverless compute layer for containers (no EC2 management) |\n| **Amazon ECR**     | Stores versioned Docker images (SHA-tagged) |\n| **Amazon VPC**     | Provides networking: public subnets, Internet Gateway, security groups |\n| **CloudWatch Logs**| Centralized logs for Lambda, API Gateway, and ECS |\n| **EventBridge**    | Scheduled trigger for the auto-sleep Lambda |\n| **S3 + DynamoDB**  | Remote Terraform state backend with locking |\n\n---\n\n## Wake / Sleep Lifecycle\n\nThe service operates in true **scale-to-zero** mode.  \nWhen idle, the ECS service remains at `desiredCount = 0` and consumes no compute resources.\n\n### Wake Flow\n\nClient → API Gateway → Wake Lambda → `ecs:UpdateService(desiredCount=1)`  \n→ Fargate task starts → Lambda waits for `RUNNING`  \n→ Browser redirects to the task public IP.\n\n### Sleep Flow\n\nEventBridge (runs every 1 minute)  \n→ Auto-Sleep Lambda checks activity  \n→ If idle, scales the service back to `desiredCount=0`.\n\n---\n\n## On-Demand Startup Challenge\n\nWhen scaling from `desiredCount=0`, early requests sometimes returned **HTTP 500**.\n\n**Cause**\n\nAPI Gateway forwarded traffic before the Fargate task was fully running and had obtained a public IP.  \nStartup time (~40 seconds) created a race condition during warm-up.\n\n**Fix**\n\nImplemented ECS task status polling inside the Wake Lambda, verified the `RUNNING` state, resolved the task public IP, and introduced a controlled warm-up window (`WAIT_MS`).\n\n**Result**\n\nDeterministic startup behavior with reliable redirects and no premature failures.\n\n---\n\n## Application Layer\n\n- **Runtime:** Node.js (Express-based HTTP service)\n- **Source directory:** `./app`\n- **Container image:** built from `./app/Dockerfile` and pushed to Amazon ECR via CI\n- **Deployment model:** single-container ECS Fargate task\n- **Port configuration:** application listens on `APP_PORT` (default: `80`)\n- **Frontend features:**\n  - Light / dark theme toggle\n  - Real-time log streaming via Server-Sent Events (SSE)\n  - Simple endpoints to generate traffic and simulate activity\n\n---\n\n### Wait Page \u0026 Frontend Flow\n\n- **Entry point:**  \n  The user accesses the public endpoint (API Gateway custom domain or default invoke URL).\n\n- **Warm-up phase:**  \n  The Wake Lambda returns a lightweight HTML response while the ECS service scales from `desiredCount=0` to `1`.\n\n- **Readiness check:**  \n  The Lambda polls ECS until the task reaches `RUNNING` state and the container becomes reachable.\n\n- **Redirect:**  \n  Once ready, the browser is redirected to the task’s public IP on `APP_PORT` (default `80`).\n\n- **Timeout protection:**  \n  If the task does not become ready within `WAIT_MS`, the request fails gracefully instead of redirecting prematurely.\n\n---\n\n## **Project Structure**\n\n```text\ndocker-ecs-deployment\n├── app/               # Node.js app (Express)\n├── wake/              # Wake Lambda (Python)\n├── autosleep/         # Auto-sleep Lambda (Python)\n├── build/             # Built Lambda ZIPs (Terraform-generated)\n├── infra/             # All Terraform infrastructure\n│   └── api/openapi-wake.yaml   # OpenAPI spec for the wake HTTP API\n├── docs/              # Architecture, ADRs, runbooks\n├── .github/           # CI/CD workflows + templates\n├── README.md\n└── LICENSE\n```\n\n---\n\n## Documentation\n\n**Docs:** [All Docs](./docs/) | [Architecture](./docs/architecture.md) | [Cost](./docs/cost.md) | [Configuration](./docs/configuration.md) | [Operational Model](./docs/operational-model.md) | [ADRs](./docs/adr/) | [Runbooks](./docs/runbooks/)\n\n---\n\n## **Common Terraform \u0026 AWS CLI Commands**\n\n### Terraform Lifecycle\n```bash\nterraform init\nterraform plan -out=tfplan\nterraform apply -auto-approve tfplan\nterraform destroy -auto-approve\n```\n\n### AWS CLI Checks\n```bash\naws ecs describe-services --cluster ecs-demo-cluster --services ecs-demo-svc --region us-east-1\naws logs tail /aws/lambda/ecs-demo-wake --follow --region us-east-1\naws logs tail /aws/lambda/ecs-demo-autosleep --follow --region us-east-1\naws events list-rules --name-prefix ecs-demo-autosleep --region us-east-1\naws ecs list-tasks --cluster ecs-demo-cluster --region us-east-1\naws ecs describe-tasks --cluster ecs-demo-cluster --tasks \u003cTASK_ID\u003e --region us-east-1\n```\n---\n\n## **Secrets Management**\n\n- Secrets are **not hardcoded** in Terraform or source code.\n- No plaintext credentials are stored in GitHub Actions.\n- Authentication uses **GitHub OIDC** → IAM role → temporary AWS credentials.\n- ECS tasks do not require static secrets (no DB, no external API tokens).\n- Lambda functions use only environment variables that contain **non-sensitive** values:\n  - `CLUSTER_NAME`\n  - `SERVICE_NAME`\n  - `SLEEP_AFTER_MINUTES`\n  - `WAIT_MS`\n\n### If secrets are needed in the future\nUse:\n- **SSM Parameter Store (SecureString)** for configuration  \n- **AWS Secrets Manager** for rotating credentials  \n- Access via:  \n  - IAM role attached to the Lambda  \n  - IAM role attached to the ECS task  \n\nThis keeps the project **fully keyless**, secure, and aligned with AWS best practices.\n  \n---\n\n## GitHub Actions Automation\n\n- **CI (`ci.yml`)**  \n  Builds Docker image, tags with commit SHA, pushes to ECR.\n\n- **CD (`cd.yml`)**  \n  Assumes AWS role via OIDC, runs `terraform apply/destroy`, registers new task definition, updates ECS service, waits for stability.\n\n- **OPS (`ops.yml`)**  \n  Manual helpers for wake (API call) and sleep (`desiredCount=0`).\n\nAll workflows use OIDC (no static AWS keys), least-privilege IAM, and deterministic SHA-based deployments.\n\n---\n\n### **Where We Consciously Accept Trade-Offs**\n\n- **No ALB (HTTP-only after wake)**  \n  Redirect goes to the task’s public IP over HTTP — avoids ~$20/mo ALB cost.\n\n- **Public-only subnets**  \n  No NAT Gateway (saves ~$32–$40/mo), but tasks must access the internet directly.\n\n- **Single-AZ architecture**  \n  Lower cost and faster provisioning, but not multi-AZ fault tolerant.\n\n- **Lambda-based warm-up logic**  \n  Slightly longer wake times vs. always-on compute — acceptable for scale-to-zero.\n\n- **Minimal logging retention**  \n  Keeps CloudWatch bill low, but long-term log history is not preserved.\n\nEach trade-off is intentional to support a **near-zero-cost, on-demand environment** suitable for demos, learning, and interviews.\n\n---\n\n## **Screenshots**\n\n###  Service Warming Up\nThe initial wake sequence — the API Gateway triggers the **Lambda \"Wake\"**, which scales the ECS service from `desiredCount=0` to `1`.\n![Warming Up](docs/readme-screenshots/1-warming-up.png)\n\n---\n\n###  Application Running\nThe application is now live and serving requests inside the **ECS Fargate** task.  \nLive metrics (uptime, memory, load average) are streamed to the UI dashboard.\n![App Running](docs/readme-screenshots/2-app-running.png)\n\n---\n\n###  ECS Service — Active\nAWS Console confirms that **1/1 tasks** are running and the service is fully active within the ECS cluster.  \nThe cluster status is **Active**, no tasks are pending.\n![ECS Active](docs/readme-screenshots/3-ecs-service-awake.png)\n\n---\n\n###  ECS Service — Autosleep Triggered\nAfter idle timeout, the **Auto-Sleep Lambda** scales the ECS service back down to `desiredCount=0`.  \nThis ensures cost-efficient operation by shutting down inactive containers.\n![ECS Sleeping](docs/readme-screenshots/4-ecs-service-sleep.png)\n\n---\n\n###  CloudWatch Logs — Autosleep Event\nCloudWatch logs confirm the autosleep action with the payload:  \n`{\"ok\": true, \"stopped\": true}` — indicating the ECS service has successfully stopped.\n![Autosleep Log](docs/readme-screenshots/5-autosleep-log.png)\n\n---\n\n## Summary\n\nThis project implements a scale-to-zero ECS Fargate architecture with deterministic on-demand startup.\n\nThe service remains at `desiredCount=0` when idle and provisions compute only when traffic arrives.  \nWake and sleep logic is implemented through Lambda, with infrastructure fully managed via Terraform and deployed through GitHub Actions.\n\nThe result is a minimal, reproducible, and cost-efficient platform that demonstrates controlled lifecycle management of containerized workloads on AWS.\n\n---\n\n## License\n\nThis project is released under the MIT License.\n\nSee the `LICENSE` file for details.","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frusets%2Fdocker-ecs-deployment","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frusets%2Fdocker-ecs-deployment","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frusets%2Fdocker-ecs-deployment/lists"}