{"id":21847540,"url":"https://github.com/scribd/terraform-aws-recycle-eks","last_synced_at":"2025-04-14T13:33:27.712Z","repository":{"id":44169828,"uuid":"298117633","full_name":"scribd/terraform-aws-recycle-eks","owner":"scribd","description":"Terraform module for automatically recycling EKS worker nodes","archived":false,"fork":false,"pushed_at":"2024-07-18T05:49:48.000Z","size":312,"stargazers_count":7,"open_issues_count":4,"forks_count":3,"subscribers_count":9,"default_branch":"main","last_synced_at":"2025-03-28T02:45:36.485Z","etag":null,"topics":["aws","cplat","eks","terraform"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/scribd.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":"CODEOWNERS","security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-09-23T23:23:01.000Z","updated_at":"2024-12-16T17:14:04.000Z","dependencies_parsed_at":"2024-06-20T00:54:10.761Z","dependency_job_id":"bbd98f1b-2c24-43cf-ad18-0767ab55b6cd","html_url":"https://github.com/scribd/terraform-aws-recycle-eks","commit_stats":null,"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/scribd%2Fterraform-aws-recycle-eks","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/scribd%2Fterraform-aws-recycle-eks/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/scribd%2Fterraform-aws-recycle-eks/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/scribd%2Fterraform
-aws-recycle-eks/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/scribd","download_url":"https://codeload.github.com/scribd/terraform-aws-recycle-eks/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248888914,"owners_count":21178128,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["aws","cplat","eks","terraform"],"created_at":"2024-11-27T23:18:31.279Z","updated_at":"2025-04-14T13:33:27.686Z","avatar_url":"https://github.com/scribd.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# terraform-aws-recycle-eks\n\nThis Terraform module recycles EKS worker nodes. Its high-level functionality is explained below:\n - Creates a Step Function that consists of 4 Lambdas. The Step Function handles the transfer of inputs across the Lambda functions.\n - The first Lambda takes an instance ID as input and puts that instance into the standby state. It uses the Auto Scaling API to automatically add a new instance to the group while moving the old instance to standby. The old instance enters the \"Standby\" state only once the new instance is fully \"InService\".\n - Taint this \"Standby\" node in EKS using the K8S API in a Lambda to prevent new pods from being scheduled onto this node.\n - Periodically use the K8S API to check the status of “stateful” pods on that node, based on the label selector provided. 
A separate Lambda does this.\n - Once all stateful pods have completed on the node, i.e. the number of running pods has reached 0, shut down that standby instance using the AWS SDK via a Lambda. We do not terminate the node, only shut it down, just in case. In future releases, we will start terminating the nodes.\n \n\n## TODO:\n - Check that the new node is in service before putting the existing node into standby. Right now we sleep for 300 seconds instead.\n - Refactor the code for getting the access token into a common module.\n - Better logging and exception handling.\n - Make use of the namespace input when selecting pods. Currently pods in all namespaces are checked.\n - Find a Terraform way to edit configmap/aws-auth; this step is still manual and is required for the module to work.\n\nThere are two main components:\n\n1. Lambdas\n2. Step Function in AWS, to chain the Lambdas and pass parameters from one Lambda to another\n\n\n## Usage\n\n\n```\nmodule \"recycl-eks-worker-node\" {\n  source = \"git::git@github.com:scribd/terraform-aws-recycle-eks.git\"\n  name                   = \"string\"\n  tags                            = {\n    Environment = \"dev\"\n    Terraform   = \"true\"\n  }\n  vpc_subnet_ids         = [\"subnet-12345678\", \"subnet-87654321\"]\n  vpc_security_group_ids = [\"sg-12345678\"]\n  aws_region             = \"us-east-2\"\n  namespace = \"your pod namespace\" # As of now this is just a placeholder; pods in all namespaces are checked\n\n}\n\n```\nAfter applying the module, run `kubectl edit -n kube-system configmap/aws-auth` and add the following:\n```\nmapRoles: | \n# ...\n    - rolearn: \u003cIAM role for the Lambda execution\u003e\n      username: lambda\n\n```\nYou can get the IAM role for the Lambda execution from the \"lambda_exec_arn\" output variable of this module.\n\n## Running of step function\n\n```\nThe step function takes a JSON input:\n\n{\n    \"instance_id\": \"i-1234567890\",\n    \"cluster_name\": 
\"eks-cluster-name-where-the-instance-belongs-to\",\n    \"label_selector\":\"airflow_version=1.2.3,airflow-worker\" # comma-separated list of labels, each either key=value or key only\n}\nThe step function waits only on pods matching this label selector; all other pods are ignored.\n\n```\n## Sample Output of a step function\n\n![](images/Step-Function-sample-output.png)\n\n## Development\n\nReleases are cut using [semantic-release](https://github.com/semantic-release/semantic-release).\n\nPlease write commit messages following the [Angular commit guidelines](https://github.com/angular/angular.js/blob/master/DEVELOPERS.md#-git-commit-guidelines).\n\n\n### Release flow\n\nSemantic-release is configured with the [default branch workflow](https://semantic-release.gitbook.io/semantic-release/usage/configuration#branches).\n\nFor this project, releases will be cut from master as features are developed and bugs are fixed.\n\n\n### Maintainers\n- [Kuntal](https://github.com/kuntalkumarbasu)\n\n### Reference\n- There is an excellent module on [Gracefully drain Kubernetes pods from EKS worker nodes during autoscaling scale-in events](https://github.com/aws-samples/amazon-k8s-node-drainer). We followed some of its principles in the Lambdas.\n- [Orchestrating Amazon Kubernetes Service (EKS) from AWS Lambda](https://medium.com/@alejandro.millan.frias/managing-kubernetes-from-aws-lambda-7922c3546249) is another writeup that we referenced while connecting to EKS from Lambda.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fscribd%2Fterraform-aws-recycle-eks","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fscribd%2Fterraform-aws-recycle-eks","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fscribd%2Fterraform-aws-recycle-eks/lists"}