{"id":14957889,"url":"https://github.com/powerdatahub/terraform-aws-airflow","last_synced_at":"2025-07-23T05:33:23.361Z","repository":{"id":34619179,"uuid":"177014228","full_name":"PowerDataHub/terraform-aws-airflow","owner":"PowerDataHub","description":"Terraform module to deploy an Apache Airflow cluster on AWS, backed by RDS PostgreSQL for metadata, S3 for logs and SQS as message broker with CeleryExecutor","archived":false,"fork":false,"pushed_at":"2023-01-13T19:02:04.000Z","size":719,"stargazers_count":84,"open_issues_count":21,"forks_count":40,"subscribers_count":4,"default_branch":"master","last_synced_at":"2025-06-15T22:07:01.801Z","etag":null,"topics":["airflow","apache-airflow","aws","celery","hacktoberfest","terraform","terraform-module","terraform-modules"],"latest_commit_sha":null,"homepage":"","language":"HCL","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/PowerDataHub.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2019-03-21T19:52:14.000Z","updated_at":"2024-01-29T17:24:08.000Z","dependencies_parsed_at":"2023-01-15T08:07:00.419Z","dependency_job_id":null,"html_url":"https://github.com/PowerDataHub/terraform-aws-airflow","commit_stats":null,"previous_names":[],"tags_count":91,"template":false,"template_full_name":null,"purl":"pkg:github/PowerDataHub/terraform-aws-airflow","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PowerDataHub%2Fterraform-aws-airflow","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PowerDataHub%2Fterraform-aws-airflow/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PowerDataHub%2Fterraform-aws-a
irflow/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PowerDataHub%2Fterraform-aws-airflow/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/PowerDataHub","download_url":"https://codeload.github.com/PowerDataHub/terraform-aws-airflow/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/PowerDataHub%2Fterraform-aws-airflow/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":266624721,"owners_count":23958299,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-07-23T02:00:09.312Z","response_time":66,"last_error":null,"robots_txt_status":null,"robots_txt_updated_at":null,"robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["airflow","apache-airflow","aws","celery","hacktoberfest","terraform","terraform-module","terraform-modules"],"created_at":"2024-09-24T13:15:46.536Z","updated_at":"2025-07-23T05:33:23.334Z","avatar_url":"https://github.com/PowerDataHub.png","language":"HCL","funding_links":[],"categories":[],"sub_categories":[],"readme":"[![Maintained by Powerdatahub.com](https://img.shields.io/badge/maintained%20by-powerdatahub.com-%233D4DFE.svg?style=for-the-badge)](https://powerdatahub.com/?ref=repo_aws_airflow) [![Apache Airflow 1.10.11](https://img.shields.io/badge/Apache%20Airflow-1.10.11-3D4DFE.svg?style=for-the-badge)](https://github.com/apache/airflow/)\n\n# Airflow AWS Module\nTerraform module to deploy an [Apache 
Airflow](https://airflow.apache.org/) cluster on AWS, backed by RDS PostgreSQL for metadata, [S3](https://aws.amazon.com/s3/) for logs and [SQS](https://aws.amazon.com/sqs/) as message broker with CeleryExecutor\n\n\u003cimg src=\"https://raw.githubusercontent.com/PowerDataHub/terraform-aws-airflow/master/terraform-aws-airflow.png?raw\" align=\"center\" width=\"100%\" /\u003e\n\n### Terraform supported versions:\n\n| Terraform version | Tag  |\n|-------------------|------|\n| \u003c= 0.11              | v0.7.x|\n| \u003e= 0.12              | \u003e= v0.8.x|\n\n## Usage\n\nYou can use this module from the [Terraform Registry](https://registry.terraform.io/modules/powerdatahub/airflow/aws/)\n\n```terraform\nmodule \"airflow-cluster\" {\n  # REQUIRED\n  source                   = \"powerdatahub/airflow/aws\"\n  key_name                 = \"airflow-key\"\n  cluster_name             = \"my-airflow\"\n  cluster_stage            = \"prod\" # Default is 'dev'\n  db_password              = \"your-rds-master-password\"\n  fernet_key               = \"your-fernet-key\" # see https://airflow.readthedocs.io/en/stable/howto/secure-connections.html\n\n  # OPTIONALS\n  vpc_id                   = \"some-vpc-id\"                     # Use default if not provided\n  custom_requirements      = \"path/to/custom/requirements.txt\" # See examples/custom_requirements for more details\n  custom_env               = \"path/to/custom/env\"              # See examples/custom_env for more details\n  ingress_cidr_blocks      = [\"0.0.0.0/0\"]                     # List of IPv4 CIDR ranges to use on all ingress rules\n  ingress_with_cidr_blocks = [                                 # List of computed ingress rules to create where 'cidr_blocks' is used\n    {\n      description = \"List of computed ingress rules for Airflow webserver\"\n      from_port   = 8080\n      to_port     = 8080\n      protocol    = \"tcp\"\n      cidr_blocks = \"0.0.0.0/0\"\n    },\n    {\n      description = \"List of 
computed ingress rules for Airflow flower\"\n      from_port   = 5555\n      to_port     = 5555\n      protocol    = \"tcp\"\n      cidr_blocks = \"0.0.0.0/0\"\n    }\n  ]\n  tags                     = {\n    FirstKey  = \"first-value\"                                  # Additional tags to use on resources\n    SecondKey = \"second-value\"\n  }\n  load_example_dags        = false\n  load_default_conns       = false\n  rbac                     = true                              # See examples/rbac for more details\n  admin_name               = \"John\"                            # Only if rbac is true\n  admin_lastname           = \"Doe\"                             # Only if rbac is true\n  admin_email              = \"admin@admin.com\"                 # Only if rbac is true\n  admin_username           = \"admin\"                           # Only if rbac is true\n  admin_password           = \"supersecretpassword\"             # Only if rbac is true\n}\n```\n\n## Debug and logs\n\nThe Airflow service runs under systemd, so logs are available through journalctl.\n\n`$ journalctl -u airflow -n 50`\n\n## Todo\n\n- [x] Run airflow as systemd service\n- [x] Provide a way to pass a custom requirements.txt file on provision step\n- [ ] Provide a way to pass a custom packages.txt file on provision step\n- [x] RBAC\n- [ ] Support for [Google OAuth](https://airflow.readthedocs.io/en/latest/security.html#google-authentication)\n- [ ] Flower\n- [ ] Secure Flower install\n- [x] Provide a way to inject environment variables into airflow\n- [ ] Split services into multiple files\n- [ ] Auto Scaling for workers\n- [ ] Use SPOT instances for workers\n- [ ] Maybe use [AWS Fargate](https://aws.amazon.com/fargate/) to reduce costs\n\n---\n\nSpecial thanks to [villasv/aws-airflow-stack](https://github.com/villasv/aws-airflow-stack), an incredible project, for the inspiration.\n\n---\n\n\u003c!-- BEGINNING OF PRE-COMMIT-TERRAFORM DOCS HOOK --\u003e\n## Requirements\n\n| Name | 
Version |\n|------|---------|\n| terraform | \u003e= 0.12 |\n\n## Providers\n\n| Name | Version |\n|------|---------|\n| aws | n/a |\n| template | n/a |\n\n## Inputs\n\n| Name | Description | Type | Default | Required |\n|------|-------------|------|---------|:--------:|\n| admin\\_email | Admin email. Used only if RBAC is enabled; this user is created on the first run only. | `string` | `\"admin@admin.com\"` | no |\n| admin\\_lastname | Admin lastname. Used only if RBAC is enabled; this user is created on the first run only. | `string` | `\"Doe\"` | no |\n| admin\\_name | Admin name. Used only if RBAC is enabled; this user is created on the first run only. | `string` | `\"John\"` | no |\n| admin\\_password | Admin password. Used only if RBAC is enabled. | `string` | `false` | no |\n| admin\\_username | Admin username used to authenticate. Used only if RBAC is enabled; this user is created on the first run only. | `string` | `\"admin\"` | no |\n| ami | Default is `Ubuntu Server 18.04 LTS (HVM), SSD Volume Type.` | `string` | `\"ami-0ac80df6eff0e70b5\"` | no |\n| aws\\_region | AWS Region | `string` | `\"us-east-1\"` | no |\n| azs | Run the EC2 Instances in these Availability Zones | `map(string)` | \u003cpre\u003e{\u003cbr\u003e  \"1\": \"us-east-1a\",\u003cbr\u003e  \"2\": \"us-east-1b\",\u003cbr\u003e  \"3\": \"us-east-1c\",\u003cbr\u003e  \"4\": \"us-east-1d\"\u003cbr\u003e}\u003c/pre\u003e | no |\n| cluster\\_name | The name of the Airflow cluster (e.g. airflow-xyz). This variable is used to namespace all resources created by this module. | `string` | n/a | yes |\n| cluster\\_stage | The stage of the Airflow cluster (e.g. prod). | `string` | `\"dev\"` | no |\n| custom\\_env | Path to custom airflow environment variables. | `string` | `null` | no |\n| custom\\_requirements | Path to custom requirements.txt. | `string` | `null` | no |\n| db\\_allocated\\_storage | Database disk size. | `string` | `20` | no |\n| db\\_dbname | PostgreSQL database name. 
| `string` | `\"airflow\"` | no |\n| db\\_instance\\_type | Instance type for PostgreSQL database | `string` | `\"db.t2.micro\"` | no |\n| db\\_password | PostgreSQL password. | `string` | n/a | yes |\n| db\\_subnet\\_group\\_name | DB subnet group name. If set, the database is created in that subnet group; otherwise it is created in the default VPC. | `string` | `\"\"` | no |\n| db\\_username | PostgreSQL username. | `string` | `\"airflow\"` | no |\n| fernet\\_key | Key for encrypting data in the database - see Airflow docs. | `string` | n/a | yes |\n| ingress\\_cidr\\_blocks | List of IPv4 CIDR ranges to use on all ingress rules | `list(string)` | \u003cpre\u003e[\u003cbr\u003e  \"0.0.0.0/0\"\u003cbr\u003e]\u003c/pre\u003e | no |\n| ingress\\_with\\_cidr\\_blocks | List of computed ingress rules to create where 'cidr\\_blocks' is used | \u003cpre\u003elist(object({\u003cbr\u003e    description = string\u003cbr\u003e    from_port   = number\u003cbr\u003e    to_port     = number\u003cbr\u003e    protocol    = string\u003cbr\u003e    cidr_blocks = string\u003cbr\u003e  }))\u003c/pre\u003e | \u003cpre\u003e[\u003cbr\u003e  {\u003cbr\u003e    \"cidr_blocks\": \"0.0.0.0/0\",\u003cbr\u003e    \"description\": \"Airflow webserver\",\u003cbr\u003e    \"from_port\": 8080,\u003cbr\u003e    \"protocol\": \"tcp\",\u003cbr\u003e    \"to_port\": 8080\u003cbr\u003e  },\u003cbr\u003e  {\u003cbr\u003e    \"cidr_blocks\": \"0.0.0.0/0\",\u003cbr\u003e    \"description\": \"Airflow flower\",\u003cbr\u003e    \"from_port\": 5555,\u003cbr\u003e    \"protocol\": \"tcp\",\u003cbr\u003e    \"to_port\": 5555\u003cbr\u003e  }\u003cbr\u003e]\u003c/pre\u003e | no |\n| instance\\_subnet\\_id | Subnet ID used for the EC2 instances running Airflow. If not defined, the first subnet in the VPC's subnet list is used. | `string` | `\"\"` | no |\n| key\\_name | AWS KeyPair name. | `string` | `null` | no |\n| load\\_default\\_conns | Load the default connections initialized by Airflow. 
Most consider these unnecessary, which is why the default is to not load them. | `bool` | `false` | no |\n| load\\_example\\_dags | Load the example DAGs distributed with Airflow. Useful if deploying a stack for demonstrating a few topologies, operators and scheduling strategies. | `bool` | `false` | no |\n| private\\_key | Enter the content of the SSH Private Key to run provisioner. | `string` | `null` | no |\n| private\\_key\\_path | Enter the path to the SSH Private Key to run provisioner. | `string` | `\"~/.ssh/id_rsa\"` | no |\n| public\\_key | Enter the content of the SSH Public Key to run provisioner. | `string` | `null` | no |\n| public\\_key\\_path | Enter the path to the SSH Public Key to add to AWS. | `string` | `\"~/.ssh/id_rsa.pub\"` | no |\n| rbac | Enable support for Role-Based Access Control (RBAC). | `string` | `false` | no |\n| root\\_volume\\_delete\\_on\\_termination | Whether the volume should be destroyed on instance termination. | `bool` | `true` | no |\n| root\\_volume\\_ebs\\_optimized | If true, the launched EC2 instance will be EBS-optimized. | `bool` | `false` | no |\n| root\\_volume\\_size | The size, in GB, of the root EBS volume. | `string` | `35` | no |\n| root\\_volume\\_type | The type of volume. Must be one of: standard, gp2, or io1. | `string` | `\"gp2\"` | no |\n| s3\\_bucket\\_name | S3 Bucket to save airflow logs. | `string` | `\"\"` | no |\n| scheduler\\_instance\\_type | Instance type for the Airflow Scheduler. | `string` | `\"t3.micro\"` | no |\n| spot\\_price | The maximum hourly price to pay for EC2 Spot Instances. | `string` | `\"\"` | no |\n| tags | Additional tags used into terraform-terraform-labels module. | `map(string)` | `{}` | no |\n| vpc\\_id | The ID of the VPC in which the nodes will be deployed.  Uses default VPC if not supplied. | `string` | `null` | no |\n| webserver\\_instance\\_type | Instance type for the Airflow Webserver. 
| `string` | `\"t3.micro\"` | no |\n| webserver\\_port | The port Airflow webserver will be listening. Ports below 1024 can be opened only with root privileges and the airflow process does not run as such. | `string` | `\"8080\"` | no |\n| worker\\_instance\\_count | Number of worker instances to create. | `string` | `1` | no |\n| worker\\_instance\\_type | Instance type for the Celery Worker. | `string` | `\"t3.small\"` | no |\n\n## Outputs\n\n| Name | Description |\n|------|-------------|\n| database\\_endpoint | Endpoint to connect to RDS metadata DB |\n| database\\_username | Username to connect to RDS metadata DB |\n| this\\_cluster\\_security\\_group\\_id | The ID of the security group |\n| this\\_database\\_security\\_group\\_id | The ID of the security group |\n| webserver\\_admin\\_url | Url for the Airflow Webserver Admin |\n| webserver\\_public\\_ip | Public IP address for the Airflow Webserver instance |\n\n\u003c!-- END OF PRE-COMMIT-TERRAFORM DOCS HOOK --\u003e\n\n---\n[![forthebadge](https://forthebadge.com/images/badges/powered-by-netflix.svg)](https://forthebadge.com) [![forthebadge](https://forthebadge.com/images/badges/contains-cat-gifs.svg)](https://forthebadge.com) [![forthebadge](https://forthebadge.com/images/badges/60-percent-of-the-time-works-every-time.svg)](https://forthebadge.com)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpowerdatahub%2Fterraform-aws-airflow","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpowerdatahub%2Fterraform-aws-airflow","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpowerdatahub%2Fterraform-aws-airflow/lists"}