{"id":23578119,"url":"https://github.com/ashex/supersetonaws","last_synced_at":"2026-04-27T08:32:14.542Z","repository":{"id":269221167,"uuid":"906759224","full_name":"Ashex/SupersetOnAWS","owner":"Ashex","description":"Apache Superset on AWS ECS Fargate with CDK","archived":false,"fork":false,"pushed_at":"2025-01-14T19:53:46.000Z","size":132,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-05-16T16:50:02.423Z","etag":null,"topics":["apache-superset","aws","cdk","fargate","superset"],"latest_commit_sha":null,"homepage":"","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Ashex.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2024-12-21T20:24:47.000Z","updated_at":"2025-01-13T18:02:52.000Z","dependencies_parsed_at":"2025-01-13T19:20:19.031Z","dependency_job_id":"b2312ece-1c1d-436e-ba6d-f5ee2eceb9e4","html_url":"https://github.com/Ashex/SupersetOnAWS","commit_stats":{"total_commits":8,"total_committers":1,"mean_commits":8.0,"dds":0.0,"last_synced_commit":"b2e36eabf435086c608ec05b324d82912426434b"},"previous_names":["ashex/supersetonaws"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/Ashex/SupersetOnAWS","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Ashex%2FSupersetOnAWS","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Ashex%2FSupersetOnAWS/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Ashex%2FSupersetOnAWS/rele
ases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Ashex%2FSupersetOnAWS/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Ashex","download_url":"https://codeload.github.com/Ashex/SupersetOnAWS/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Ashex%2FSupersetOnAWS/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32329463,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-26T23:26:28.701Z","status":"online","status_checked_at":"2026-04-27T02:00:06.769Z","response_time":128,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["apache-superset","aws","cdk","fargate","superset"],"created_at":"2024-12-26T22:33:16.616Z","updated_at":"2026-04-27T08:32:14.513Z","avatar_url":"https://github.com/Ashex.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Apache Superset on ECS Fargate\n\n## This isn't complicated\n\nAfter stumbling upon the [aws-ia](https://github.com/aws-ia) reference architecture for [Superset](https://aws-ia.github.io/cfn-ps-apache-superset/) which is arguably one of the worst reference architectures I have ever seen, I took it upon myself to hammer out a very basic setup on AWS which is truly serverless, scalable, and all the good buzzwords.\n\nThe main goal was to remain as automated as possible with serverless/managed 
solutions. You will probably need to adjust this for your needs; don't run this OOB in production unless you want to have a bad time.\n\nCloudFront is configured to respect cache headers but should be tuned as necessary along with caching, memory, and all the usual things a competent engineer is capable of figuring out.\n\n\n## Architecture\n\n![High-level architecture showing the different services used. A CloudFront distribution serves traffic via an ALB which routes to an ECS Fargate service which connects to Aurora Postgres and ElastiCache](./Architecture.png)\n\n## Parameters\n\n### Superset Stack\n\n| Name               | Defaults            | Type    | Description                                                                 |\n|--------------------|---------------------|---------|-----------------------------------------------------------------------------|\n| envName            | -                   | String  | The environment name (e.g., development, staging, production)               |\n| redisInstanceType  | cache.t4g.medium    | String  | The instance type for Redis                                                 |\n| auroraInstanceType | t4g.large           | String  | The instance type for Aurora, without db. 
prefix                            |\n| supersetMemoryLimit| 2048                | Number  | Memory Limit for Superset Service                                           |\n| supersetCPU        | 1024                | Number  | CPU allocation for Superset Service                                         |\n| supersetMinCapacity| 1                   | Number  | Minimum number of tasks for the service                                     |\n| supersetMaxCapacity| 2                   | Number  | Maximum number of tasks for the service                                     |\n| supersetDesiredCount| 1                  | Number  | Desired number of tasks for the service                                     |\n| r53DomainName      | none                | String  | (Optional) The Route53 DomainName to use for the CloudFront distribution    |\n| ACMCertArn         | none                | String  | (Optional) The ACM certificate arn to use for the CloudFront distribution    |\n| vpcIdParameter     | /base/network/vpcId | String  | The Parameter with VPC ID to use for the stack                              |\n| ContainerImage     | -                   | String  | The container image to use for the Superset service                         |\n| FirstRun           | false               | Boolean | If true, create the admin user and initialize the database                  |\n\n### ACM Stack\n\n| Name               | Defaults            | Type    | Description                                                                 |\n|--------------------|---------------------|---------|-----------------------------------------------------------------------------|\n| r53DomainName      | -                   | String  | The hosted zone name for the domain                                         |\n\n## Launching\n\nBuild the superset container under `src` and push it to a (ECR) repository and provide the arn including tag in `ContainerImage`.  
\nThe stack assumes the VPC ID is stored in Parameter Store, so provide the parameter name. Set `envName` appropriately, then diff, deploy, and pray.\nUsing the construct [cdk-rds-sql](https://github.com/berenddeboer/cdk-rds-sql), a dedicated user for Superset is created with the necessary permissions.\n\n## Bootstrapping\n\nWhen set to true, the `FirstRun` parameter does the following:\n\n* Adjusts the task command to create the admin user with the password admin.\n\nAfter first launch, change this setting to false.\n\n## SSL\n\nWe're doing some [cross-region magic](https://github.com/aws/aws-cdk/tree/8d55d864183803e2e6cfb3991edced7496eaadeb/packages/aws-cdk-lib#accessing-resources-in-a-different-stack-and-region)  \nto create the ACM certificate for CloudFront. The key piece to set is `r53DomainName`, and the record is constructed with it.\n\nIf for whatever reason you can't do automatic ACM certificate creation with Route 53 DNS validation, rip out the acmStack and do it the hard way.\n\n## Monitoring\n\nThere are several alarms with somewhat decent defaults. They do not go anywhere, so configure the alarm target with an SNS topic with destinations (Slack, Opsgenie, etc.).\n\nA couple of Logs Insights queries are created with the prefix `superset/`, which simply exclude the HTTP request logs for easier troubleshooting.\n\n## Scaling\n\nWhile this is a single-task deployment, you likely need to scale for query performance. To do so, you will need to add [dedicated celery workers](https://superset.apache.org/docs/configuration/async-queries-celery/) to support long-running asynchronous queries. 
You can effectively reuse the existing task definition but change the task command to the following:\n\n```bash\ncelery --app=superset.tasks.celery_app:app worker --pool=prefork -O fair -c 4\n```\n\nIf you need to have a dedicated scheduler, use the following task command:\n\n```bash\ncelery --app=superset.tasks.celery_app:app beat\n```\n\nHowever, review the linked documentation, as CeleryConfig should be adjusted to your requirements (if you want to stick with one image build, consider adding config files that can be imported based on a type variable to override config).\n\n## Shout Outs\n\n* Anil Augustine Chalissery for his AWS in Plain English [post](https://aws.plainenglish.io/how-to-deploy-apache-superset-on-aws-ecs-08da76bedd32) that got me started.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fashex%2Fsupersetonaws","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fashex%2Fsupersetonaws","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fashex%2Fsupersetonaws/lists"}