# dbt-serverless
Run dbt serverless in the Cloud (AWS)

## Requirements
* AWS credentials configured in `~/.aws/credentials` (see the example below)
* AWS CLI

```
pip install awscli
```

* terraform
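
A minimal way to populate `~/.aws/credentials` is the AWS CLI's interactive `configure` command; the profile name here is just a placeholder:

```
aws configure --profile your_profile
```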

## Deploy
The infrastructure is managed with Terraform.
I set up a Terraform backend to keep the Terraform state. The backend is based on an S3 bucket that was created manually.
You can create an S3 bucket by simply running:


```
aws s3api create-bucket --bucket nicor88-eu-west-1-terraform --region eu-west-1 --create-bucket-configuration LocationConstraint=eu-west-1
```

Remember to change the name of the S3 bucket inside `infrastructure/provider.tf` before running the following commands:

```
export AWS_PROFILE=your_profile
make infra-plan
make infra-apply
```
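
The `make` targets presumably wrap plain Terraform commands; if you prefer to run Terraform directly, something along these lines should be equivalent (the exact flags in the Makefile are an assumption):

```
cd infrastructure
terraform init    # initialises the S3 backend configured in provider.tf
terraform plan    # roughly what `make infra-plan` runs
terraform apply   # roughly what `make infra-apply` runs
```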

After the infrastructure is created correctly, you can push a new image to the ECR repository by running:


```
make push-to-ecr AWS_ACCOUNT_ID=your_account_id
```
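
Under the hood, `push-to-ecr` presumably builds the dbt Docker image and pushes it to the ECR repository created by Terraform. A manual equivalent would look roughly like this (repository name, region and account ID are placeholders):

```
# authenticate Docker against ECR
aws ecr get-login-password --region eu-west-1 | docker login --username AWS --password-stdin 123456789012.dkr.ecr.eu-west-1.amazonaws.com
# build, tag and push the image (repository name is an assumption)
docker build -t dbt-serverless .
docker tag dbt-serverless:latest 123456789012.dkr.ecr.eu-west-1.amazonaws.com/dbt-serverless:latest
docker push 123456789012.dkr.ecr.eu-west-1.amazonaws.com/dbt-serverless:latest
```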

### Note
Currently Aurora Postgres is only accessible inside the VPC.
I created a Network Load Balancer to connect to the DB from anywhere, but you need to get the private IP of the Aurora endpoint.
You can simply run:


```
nslookup your_aurora_endpoint
# the endpoint is returned from the terraform outputs
```

Then you need to replace these two variables:
* `autora_postgres_serverless_private_ip_1`
* `autora_postgres_serverless_private_ip_2`

and apply the changes again with `make infra-apply`.

## Infrastructure

### AWS Step Function

#### Input example


```
{
  "commands1": [
    "dbt",
    "run",
    "--models",
    "example"
  ],
  "commands2": [
    "dbt",
    "run",
    "--models",
    "just_another_example"
  ]
}
```
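
To trigger a run with an input like the one above, you can start an execution of the state machine via the AWS CLI; the state machine ARN below is a placeholder:

```
aws stepfunctions start-execution \
  --state-machine-arn arn:aws:states:eu-west-1:123456789012:stateMachine:dbt-serverless \
  --input file://input.json
```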

## Airflow operator
It's also possible to invoke ECS Fargate containers to run dbt from Airflow.
Here is an example of how to call the DbtOperator from Airflow:


```
dbt_run_example = DbtOperator(
    dag=dag,
    task_id='dbt_example',
    command='run',
    target='dev',
    dbt_models='my_example',
    subnets=['subnet_id_1', 'subnet_id_2'],
    security_groups=['sg_1']
)
```