https://github.com/pprzetacznik/datalake-aws

Sample data lake pipeline on AWS implemented using Terraform
aws csv datalake parquet python terraform


# AWS Data Lake example

## Diagram

![AWS Data Lake diagram](diagram/aws_data_lake.png "Data Lake")

## Terraform workspaces structure

### persistance

This layer is extracted from the `datalake` workspace so that data is preserved when the serverless infrastructure is refactored.

### iam

IAM users and roles can be created before the data lake infrastructure is set up.

### datalake

This infrastructure accounts for the majority of the billing costs and can be deleted and restored at any time.
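
The split between the three workspaces can be illustrated with a small Python wrapper around the Terraform CLI. This is only a sketch, not part of the repository. It assumes each workspace lives in its own directory named after the headings above, and that `persistance` and `iam` are applied before `datalake`. The helper names (`terraform_command`, `plan_commands`, `run`) are hypothetical. The `-chdir` option is a standard Terraform CLI global option (Terraform 0.14 and later).

```python
import subprocess

# Assumption: one directory per workspace, named after the headings above.
# The apply order is also an assumption; the README only says IAM can come first.
APPLY_ORDER = ["persistance", "iam", "datalake"]

# Only the datalake layer is described as safe to delete and restore.
DISPOSABLE = ["datalake"]


def terraform_command(workspace_dir, action):
    """Build a Terraform CLI invocation for one workspace directory."""
    if action not in ("apply", "destroy"):
        raise ValueError(f"unsupported action: {action}")
    if action == "destroy" and workspace_dir not in DISPOSABLE:
        # Guard the layers that hold data and identities.
        raise ValueError(f"refusing to destroy {workspace_dir}")
    return ["terraform", f"-chdir={workspace_dir}", action]


def plan_commands(action):
    """List the commands to bring everything up, or to tear down the disposable layer."""
    dirs = APPLY_ORDER if action == "apply" else DISPOSABLE
    return [terraform_command(d, action) for d in dirs]


def run(action):
    """Execute the planned commands; Terraform still asks for confirmation itself."""
    for cmd in plan_commands(action):
        subprocess.run(cmd, check=True)
```

Calling `run("destroy")` would remove only the `datalake` layer and leave stored data and IAM resources in place. `run("apply")` would then restore it.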

## TODO

* Cleaning the code
* Extracting code from workspaces' main.tf files to modules
* Versioning of modules through git tags