https://github.com/longnguyen010203/spark-processing-aws

👷🌇 Set up and build a big data processing pipeline with Apache Spark and 📦 AWS services (S3, EMR, EC2, IAM, VPC, Redshift), using Terraform to set up the infrastructure and Airflow to automate workflows 🥊

apache-airflow apache-spark aws aws-ec2 aws-s3 aws-services cloud-computing data-pipeline emr-cluster iam pyspark redshift spark-cluster spark-master spark-worker terraform


# 👷 Spark-Processing-AWS
In this project, I set up and build a big data processing pipeline using Apache Spark integrated with several AWS services (S3, EMR, EC2, VPC, IAM, and Redshift), with Terraform provisioning the infrastructure.

## 🔦 About Project

## 📦 Technologies
- `S3`
- `EMR`
- `EC2`
- `Airflow`
- `Redshift`
- `Terraform`
- `Spark`
- `VPC`
- `IAM`
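
To illustrate how these pieces could fit together, here is a minimal sketch of an EMR step definition that runs a PySpark script with `spark-submit`. It is not taken from this repository: the function name `build_spark_step`, the bucket `example-bucket`, and all S3 paths are hypothetical placeholders, and the script is assumed to take an input and an output location as its two arguments.

```python
import json


def build_spark_step(script_uri, input_uri, output_uri, name="spark-processing"):
    """Build an EMR step definition that runs a PySpark script via spark-submit.

    All URIs are assumed to be S3 locations; nothing is sent to AWS here.
    """
    return {
        "Name": name,
        # Keep the cluster running if the step fails (an assumption for this sketch).
        "ActionOnFailure": "CONTINUE",
        "HadoopJarStep": {
            # command-runner.jar lets an EMR step run a command such as spark-submit.
            "Jar": "command-runner.jar",
            "Args": [
                "spark-submit",
                "--deploy-mode", "cluster",
                script_uri,
                input_uri,
                output_uri,
            ],
        },
    }


if __name__ == "__main__":
    # Hypothetical bucket and paths, for illustration only.
    step = build_spark_step(
        "s3://example-bucket/scripts/job.py",
        "s3://example-bucket/raw/",
        "s3://example-bucket/processed/",
    )
    print(json.dumps(step, indent=2))
```

A dictionary of this shape is what the boto3 EMR client's `add_job_flow_steps` call accepts in its `Steps` list, and what Airflow's `EmrAddStepsOperator` takes in its `steps` argument, so the same definition could be reused when automating the workflow.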

## 🦄 Features
## 👩🏽‍🍳 The Process
## 📚 What I Learned
## 💭 How can it be improved?
## 🚦 Running the Project