https://github.com/longnguyen010203/spark-processing-aws
Set up and build a big data processing pipeline with Apache Spark and AWS services (S3, EMR, EC2, IAM, VPC, Redshift), using Terraform to set up the infrastructure and Airflow to automate workflows.
- Host: GitHub
- URL: https://github.com/longnguyen010203/spark-processing-aws
- Owner: longNguyen010203
- License: apache-2.0
- Created: 2024-06-18T16:52:12.000Z (5 months ago)
- Default Branch: main
- Last Pushed: 2024-07-12T10:20:19.000Z (4 months ago)
- Last Synced: 2024-10-12T00:02:08.838Z (about 1 month ago)
- Topics: apache-airflow, apache-spark, aws, aws-ec2, aws-s3, aws-services, cloud-computing, data-pipeline, emr-cluster, iam, pyspark, redshift, spark-cluster, spark-master, spark-worker, terraform
- Language: Python
- Homepage:
- Size: 1010 KB
- Stars: 2
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Spark-Processing-AWS
In this project, I set up and build a big data processing pipeline using Apache Spark integrated with various AWS services (S3, EMR, EC2, VPC, IAM, and Redshift), with Terraform used to set up the infrastructure.
## About Project
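
As a rough illustration (not the project's actual code), a Spark job in a pipeline like this is commonly handed to an EMR cluster as a "step" that runs `spark-submit`. The sketch below only builds that step definition, in the shape that boto3's `add_job_flow_steps` accepts. The function name `build_spark_step`, the bucket `example-bucket`, and the script paths are hypothetical and do not come from the repository.

```python
from typing import Dict, List, Optional


def build_spark_step(
    name: str,
    script_uri: str,
    script_args: Optional[List[str]] = None,
    deploy_mode: str = "cluster",
) -> Dict:
    """Return an EMR step definition that runs a PySpark script via spark-submit."""
    # Assumption: the job script has already been uploaded to S3.
    if not script_uri.startswith("s3://"):
        raise ValueError("script_uri is expected to be an s3:// URI")

    # command-runner.jar executes the given command on the cluster's primary node.
    args = ["spark-submit", "--deploy-mode", deploy_mode, script_uri]
    args.extend(script_args or [])

    return {
        "Name": name,
        "ActionOnFailure": "CONTINUE",  # keep the cluster running if the step fails
        "HadoopJarStep": {
            "Jar": "command-runner.jar",
            "Args": args,
        },
    }


if __name__ == "__main__":
    # Hypothetical bucket and paths, for illustration only.
    step = build_spark_step(
        name="process-raw-data",
        script_uri="s3://example-bucket/scripts/process.py",
        script_args=[
            "--input", "s3://example-bucket/raw/",
            "--output", "s3://example-bucket/processed/",
        ],
    )
    print(step)
```

With boto3 installed and AWS credentials configured, a step built this way could be submitted with `boto3.client("emr").add_job_flow_steps(JobFlowId="<cluster-id>", Steps=[step])`. An Airflow task could make the same call to automate the workflow.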
## Technologies
- `S3`
- `EMR`
- `EC2`
- `Airflow`
- `Redshift`
- `Terraform`
- `Spark`
- `VPC`
- `IAM`
## Features
## The Process
## What I Learned
## How can it be improved?
## Running the Project