Projects in Awesome Lists tagged with spark-cluster
A curated list of projects in awesome lists tagged with spark-cluster.
https://github.com/mgarralda/hadoop-spark-cluster
Repository containing Docker images for creating a Spark cluster on Hadoop YARN.
hadoop-hdfs spark spark-cluster spark-hadoop spark-hadoop-docker spark-yarn-docker
Last synced: 26 Apr 2025
https://github.com/aixhunter/spark-k8s-pod-template
Steps to deploy a Spark app to a Kubernetes cluster using spark-submit or a pod template
k8s kubernetes pod spark spark-cluster spark-submit
Last synced: 08 May 2025
https://github.com/longnguyen010203/spark-processing-aws
👷🌇 Set up and build a big data processing pipeline with Apache Spark and 📦 AWS services (S3, EMR, EC2, IAM, VPC, Redshift), using Terraform to set up the infrastructure and an Airflow integration to automate workflows 🥊
apache-airflow apache-spark aws aws-ec2 aws-s3 aws-services cloud-computing data-pipeline emr-cluster iam pyspark redshift spark-cluster spark-master spark-worker terraform
Last synced: 11 Mar 2025
https://github.com/aimanamri/raspberry-pi4-hadoop-spark-cluster
Self-documentation of learning distributed data storage, parallel processing, and Linux OS using Apache Hadoop, Apache Spark, and Raspbian OS. In this project, a 3-node cluster is set up using Raspberry Pi 4, HDFS is installed, and Spark processing jobs are run via YARN.
big-data distributed-storage hadoop-cluster hdfs parallel-processing pyspark raspberry-pi-4 spark-cluster spark-shell yarn
Last synced: 09 Apr 2025
https://github.com/kumarvna/terraform-azurerm-hdinsight
Terraform module to create Azure HDInsight, a managed, full-spectrum, open-source analytics service. This module creates Apache Hadoop, Apache Spark, Apache HBase, Interactive Query (Apache Hive LLAP), and Apache Kafka clusters.
apache-hive-cluster azure azure-hdinsight hadoop-cluster hadoop-filesystem hadoop-hdfs hbase-cluster hdinsight-cluster hdinsight-hadoop-cluster hdinsight-hbase-cluster hdinsight-interactive-query-cluster hdinsight-kafka-cluster hdinsight-spark-cluster kafka-cluster spark-cluster spark-clusters terraform terraform-module
Last synced: 14 Apr 2025
https://github.com/turnipdo/spark-standalone-cluster-setup
To facilitate the initial setup of Apache Spark, this repository provides a beginner-friendly, step-by-step guide on setting up a master node and two worker nodes.
Last synced: 12 Apr 2025
https://github.com/flaviostutz/spark-submit-scala
Spark submit extension from bde2020/spark-submit for Scala with SBT
bigdata sbt scala spark spark-cluster spark-submit
Last synced: 31 Mar 2025
https://github.com/euiyounghwang/spark_job_interface_service
spark_job_interface_service
fastapi spark spark-cluster spark-jobs
Last synced: 11 Mar 2025
https://github.com/minsusun/deploy-spark-cluster
Configs for deploying Spark clusters on Docker and k8s!
docker docker-compose k8s spark-cluster
Last synced: 17 Mar 2025