Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Projects in Awesome Lists tagged with hadoop-cluster
A curated list of projects in awesome lists tagged with hadoop-cluster.
https://github.com/big-data-europe/docker-hadoop
Apache Hadoop docker image
docker docker-hadoop hadoop hadoop-cluster hadoop-docker
Last synced: 14 Oct 2024
https://github.com/wittline/apache-spark-docker
Dockerizing an Apache Spark Standalone Cluster
apache-spark dataengineer dataengineering docker docker-compose hadoop-cluster hadoop-docker hdfs hive hive-metastore hue pyspark
Last synced: 14 Oct 2024
https://github.com/mikeroyal/apache-ignite-guide
Apache Ignite Guide
data-science database hadoop hadoop-cluster ignite nosql nosql-data-storage nosql-databases stream-processing streaming
Last synced: 24 Oct 2024
https://github.com/mitre/clusterconf
Manage Hadoop cluster configurations
hadoop hadoop-cluster r r-package rstats
Last synced: 03 Aug 2024
https://github.com/conema/spark-terraform
This project creates a Hadoop and Spark cluster on AWS with Terraform
aws cluster hadoop hadoop-cluster hcl spark spark-clusters terraform
Last synced: 12 Oct 2024
https://github.com/mikeroyal/apache-hadoop-guide
Apache Hadoop Guide
hadoop hadoop-cluster hadoop-filesystem hadoop-hdfs hadoop-mapreduce
Last synced: 24 Oct 2024
https://github.com/yjham2002/hadoop_clustering
Apache Hadoop Based Clustering Tutorial
hadoop hadoop-cluster mac-osx mapreduce
Last synced: 24 Oct 2024
https://github.com/mariam-iftikhar/bigdata
The repository showcases a series of exercises and projects focused on big data processing using Hadoop, HBase, Hive, and Spark with Python. Hosted on AWS EMR, these projects demonstrate efficient data handling and processing techniques, leveraging the power of cloud computing to tackle complex data challenges.
apache-spark awsec2 awsemr hadoop-cluster hadoop-mapreduce hbase hiveql
Last synced: 12 Oct 2024
https://github.com/avojak/aws-hadoop-cluster
Infrastructure and configuration-as-code for standing up a Hadoop cluster in AWS
ansible aws aws-ec2 configuration-as-code hadoop hadoop-cluster infrastructure-as-code terraform
Last synced: 24 Oct 2024
https://github.com/kumarvna/terraform-azurerm-hdinsight
Terraform module to create Azure HDInsight, a managed, full-spectrum, open-source analytics service. This module creates Apache Hadoop, Apache Spark, Apache HBase, Interactive Query (Apache Hive LLAP) and Apache Kafka clusters.
apache-hive-cluster azure azure-hdinsight hadoop-cluster hadoop-filesystem hadoop-hdfs hbase-cluster hdinsight-cluster hdinsight-hadoop-cluster hdinsight-hbase-cluster hdinsight-interactive-query-cluster hdinsight-kafka-cluster hdinsight-spark-cluster kafka-cluster spark-cluster spark-clusters terraform terraform-module
Last synced: 11 Oct 2024
https://github.com/aimanamri/raspberry-pi4-hadoop-spark-cluster
This is a self-documented record of learning distributed data storage, parallel processing, and Linux OS using Apache Hadoop, Apache Spark and Raspbian OS. In this project, a 3-node cluster is set up on Raspberry Pi 4 boards, HDFS is installed, and Spark processing jobs are run via YARN.
big-data distributed-storage hadoop-cluster hdfs parallel-processing pyspark raspberry-pi-4 spark-cluster spark-shell yarn
Last synced: 10 Oct 2024