Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with hadoop-cluster

A curated list of projects in awesome lists tagged with hadoop-cluster.

https://github.com/mitre/clusterconf

Manage Hadoop cluster configurations

hadoop hadoop-cluster r r-package rstats

Last synced: 03 Aug 2024

https://github.com/conema/spark-terraform

This project creates a Hadoop and Spark cluster on AWS with Terraform

aws cluster hadoop hadoop-cluster hcl spark spark-clusters terraform

Last synced: 12 Oct 2024

https://github.com/yjham2002/hadoop_clustering

Apache Hadoop-Based Clustering Tutorial

hadoop hadoop-cluster mac-osx mapreduce

Last synced: 24 Oct 2024

https://github.com/mariam-iftikhar/bigdata

The repository showcases a series of exercises and projects focused on big data processing using Hadoop, HBase, Hive, and Spark with Python. Hosted on AWS EMR, these projects demonstrate efficient data handling and processing techniques, leveraging the power of cloud computing to tackle complex data challenges.

apache-spark awsec2 awsemr hadoop-cluster hadoop-mapreduce hbase hiveql

Last synced: 12 Oct 2024

https://github.com/avojak/aws-hadoop-cluster

Infrastructure and configuration-as-code for standing up a Hadoop cluster in AWS

ansible aws aws-ec2 configuration-as-code hadoop hadoop-cluster infrastructure-as-code terraform

Last synced: 24 Oct 2024

https://github.com/kumarvna/terraform-azurerm-hdinsight

Terraform module to create Azure HDInsight, a managed, full-spectrum, open-source analytics service. This module creates Apache Hadoop, Apache Spark, Apache HBase, Interactive Query (Apache Hive LLAP), and Apache Kafka clusters.

apache-hive-cluster azure azure-hdinsight hadoop-cluster hadoop-filesystem hadoop-hdfs hbase-cluster hdinsight-cluster hdinsight-hadoop-cluster hdinsight-hbase-cluster hdinsight-interactive-query-cluster hdinsight-kafka-cluster hdinsight-spark-cluster kafka-cluster spark-cluster spark-clusters terraform terraform-module

Last synced: 11 Oct 2024

https://github.com/aimanamri/raspberry-pi4-hadoop-spark-cluster

A self-documented learning project covering distributed data storage, parallel processing, and Linux, using Apache Hadoop, Apache Spark, and Raspbian OS. The project sets up a 3-node cluster on Raspberry Pi 4 boards, installs HDFS, and runs Spark processing jobs via YARN.

big-data distributed-storage hadoop-cluster hdfs parallel-processing pyspark raspberry-pi-4 spark-cluster spark-shell yarn

Last synced: 10 Oct 2024