Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with hadoop-cluster

A curated list of projects in awesome lists tagged with hadoop-cluster.

https://github.com/impetus/jumbune

Jumbune, an open source BigData APM & Data Quality Management Platform for Data Clouds. Enterprise feature offering is available at http://jumbune.com. More details of open source offering are at,

aiops apm cluster-monitoring data-analysis data-quality developer-tools devops-tools hadoop hadoop-cluster hadoop-monitor hadoop-monitoring monitoring-tool optimization-framework yarn yarn-hadoop-cluster

Last synced: 14 Nov 2024

https://github.com/hyeonsangjeon/dataplatform

Hadoop 3.2 in single-node or cluster mode with the gotty web terminal, Spark, Jupyter PySpark, Hive, and other ecosystem components.

hadoop hadoop-cluster hadoop-docker hadoop-ecosystem hadoop-mapreduce hive pyspark-notebook zeppelin-notebook

Last synced: 17 Nov 2024

https://github.com/manuparra/masterdegreecc_practice

Workshop for the Professional Master's in Computer Science at UGR. Cloud Computing course.

cloudcomputing cluster docker docker-cluster docker-container hadoop hadoop-cluster hdfs opennebula practice virtual-machine

Last synced: 07 Nov 2024

https://github.com/mitre/clusterconf

Manage Hadoop cluster configurations

hadoop hadoop-cluster r r-package rstats

Last synced: 09 Nov 2024

https://github.com/mitre/webhdfs

Interface with WebHDFS Service in a Cluster-Neutral Way

hadoop-cluster r r-package rstats webhdfs

Last synced: 09 Nov 2024

https://github.com/conema/spark-terraform

This project creates a Hadoop and Spark cluster on Amazon AWS with Terraform.

aws cluster hadoop hadoop-cluster hcl spark spark-clusters terraform

Last synced: 20 Nov 2024

https://github.com/codito/hadoop-expt

Experiments with Hadoop cluster setups in Docker

docker docker-compose hadoop hadoop-cluster hadoop-docker

Last synced: 10 Nov 2024

https://github.com/yjham2002/hadoop_clustering

Apache Hadoop Based Clustering Tutorial

hadoop hadoop-cluster mac-osx mapreduce

Last synced: 12 Dec 2024

https://github.com/akaliutau/hadoop-cluster

Batch data processing on the dockerized Hadoop cluster

batch-processing hadoop-cluster hdf5 hdfs java mapreduce

Last synced: 12 Nov 2024

https://github.com/mariam-iftikhar/bigdata

The repository showcases a series of exercises and projects focused on big data processing using Hadoop, HBase, Hive, and Spark with Python. Hosted on AWS EMR, these projects demonstrate efficient data handling and processing techniques, leveraging the power of cloud computing to tackle complex data challenges.

apache-spark awsec2 awsemr hadoop-cluster hadoop-mapreduce hbase hiveql

Last synced: 16 Nov 2024

https://github.com/aimanamri/raspberry-pi4-hadoop-spark-cluster

This is a self-documented record of learning distributed data storage, parallel processing, and Linux using Apache Hadoop, Apache Spark, and Raspbian OS. In this project, a 3-node cluster is set up using Raspberry Pi 4 boards, with HDFS installed and Spark processing jobs run via YARN.

big-data distributed-storage hadoop-cluster hdfs parallel-processing pyspark raspberry-pi-4 spark-cluster spark-shell yarn

Last synced: 05 Nov 2024

https://github.com/kumarvna/terraform-azurerm-hdinsight

Terraform module to create Azure HDInsight, a managed, full-spectrum, open-source analytics service. This module creates Apache Hadoop, Apache Spark, Apache HBase, Interactive Query (Apache Hive LLAP), and Apache Kafka clusters.

apache-hive-cluster azure azure-hdinsight hadoop-cluster hadoop-filesystem hadoop-hdfs hbase-cluster hdinsight-cluster hdinsight-hadoop-cluster hdinsight-hbase-cluster hdinsight-interactive-query-cluster hdinsight-kafka-cluster hdinsight-spark-cluster kafka-cluster spark-cluster spark-clusters terraform terraform-module

Last synced: 08 Nov 2024

https://github.com/xpcosmos/data-lake-prime

This project aims to simulate and configure a Distributed File System using Hadoop HDFS. For this project, 3 machines were created: 1 Master Node and 2 Worker Nodes.

hadoop hadoop-cluster hadoop-hdfs hdfs network

Last synced: 14 Nov 2024

https://github.com/akshayavb99/ansible-examples

This repository contains the playbooks and other files used to work with different applications through Ansible.

ansible ansible-playbooks docker dynamic-inventory-aws explanation hadoop-cluster linux-scripting loadbalancer rhel8 webserver webserver-setup webservers yum

Last synced: 14 Nov 2024

https://github.com/avojak/aws-hadoop-cluster

Infrastructure and configuration-as-code for standing up a Hadoop cluster in AWS

ansible aws aws-ec2 configuration-as-code hadoop hadoop-cluster infrastructure-as-code terraform

Last synced: 12 Dec 2024