Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

https://github.com/myamafuj/hadoop-hive-spark-docker

Hadoop-Hive-Spark cluster + Jupyter on Docker
https://github.com/myamafuj/hadoop-hive-spark-docker

docker hadoop hive jupyter jupyter-notebook pyspark spark

Last synced: 1 day ago
JSON representation

Hadoop-Hive-Spark cluster + Jupyter on Docker

Lists

README

        

# Hadoop-Hive-Spark cluster + Jupyter on Docker

## Software

* [Hadoop 3.3.4](https://hadoop.apache.org/)

* [Hive 3.1.3](http://hive.apache.org/)

* [Spark 3.3.1](https://spark.apache.org/)

## Quick Start

To deploy the cluster, run:
```
make
docker-compose up
```

## Access interfaces with the following URL

### Hadoop

ResourceManager: http://localhost:8088

NameNode: http://localhost:9870

HistoryServer: http://localhost:19888

Datanode1: http://localhost:9864
Datanode2: http://localhost:9865

NodeManager1: http://localhost:8042
NodeManager2: http://localhost:8043

### Spark
master: http://localhost:8080

worker1: http://localhost:8081
worker2: http://localhost:8082

history: http://localhost:18080

### Hive
URI: jdbc:hive2://localhost:10000

### Jupyter Notebook
URL: http://localhost:8888

example: [jupyter/notebook/pyspark.ipynb](jupyter/notebook/pyspark.ipynb)