Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/pierrekieffer/docker-spark-yarn-cluster
Docker multi-nodes Hadoop cluster with Spark 2.4.1 on Yarn
https://github.com/pierrekieffer/docker-spark-yarn-cluster
cluster docker hadoop spark yarn yarn-hadoop-cluster
Last synced: 11 days ago
JSON representation
Docker multi-nodes Hadoop cluster with Spark 2.4.1 on Yarn
- Host: GitHub
- URL: https://github.com/pierrekieffer/docker-spark-yarn-cluster
- Owner: PierreKieffer
- License: apache-2.0
- Created: 2019-04-15T15:37:37.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2020-12-07T07:38:31.000Z (almost 4 years ago)
- Last Synced: 2023-03-07T01:31:41.456Z (over 1 year ago)
- Topics: cluster, docker, hadoop, spark, yarn, yarn-hadoop-cluster
- Language: Shell
- Homepage:
- Size: 46.9 KB
- Stars: 44
- Watchers: 2
- Forks: 28
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# Docker hadoop yarn cluster for spark 2.4.1
Provides Docker multi-nodes Hadoop cluster with Spark 2.4.1 on Yarn.
* [Usage](#usage)
* [Build](#build)
* [Run](#run)
* [Stop](#stop)
* [Connect to Master Node](#connect-to-master-node)
* [Run spark applications on cluster :](#run-spark-applications-on-cluster-)
* [spark-shell](#spark-shell)
* [spark submit](#spark-submit)
* [Web UI](#web-ui)## Usage
### Build
```bash
make build
```
### Run
```bash
make start
```
### Stop
```bash
make stop
```
### Connect to Master Node
```bash
make connect
```
```bash
---- MASTER NODE ----
root@cluster-master:/#
```
### Run spark applications on cluster :
Once connected to the master node
#### spark-shell
```bash
spark-shell --master yarn --deploy-mode client
```
#### spark submit
```bash
spark-submit --master yarn --deploy-mode [client or cluster] --num-executors 2 --executor-memory 4G --executor-cores 4 --class org.apache.spark.examples.SparkPi $SPARK_HOME/examples/jars/spark-examples_2.11-2.4.1.jar
```
#### Web UI
- Get master node ip:
```bash
make master-ip
```
```bash
---- MASTER NODE IP ----
Master node ip : 172.20.0.4
```
- Access to Hadoop cluster Web UI : `master-node-ip:8088`
- Access to spark Web UI : `master-node-ip:8080`
- Access to hdfs Web UI : `master-node-ip:50070`