https://github.com/myamafuj/hadoop-hive-spark-docker
Hadoop-Hive-Spark cluster + Jupyter on Docker
docker hadoop hive jupyter jupyter-notebook pyspark spark
- Host: GitHub
- URL: https://github.com/myamafuj/hadoop-hive-spark-docker
- Owner: myamafuj
- Created: 2022-02-26T10:48:46.000Z (over 2 years ago)
- Default Branch: master
- Last Pushed: 2024-05-26T03:13:07.000Z (6 months ago)
- Last Synced: 2024-07-05T16:55:43.865Z (4 months ago)
- Topics: docker, hadoop, hive, jupyter, jupyter-notebook, pyspark, spark
- Language: Dockerfile
- Homepage:
- Size: 83 KB
- Stars: 54
- Watchers: 3
- Forks: 36
- Open Issues: 2
Metadata Files:
- Readme: README.md
- Changelog: history/Dockerfile
# Hadoop-Hive-Spark cluster + Jupyter on Docker
## Software
* [Hadoop 3.3.4](https://hadoop.apache.org/)
* [Hive 3.1.3](http://hive.apache.org/)
* [Spark 3.3.1](https://spark.apache.org/)
## Quick Start
To deploy the cluster, run:
```
make
docker-compose up
```

## Access the interfaces at the following URLs
### Hadoop
* ResourceManager: http://localhost:8088
* NameNode: http://localhost:9870
* HistoryServer: http://localhost:19888
* Datanode1: http://localhost:9864
* Datanode2: http://localhost:9865
* NodeManager1: http://localhost:8042
* NodeManager2: http://localhost:8043

### Spark
* master: http://localhost:8080
* worker1: http://localhost:8081
* worker2: http://localhost:8082
* history: http://localhost:18080

### Hive
* URI: jdbc:hive2://localhost:10000

### Jupyter Notebook
* URL: http://localhost:8888
* example: [jupyter/notebook/pyspark.ipynb](jupyter/notebook/pyspark.ipynb)
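
Once `docker-compose up` has started the containers, a quick reachability check can confirm that the interfaces listed above are listening. The following is a minimal Python sketch and is not part of the repository: the `ENDPOINTS` dictionary, the helper names (`host_and_port`, `is_open`, `check_all`), and the 2-second timeout are illustrative assumptions. It only tests whether each TCP port accepts a connection, which does not guarantee that the service behind it is healthy.

```python
import socket
from urllib.parse import urlparse

# Endpoints copied from the lists above; the dictionary layout and the
# service labels are illustrative, not something the repository defines.
ENDPOINTS = {
    "ResourceManager": "http://localhost:8088",
    "NameNode": "http://localhost:9870",
    "HistoryServer": "http://localhost:19888",
    "Datanode1": "http://localhost:9864",
    "Datanode2": "http://localhost:9865",
    "NodeManager1": "http://localhost:8042",
    "NodeManager2": "http://localhost:8043",
    "Spark master": "http://localhost:8080",
    "Spark worker1": "http://localhost:8081",
    "Spark worker2": "http://localhost:8082",
    "Spark history": "http://localhost:18080",
    "Hive": "jdbc:hive2://localhost:10000",
    "Jupyter": "http://localhost:8888",
}


def host_and_port(url):
    """Extract (host, port) from an http:// URL or a jdbc:hive2:// URI."""
    # urlparse does not understand the "jdbc:" prefix, so drop it first.
    if url.startswith("jdbc:"):
        url = url[len("jdbc:"):]
    parsed = urlparse(url)
    return parsed.hostname, parsed.port


def is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_all(endpoints=ENDPOINTS, probe=is_open):
    """Map each service name to True/False depending on reachability."""
    return {name: probe(*host_and_port(url)) for name, url in endpoints.items()}


if __name__ == "__main__":
    for name, ok in check_all().items():
        print(f"{name:16} {'up' if ok else 'DOWN'}")
```

Run the script from the host machine. A port reported as `DOWN` shortly after startup may simply belong to a service that has not finished initializing, so it can be worth retrying before investigating further.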