https://github.com/qzchenwl/vagrant-spark-cluster
Vagrantfile to set up a 2-node Spark cluster
- Host: GitHub
- URL: https://github.com/qzchenwl/vagrant-spark-cluster
- Owner: qzchenwl
- Created: 2019-02-16T14:38:21.000Z (over 6 years ago)
- Default Branch: master
- Last Pushed: 2019-02-16T14:52:56.000Z (over 6 years ago)
- Last Synced: 2025-01-18T21:31:23.361Z (5 months ago)
- Language: Shell
- Size: 10.7 KB
- Stars: 3
- Watchers: 2
- Forks: 0
- Open Issues: 0
- Metadata Files:
  - Readme: README.md
# vagrant-spark-cluster
A two-node Spark cluster with Hadoop and YARN.
# Start
```bash
# 1. First download java8, spark-2.2, hadoop-2.7
you@host ~/vagrant-spark-cluster$ cd packages
you@host ~/vagrant-spark-cluster/packages$ wget -i urls.txt

# 2. Vagrant up
you@host ~/vagrant-spark-cluster$ vagrant up

# 3. Start cluster
you@host ~/vagrant-spark-cluster$ vagrant ssh spark-node1
vagrant@spark-node1 ~$ bash /vagrant/scripts/start-cluster.sh

# 4. Test
[vagrant@spark-node1 /opt/spark-2.2]$ spark-submit --master yarn examples/src/main/python/pi.py
```

# References
- https://www.linode.com/docs/databases/hadoop/how-to-install-and-set-up-hadoop-cluster/
- https://www.quora.com/How-do-I-set-up-Apache-Spark-with-Yarn-Cluster
- https://backtobazics.com/big-data/setup-multi-node-hadoop-2-6-0-cluster-with-yarn/
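
# Checking the downloads (sketch)

Step 1 under Start fetches the Java, Spark, and Hadoop archives into `packages` with `wget -i urls.txt`, and the later steps presumably depend on those files being present. The function below is a minimal sketch, not part of the repository, for listing any archive that is still missing before running `vagrant up`. The name `missing_packages` is hypothetical, and it assumes that each non-empty line of `urls.txt` is a URL whose last path segment is the file name wget saves.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not shipped with the repository).
# Prints the name of every file listed in a wget-style URL list
# that does not exist in the given directory.
# Assumption: the saved file name equals the last path segment of the URL.

missing_packages() {
  local url_list="$1" dir="$2" url name
  # The "|| [ -n ... ]" part also handles a final line without a trailing newline.
  while IFS= read -r url || [ -n "$url" ]; do
    # Skip blank lines.
    [ -n "$url" ] || continue
    name="${url##*/}"   # strip everything up to the last slash
    [ -f "$dir/$name" ] || printf '%s\n' "$name"
  done < "$url_list"
}

# Usage, from the repository root:
#   missing_packages packages/urls.txt packages
```

An empty result suggests every listed archive was downloaded. Any printed name can be fetched again by rerunning `wget -i urls.txt` inside `packages`.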