https://github.com/logicalclocks/spark-chef
Apache Spark chef cookbook
- Host: GitHub
- URL: https://github.com/logicalclocks/spark-chef
- Owner: logicalclocks
- License: agpl-3.0
- Created: 2014-12-22T09:59:33.000Z (over 11 years ago)
- Default Branch: master
- Last Pushed: 2024-11-05T22:24:32.000Z (over 1 year ago)
- Last Synced: 2025-04-05T09:04:47.859Z (about 1 year ago)
- Language: Ruby
- Size: 505 KB
- Stars: 3
- Watchers: 12
- Forks: 32
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
# Apache Spark Chef cookbook
### Install Spark standalone
### Install Spark on YARN
## References
* https://documentation.altiscale.com/spark-2-0-with-altiscale
* https://www.linkedin.com/pulse/running-spark-2xx-cloudera-hadoop-distro-cdh-deenar-toraskar-cfa
* Set `spark.yarn.jars`:
$ cd $SPARK_HOME
$ hadoop fs -mkdir spark-2.0.0-bin-hadoop
$ hadoop fs -copyFromLocal jars/* spark-2.0.0-bin-hadoop
$ echo "spark.yarn.jars=hdfs:///nameservice1/user//spark-2.0.0-bin-hadoop/*" >> conf/spark-defaults.conf
If you have access to the local directories of every node in your cluster, you can instead copy the archive or the Spark JARs to a local directory on each data node using rsync or scp. In that case, change the URL scheme in `spark.yarn.jars` from `hdfs:/` to `local:`.
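
Switching the property from an HDFS URL to a `local:` URL could be scripted roughly as below. This is a minimal sketch, not part of the cookbook: the helper name `set_spark_yarn_jars` and the directory `/opt/spark-2.0.0-bin-hadoop/jars` are assumptions made for illustration, and the JARs are assumed to have already been copied to that same path on every node.

```shell
#!/bin/sh
# Hypothetical helper (not provided by Spark or this cookbook):
# set or replace spark.yarn.jars in a spark-defaults.conf-style file.
# Usage: set_spark_yarn_jars <conf-file> <jars-url>
set_spark_yarn_jars() {
    conf_file=$1
    jars_url=$2
    tmp_file="${conf_file}.tmp.$$"

    # Create the file if it is missing so the grep below has something to read.
    [ -f "$conf_file" ] || : > "$conf_file"

    # Drop any existing spark.yarn.jars entry (key followed by '=' or whitespace).
    # grep -v exits non-zero when it prints nothing, so ignore its exit status.
    grep -v -E '^[[:space:]]*spark\.yarn\.jars([[:space:]]|=)' "$conf_file" > "$tmp_file" || true

    # Append the new value and swap the rewritten file into place.
    printf 'spark.yarn.jars=%s\n' "$jars_url" >> "$tmp_file"
    mv "$tmp_file" "$conf_file"
}

# Example call (commented out so sourcing this file changes nothing).
# The /opt/... path is an assumption and must exist on every node:
# set_spark_yarn_jars "$SPARK_HOME/conf/spark-defaults.conf" "local:/opt/spark-2.0.0-bin-hadoop/jars/*"
```

Replacing the entry rather than appending with `>>` keeps a single `spark.yarn.jars` line in the file, so repeated runs do not leave a stale `hdfs:` value behind.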