https://github.com/logicalclocks/spark-chef

Apache Spark chef cookbook

Host: GitHub
URL: https://github.com/logicalclocks/spark-chef
Owner: logicalclocks
License: agpl-3.0
Created: 2014-12-22T09:59:33.000Z (over 11 years ago)
Default Branch: master
Last Pushed: 2024-11-05T22:24:32.000Z (over 1 year ago)
Last Synced: 2025-04-05T09:04:47.859Z (about 1 year ago)
Language: Ruby
Size: 505 KB
Stars: 3
Watchers: 12
Forks: 32
Open Issues: 0
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE

# Apache Spark Chef cookbook

### Install Spark standalone

### Install Spark yarn

## References

* https://documentation.altiscale.com/spark-2-0-with-altiscale
* https://www.linkedin.com/pulse/running-spark-2xx-cloudera-hadoop-distro-cdh-deenar-toraskar-cfa

Set "spark.yarn.jars":
$ cd $SPARK_HOME
$ hadoop fs -mkdir spark-2.0.0-bin-hadoop
$ hadoop fs -copyFromLocal jars/* spark-2.0.0-bin-hadoop
$ echo "spark.yarn.jars=hdfs:///nameservice1/user//spark-2.0.0-bin-hadoop/*" >> conf/spark-defaults.conf

If you have access to the local directories of all nodes in your cluster, you can instead copy the archive or Spark JARs to a local directory on each data node using rsync or scp. Then update the URLs from hdfs:/ to local:
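
The node-local variant could be sketched roughly as follows. This is a minimal sketch, not part of the cookbook: the function names `copy_jars_to_nodes` and `use_local_jars`, the hostnames, and the `/opt/spark-jars` directory are all hypothetical. It assumes `SPARK_HOME` is set, that rsync over SSH works for every node, and that the JARs live at the same path on each node.

```shell
# Sketch only: function names, hostnames, and directories are hypothetical.

# Copy the Spark JARs to the same directory on each node.
# usage: copy_jars_to_nodes <jars_dir_on_nodes> <node>...
copy_jars_to_nodes() {
  target_dir="$1"; shift
  for node in "$@"; do
    # Trailing slash on the source copies the directory contents.
    rsync -a "$SPARK_HOME/jars/" "$node:$target_dir/"
  done
}

# Append a spark.yarn.jars entry that uses the local: scheme.
# usage: use_local_jars <jars_dir_on_nodes> <path/to/spark-defaults.conf>
use_local_jars() {
  echo "spark.yarn.jars=local:$1/*" >> "$2"
}

# Example (hypothetical hosts and path):
# copy_jars_to_nodes /opt/spark-jars datanode1 datanode2
# use_local_jars /opt/spark-jars "$SPARK_HOME/conf/spark-defaults.conf"
```

With the local: scheme, the JARs are expected to already exist at that path on every node, so they are not uploaded from HDFS for each application.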
