https://github.com/coditva/spark-pbspro
Support for PBS Pro in Spark
- Host: GitHub
- URL: https://github.com/coditva/spark-pbspro
- Owner: coditva
- Created: 2018-12-14T07:48:43.000Z (about 7 years ago)
- Default Branch: master
- Last Pushed: 2019-04-29T08:56:05.000Z (almost 7 years ago)
- Last Synced: 2025-05-20T10:45:13.878Z (9 months ago)
- Language: Scala
- Size: 123 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- Code of conduct: CODE_OF_CONDUCT.md
README
# Spark PBSPro Support
This adds support for the [PBS Professional](https://github.com/pbspro/pbspro)
HPC resource manager to [Apache Spark](https://github.com/apache/spark).
### Status of build with latest Spark
[Build status](https://travis-ci.com/PBSPro/spark-pbspro-connector)
## Usage
To run Spark on a PBS cluster, pass `--master pbs` when submitting:
```bash
# start the spark shell (client mode only)
./bin/spark-shell --master pbs
# submit a spark application in client mode
./bin/spark-submit --master pbs --deploy-mode client --class org.apache.spark.examples.SparkPi $SPARK_HOME/examples/target/scala-2.12/jars/spark-examples_2.12-3.0.0-SNAPSHOT.jar 100
# submit a spark application in cluster mode
./bin/spark-submit --master pbs --deploy-mode cluster --class org.apache.spark.examples.SparkPi $SPARK_HOME/examples/target/scala-2.12/jars/spark-examples_2.12-3.0.0-SNAPSHOT.jar 100
```
You can also append `spark.master pbs` to `conf/spark-defaults.conf` to avoid passing
`--master pbs` on every submit.
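A minimal sketch of making that change idempotently, so that repeated runs do not duplicate the line. The helper name `set_spark_default` is hypothetical and is not part of Spark or this connector; it assumes `bash` with `awk` available and a whitespace-separated `key value` properties file.

```bash
# Hypothetical helper (not part of Spark or the connector):
# append "key value" to a properties file unless the key is already present.
set_spark_default() {
    local conf_file=$1 key=$2 value=$3
    # create the file if it does not exist yet
    touch "$conf_file"
    # compare the first whitespace-separated field with the key exactly;
    # awk exits 0 when the key is found and 1 otherwise
    if awk -v k="$key" '$1 == k { found = 1 } END { exit !found }' "$conf_file"; then
        echo "$key already set in $conf_file"
    else
        printf '%s %s\n' "$key" "$value" >> "$conf_file"
    fi
}

# Usage, from the Spark project root:
# set_spark_default conf/spark-defaults.conf spark.master pbs
```

The exact field comparison avoids treating the dots in the key as regular-expression wildcards, which a plain `grep` pattern would do.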
To run the Spark UI for the PBS cluster:
```bash
bin/spark-class org.apache.spark.deploy.pbs.ui.PbsClusterUI
```
## Installation
This expects PBSPro to be installed at `/opt/pbs`.
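Before building, it can help to confirm that location. The following is a minimal sketch, assuming that a standard PBS Pro install places client commands such as `qstat` under `bin/` of the install prefix; the function name `check_pbs_home` is hypothetical and not part of the connector.

```bash
# Hypothetical pre-flight check (not part of the connector).
# Assumption: PBS client commands such as qstat live in <prefix>/bin.
check_pbs_home() {
    local pbs_home=${1:-/opt/pbs}
    if [ -x "$pbs_home/bin/qstat" ]; then
        echo "PBS Pro found at $pbs_home"
    else
        echo "PBS Pro not found at $pbs_home" >&2
        return 1
    fi
}

# Usage:
# check_pbs_home             # checks the default /opt/pbs
# check_pbs_home /some/dir   # checks another prefix
```

If the check fails, the connector's expectation of `/opt/pbs` is not met on this host.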
Clone the Spark repository and change into the `spark` directory:
```bash
git clone https://github.com/apache/spark.git
cd spark
```
In the Spark project root, run these commands:
```bash
# Clone the connector repo into resource-managers/pbs
git clone https://github.com/PBSPro/spark-pbspro-connector resource-managers/pbs
# Apply patch to spark (in the root directory).
git am resource-managers/pbs/*.patch
# Build!
build/mvn -DskipTests -Ppbs package
```
Add the executor home (the Spark installation path on the PBS MoM nodes) to your configuration:
```bash
# in conf/spark-defaults.conf, add the line:
spark.pbs.executor.home "SPARK INSTALLATION DIRECTORY PATH IN PBS MOMS"
```