https://github.com/kyeah/geotrellis-cassandra
A spike project for supporting Spark + Cassandra in GeoTrellis.
- Host: GitHub
- URL: https://github.com/kyeah/geotrellis-cassandra
- Owner: kyeah
- Created: 2015-02-18T19:58:30.000Z (over 11 years ago)
- Default Branch: master
- Last Pushed: 2015-02-18T22:18:38.000Z (over 11 years ago)
- Last Synced: 2023-04-04T11:56:28.520Z (about 3 years ago)
- Language: Shell
- Homepage:
- Size: 9.33 MB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
Geotrellis-Cassandra
========================
This is a spike for integrating Spark + Apache Cassandra support into GeoTrellis.
# Installation
## Spark Cluster
Install Spark and launch a standalone master server using `$SPARK_HOME/sbin/start-master.sh`. This will print out the `spark://HOST:PORT` URL used to connect workers.
Then, start a worker using `$SPARK_HOME/bin/spark-class org.apache.spark.deploy.worker.Worker spark://HOST:PORT`, substituting the master URL from the previous step. Check that the worker is connected in the master's web UI, which defaults to `http://localhost:8080/`.
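
The two steps above can be scripted roughly as follows. This is a minimal sketch, not part of the repository: `master_url` and `start_cluster` are hypothetical helper names, `SPARK_HOME` is assumed to be set, and the master is assumed to listen on Spark's default standalone port 7077 unless a port is given. If the master reports a different URL, use that one instead.

```shell
# Hypothetical helper: build a spark://HOST:PORT master URL.
# Assumption: 7077 is used when no port is given (the standalone default).
master_url() {
  echo "spark://$1:${2:-7077}"
}

# Hypothetical helper: start a master, then one worker attached to it.
# Assumption: SPARK_HOME points at the Spark installation.
start_cluster() {
  "$SPARK_HOME/sbin/start-master.sh" || return 1
  # The worker runs in the foreground, so send it to the background here.
  "$SPARK_HOME/bin/spark-class" org.apache.spark.deploy.worker.Worker \
    "$(master_url "$1" "$2")" &
}
```

With these definitions loaded, `start_cluster localhost` would start a master and attach one worker to `spark://localhost:7077`.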
## Cassandra Cluster
Install Cassandra and launch a local cluster using `sudo service cassandra start`. Check that it is running using `nodetool status`.
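
The `nodetool status` check can be automated along these lines. This is only a sketch: `count_up_nodes` and `cassandra_is_up` are hypothetical helper names, and it relies on `nodetool status` prefixing each node line with a two-letter state code, where `UN` means Up and Normal.

```shell
# Hypothetical helper: count nodes reported as Up/Normal.
# Reads `nodetool status` output on stdin; node lines start with a
# two-letter state code, and "UN" means Up and Normal.
count_up_nodes() {
  # grep -c prints 0 but exits non-zero when nothing matches,
  # so normalize the exit status.
  grep -c '^UN' || true
}

# Hypothetical helper: succeed only if at least one node is Up/Normal.
cassandra_is_up() {
  [ "$(nodetool status 2>/dev/null | count_up_nodes)" -ge 1 ]
}
```

A setup script could then wait on `cassandra_is_up` before continuing to the Spark job.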
## Version Compatibility
Note the version compatibility requirements between Spark, Cassandra, and the DataStax Spark Cassandra Connector:
| Connector | Spark | Cassandra |
| --------- | ------------- | --------- |
| 1.2 | 1.2 | 2.1, 2.0 |
| 1.1 | 1.1, 1.0 | 2.1, 2.0 |
| 1.0 | 1.0, 0.9 | 2.0 |
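
To check a given pairing against the table above, a small lookup can help. The sketch below encodes only the three rows shown here; `compatible_connectors` is a hypothetical helper name, and versions outside the table are treated as unsupported.

```shell
# Hypothetical helper: print the connector versions from the table above
# that support the given Spark and Cassandra versions.
# Usage: compatible_connectors SPARK_VERSION CASSANDRA_VERSION
compatible_connectors() {
  local spark="$1" cassandra="$2" found=""
  local connector sparks cassandras
  # Each row: connector version, supported Spark versions, supported Cassandra versions.
  while read -r connector sparks cassandras; do
    case ",$sparks," in *",$spark,"*) ;; *) continue ;; esac
    case ",$cassandras," in *",$cassandra,"*) ;; *) continue ;; esac
    found="${found:+$found }$connector"
  done <<'EOF'
1.2 1.2 2.1,2.0
1.1 1.1,1.0 2.1,2.0
1.0 1.0,0.9 2.0
EOF
  # Fail when no row in the table matches.
  if [ -z "$found" ]; then
    return 1
  fi
  echo "$found"
}
```

For example, `compatible_connectors 1.0 2.0` prints `1.1 1.0`, since both connector releases list that pairing, while a pairing missing from the table returns a non-zero status.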
# Running
After setting up your Spark and Cassandra clusters, edit `Main.scala` to point to them. Then run `sbt run` to execute the `main` function, which provides an example of working with your Cassandra tables as Spark RDDs.