Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/oskardudycz/sparkwithscalaanddocker
Example showing how to run Spark with Scala and Docker
- Host: GitHub
- URL: https://github.com/oskardudycz/sparkwithscalaanddocker
- Owner: oskardudycz
- Created: 2017-11-04T10:30:05.000Z (about 7 years ago)
- Default Branch: master
- Last Pushed: 2020-01-02T10:19:53.000Z (almost 5 years ago)
- Last Synced: 2024-10-19T21:12:36.788Z (24 days ago)
- Language: Scala
- Homepage:
- Size: 659 KB
- Stars: 1
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Spark With Scala And Docker
This project shows how to create a basic configuration of a Spark workstation with Docker. You do not need to set up Java, Scala, Spark, Hadoop, or any tool other than Docker on your local environment. As an example, it shows how to do the simplest possible line count in Spark.
The code is located in [/src/main/scala/](https://github.com/oskardudycz/SparkWithScalaAndDocker/blob/master/src/src/main/scala/FileLinesCount.scala):
```scala
package sparkWithScalaAndDocker

import org.apache.spark.{SparkConf, SparkContext}

object FileLinesCount {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("SparkWithScalaAndDocker Application")
    val sc = new SparkContext(conf)

    // Path of the file to count, passed as the first program argument
    val fileName = args(0)
    val lines = sc.textFile(fileName).cache()

    val c = lines.count()
    println(s"There are $c lines in $fileName")
  }
}
```
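
To sanity-check the number the Spark job prints, the same count can be computed for a local file without Spark. The following is only a minimal sketch using the Scala standard library; `LocalLinesCount` and `countLines` are hypothetical names that are not part of this repository, and the file is assumed to be UTF-8 plain text. For such a file the result should normally match the Spark output.

```scala
import scala.io.Source

// Hypothetical helper, not part of the repository: counts the lines of a
// local text file using only the Scala standard library (no Spark needed).
object LocalLinesCount {

  // Assumes the file is UTF-8 encoded; the source is always closed.
  def countLines(fileName: String): Long = {
    val source = Source.fromFile(fileName, "UTF-8")
    try source.getLines().size.toLong
    finally source.close()
  }

  def main(args: Array[String]): Unit = {
    // Same convention as FileLinesCount: first argument is the file path
    val fileName = args(0)
    val c = countLines(fileName)
    println(s"There are $c lines in $fileName")
  }
}
```

Unlike the Spark version, this reads the file on a single machine, so it is only suitable for comparing results on small inputs.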
Install [Docker](https://www.docker.com/get-docker).

Open cmd/shell:
1. Run `init` to start Docker (see details in [init.bat](https://github.com/oskardudycz/SparkWithScalaAndDocker/blob/master/init.bat)).
2. Run `build` to build the project (see details in [build.bat](https://github.com/oskardudycz/SparkWithScalaAndDocker/blob/master/build.bat)).
3. Run `run` to run the project in Spark (see details in [run.bat](https://github.com/oskardudycz/SparkWithScalaAndDocker/blob/master/run.bat)).

I found an issue or I have a change request
--------------------------------
Feel free to create an issue on GitHub. Contributions and pull requests are more than welcome!

**Spark With Scala And Docker** is Copyright © 2017-2020 [Oskar Dudycz](http://oskar-dudycz.pl) and other contributors under the [MIT license](LICENSE).