https://github.com/ceteri/intro_spark
Code examples supporting the "Introduction to Apache Spark" video published by O'Reilly Media
- Host: GitHub
- URL: https://github.com/ceteri/intro_spark
- Owner: ceteri
- License: other
- Archived: true
- Created: 2015-05-21T20:26:33.000Z (about 11 years ago)
- Default Branch: master
- Last Pushed: 2022-07-01T17:37:35.000Z (almost 4 years ago)
- Last Synced: 2025-03-29T21:51:06.381Z (about 1 year ago)
- Language: Scala
- Homepage: http://shop.oreilly.com/product/0636920036807.do
- Size: 158 KB
- Stars: 37
- Watchers: 9
- Forks: 35
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
README
Introduction to Apache Spark
============================
The material here supports the O'Reilly Media video by Paco Nathan:
[Introduction to Apache Spark](http://shop.oreilly.com/product/0636920036807.do)
Please see the code examples in the `src` directory here, which are numbered
in the sequence used in the video.
This material assumes that you have downloaded a pre-compiled version of
Apache Spark to your laptop from http://spark.apache.org/downloads.html
Outline
-------
* Pre-Flight Check
* Spark Deconstructed: Log Mining Example
* Word Count
* Join
* Coding Exercise
* Pi Approximation
* Spark Streaming example
* Network Word Count in Python
* Network Word Count in Python -- Stateful
* GraphX example
* build/run SimpleApp.java with Maven
* build/run SimpleApp.scala with SBT
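
As a rough illustration of the "Pi Approximation" item in the outline, here is a minimal single-machine sketch in plain Python. It is not the code from the `src` directory and does not use Spark. The function names `inside_unit_circle` and `approximate_pi` are hypothetical, chosen only for this sketch. It shows the map-then-reduce shape of the Monte Carlo estimate; a Spark version would typically run the same two steps over a distributed collection.

```python
import random


def inside_unit_circle(rng):
    """Draw one random point in the unit square.

    Returns 1 if the point falls inside the quarter circle of radius 1,
    otherwise 0.
    """
    x, y = rng.random(), rng.random()
    return 1 if x * x + y * y < 1.0 else 0


def approximate_pi(num_samples, seed=42):
    """Estimate pi from the fraction of random points inside the quarter circle.

    The quarter circle has area pi/4 and the unit square has area 1,
    so pi is approximately 4 * hits / num_samples.
    """
    rng = random.Random(seed)  # seeded so that repeated runs give the same estimate
    # "map" step: one 0/1 hit per sample; "reduce" step: sum the hits
    hits = sum(inside_unit_circle(rng) for _ in range(num_samples))
    return 4.0 * hits / num_samples


if __name__ == "__main__":
    print(approximate_pi(100_000))
```

The estimate tightens slowly as `num_samples` grows, which is why this kind of computation is a common candidate for parallel execution.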
Updates
-------
See the `bikeshare` directory for the Spark 1.3 update, showing DataFrames,
MLlib, and GraphX with examples based on Capital Bikeshare data.
---
This work is licensed under the Creative Commons Attribution-ShareAlike 4.0
International License. To view a copy of this license, visit
http://creativecommons.org/licenses/by-sa/4.0/