https://github.com/npatta01/spark_metis_investigation

My investigation presentation during the Metis Bootcamp

# About
My investigation presentation from my time at the Metis Bootcamp.

A simple word count and message-length analysis on a sample of GitHub commit messages.
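As a rough illustration of what a word count and length calculation on commit messages can look like, here is a minimal sketch. It is not the code from the notebook: the sample messages, the tokenizing rule, and all function names are assumptions made for this example. The plain-Python functions run anywhere; `word_counts_spark` shows the same count with PySpark's RDD API and needs `pyspark` installed.

```python
import re
from collections import Counter

# Hypothetical sample; the real data in this repo comes from GitHub Archive.
SAMPLE_MESSAGES = [
    "Fix typo in README",
    "Add tests",
    "fix build",
]


def tokenize(message):
    """Lowercase a commit message and split it into words (assumed rule)."""
    return re.findall(r"[a-z0-9']+", message.lower())


def word_counts(messages):
    """Count how often each word occurs across all messages."""
    counts = Counter()
    for message in messages:
        counts.update(tokenize(message))
    return counts


def message_lengths(messages):
    """Return the length of each message, measured in words."""
    return [len(tokenize(message)) for message in messages]


def word_counts_spark(messages):
    """Same word count using PySpark; requires pyspark to be installed."""
    # Imported lazily so the plain-Python functions work without Spark.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.master("local[*]")
        .appName("wordcount-sketch")
        .getOrCreate()
    )
    try:
        rdd = spark.sparkContext.parallelize(messages)
        # flatMap: message -> words, map: word -> (word, 1), reduceByKey: sum
        pairs = rdd.flatMap(tokenize).map(lambda word: (word, 1))
        return dict(pairs.reduceByKey(lambda a, b: a + b).collect())
    finally:
        spark.stop()
```

For someone coming from pandas, the useful comparison is that `word_counts` loops over the data in one process, while the Spark version expresses the same steps as transformations that can be spread across a cluster.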

# Audience
Meant for beginners who have heard the term "big data" but are more comfortable with pandas and Python.

# Data

The data set contains two hours of GitHub event data, as archived by the GitHub Archive project.
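To give a feel for how commit messages can be pulled out of such an archive, here is a minimal sketch. It assumes the events are stored one JSON object per line and that push events carry their commits under `payload.commits`, each with a `message` field. The sample records and the `commit_messages` helper are made up for illustration, and the files in this repo may be laid out differently.

```python
import json


def commit_messages(lines):
    """Yield commit messages from JSON-lines event records."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        event = json.loads(line)
        if event.get("type") != "PushEvent":
            continue  # only push events are assumed to carry commits
        payload = event.get("payload") or {}
        for commit in payload.get("commits") or []:
            message = commit.get("message")
            if message:
                yield message


# Hypothetical sample records, not taken from the real data set.
SAMPLE_LINES = [
    json.dumps({
        "type": "PushEvent",
        "payload": {"commits": [{"message": "Fix typo"}, {"message": "Add tests"}]},
    }),
    json.dumps({"type": "WatchEvent", "payload": {"action": "started"}}),
]

messages = list(commit_messages(SAMPLE_LINES))
```

Because `commit_messages` accepts any iterable of lines, it can be fed an open file handle directly, for example one returned by `gzip.open(path, "rt")` for a compressed archive file.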

# Deliverables

[Slides](http://www.slideshare.net/nidhinpattaniyil/beginner-apache-spark-presentation)
[Analysis](https://github.com/npatta01/spark_metis_investigation/blob/master/spark.ipynb)

The presentation covers, at a high level, what Apache Spark is and how to use it through PySpark.

My slides and sample data are included in the repo.
