https://github.com/vigneshss-07/pyspark-acompleteguide

This repo explains PySpark modules in Python, with a practical, hands-on focus on working with big data.

pyspark pyspark-mllib pyspark-notebook pyspark-python pyspark-tutorial


# Spark_Pyspark

* http://spark.apache.org/docs/latest/api/python/reference/index.html

***Apache Spark using Python***

1. https://github.com/dgadiraju/itversity-books/tree/master/Data%20Engineering%20Bootcamp/46%20Apache%20Spark%20using%20Python
2. https://github.com/dgadiraju/itversity-books/tree/master/starterkits/spark/python

1. A quick introduction to the Spark API
https://lnkd.in/g8Y3tdhX

2. Overview of Spark - RDDs, accumulators, and broadcast variables
https://lnkd.in/g7fepuFF

3. Spark SQL, Datasets, and DataFrames
https://lnkd.in/g3iZp7zk

4. PySpark - Processing data with Spark in Python
https://lnkd.in/gBnh6PAi

5. Processing data with SQL on the command line
https://lnkd.in/ggnxDaUu

6. Cluster Overview
https://lnkd.in/guCQnJnv

7. Packaging and deploying applications
https://lnkd.in/gUZpi2P9

8. Customize Spark via its configuration system
https://lnkd.in/gZh8Vkmv

9. Monitoring - Track the behavior of your applications
https://lnkd.in/grpGKFuP

10. Best practices to optimize performance and memory use
https://lnkd.in/gTRYBDQu
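
The overview topic in the list above names RDDs, accumulators, and broadcast variables. The following is a minimal sketch of how the three can fit together in one small job. It assumes a local PySpark installation with a Java runtime. The lookup table, `expand_code`, and `run_demo` are hypothetical names invented for this illustration; they do not come from this repo or the linked material.

```python
from operator import add

# Hypothetical lookup table, invented for this illustration.
COUNTRY_NAMES = {"IN": "India", "US": "United States"}


def expand_code(code, names):
    """Return the full country name for a code, or None if it is unknown."""
    return names.get(code)


def run_demo():
    """Run a tiny local Spark job (needs pyspark and a Java runtime)."""
    # Imported lazily so the plain-Python helper above works without Spark.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .master("local[2]")      # two local worker threads, no cluster needed
        .appName("rdd-demo")
        .getOrCreate()
    )
    sc = spark.sparkContext
    try:
        # Broadcast variable: a read-only value shipped once to each executor.
        names = sc.broadcast(COUNTRY_NAMES)
        # Accumulator: tasks can only add to it; the driver reads the total.
        unknown = sc.accumulator(0)

        def count_if_unknown(code):
            if expand_code(code, names.value) is None:
                unknown.add(1)

        # RDD built from a small in-memory list.
        codes = sc.parallelize(["IN", "US", "IN", "XX"])

        # foreach is an action, so the accumulator updates run here.
        codes.foreach(count_if_unknown)

        # Transformations (map, reduceByKey) followed by an action (collect).
        counts = (
            codes.map(lambda c: (expand_code(c, names.value) or "unknown", 1))
            .reduceByKey(add)
            .collect()
        )
        return dict(counts), unknown.value
    finally:
        spark.stop()
```

Calling `run_demo()` returns the per-country counts together with the number of unrecognized codes. The accumulator is updated inside `foreach` rather than inside `map` because Spark only guarantees exactly-once accumulator updates for work done in actions.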