https://github.com/dimitrov-s-dev/pyspark

PySpark
pyspark python3 spark spark-sql

![PySpark](https://github.com/Dimitrov-S-Dev/PySpark/blob/master/pyspark.jpg)


# Big Data Practices with PySpark & Spark Tuning
Analysis of semi-structured (JSON), structured, and unstructured data with Spark and Python, plus Spark performance tuning.

## Acquired skills
- Apache Spark's framework, execution model, and programming model.
- Lazy evaluation, narrow vs. wide transformations, and Spark's internal workings.
- PySpark practice on structured, unstructured, and semi-structured data using RDDs, DataFrames, and SQL.
- Building simple to advanced Big Data applications that handle different data characteristics (volume, variety, veracity) through real case studies.
- Applying Adaptive Query Execution (AQE) to optimize Spark SQL query execution at runtime.
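
Lazy evaluation and the narrow vs. wide distinction from the list above can be illustrated with a toy model in plain Python. This is a sketch for intuition only: `LazyDataset`, `reduce_by_key`, and `traced_double` are hypothetical names invented here, not PySpark APIs, and the model ignores Spark's planning, optimization, and distribution. It only shows that transformations are recorded until an action runs, and that a narrow step works within each partition while a wide step moves records between partitions.

```python
from collections import defaultdict
from functools import reduce


class LazyDataset:
    """Toy stand-in for a partitioned dataset (hypothetical, NOT Spark's API)."""

    def __init__(self, partitions, plan=()):
        self.partitions = partitions  # list of lists, one list per partition
        self.plan = plan              # recorded transformations, not yet run

    # Transformations only extend the plan; no data is touched here.
    def map(self, fn):
        return LazyDataset(self.partitions, self.plan + (("map", fn),))

    def filter(self, pred):
        return LazyDataset(self.partitions, self.plan + (("filter", pred),))

    def reduce_by_key(self, fn):
        return LazyDataset(self.partitions, self.plan + (("reduce_by_key", fn),))

    # The action: runs every recorded step and returns the results.
    def collect(self):
        parts = [list(p) for p in self.partitions]
        for op, fn in self.plan:
            if op == "map":        # narrow: each partition handled alone
                parts = [[fn(x) for x in p] for p in parts]
            elif op == "filter":   # narrow: each partition handled alone
                parts = [[x for x in p if fn(x)] for p in parts]
            else:                  # wide: records with the same key must meet
                buckets = [defaultdict(list) for _ in parts]
                for p in parts:
                    for key, value in p:
                        # "Shuffle": route each record to a bucket by its key.
                        buckets[hash(key) % len(parts)][key].append(value)
                parts = [[(k, reduce(fn, vs)) for k, vs in b.items()]
                         for b in buckets]
        return [x for p in parts for x in p]


calls = []  # records every invocation of the mapped function


def traced_double(pair):
    calls.append(pair)
    return (pair[0], pair[1] * 2)


data = LazyDataset([[("a", 1), ("b", 2)], [("a", 3), ("b", 4)]])
pipeline = (
    data.filter(lambda kv: kv[1] > 1)
        .map(traced_double)
        .reduce_by_key(lambda x, y: x + y)
)

calls_before_action = len(calls)   # still 0: nothing has executed yet
totals = dict(pipeline.collect())  # the action triggers the whole plan
```

Building `pipeline` performs no work; `traced_double` is first called inside `collect()`. In real Spark the same roles are played by transformations such as `filter` and `map` (narrow), `reduceByKey` or `groupBy` (wide, requiring a shuffle), and actions such as `collect()`.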