https://github.com/tatevkaren/pyspark_tutorial


# PySpark Cheat Sheet For Big Data Analytics








For this article, we used the Stroke Prediction Dataset, which is publicly available on Kaggle.


The following topics are included in this tutorial:

- Loading Data
- Viewing Data
- Selecting Data
- Counting Data
- Unique Values
- Filtering Data
- Ordering Data
- Creating New Variables
- Deleting Data
- Changing Data Types
- Conditions
- Data Aggregation


A detailed explanation and sample outputs can be found in the Medium article "PySpark Cheat Sheet For Big Data Analytics".