https://github.com/sarthak-1408/pyspark-tutorial

This repo is a PySpark tutorial for better understanding how to read and manage big data.

machine-learning pyspark pyspark-mllib pyspark-python pyspark-tutorial python3


# PySpark Tutorial
## Overview

- PySpark is an interface for Apache Spark in Python. It not only allows you to write Spark applications using Python APIs, but also provides the PySpark shell for interactively analyzing your data in a distributed environment.
- In this repository, I explain PySpark step by step and show how to read data, handle missing values, and more with it.
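
As a rough illustration of the "handle missing values" step mentioned above, here is a minimal sketch. It assumes PySpark and a Java runtime are already set up (see Installation). The sample rows, the function name `missing_values_demo`, and the `default_age` parameter are hypothetical and not taken from the tutorial notebooks.

```python
# Hypothetical sample data: (name, age) rows, one of them with a missing age.
SAMPLE_ROWS = [("Alice", 34), ("Bob", None), ("Cara", 29)]


def missing_values_demo(rows=SAMPLE_ROWS, default_age=0):
    """Build a small DataFrame, then drop or fill the rows with a missing age."""
    # Imported inside the function so this file can be loaded
    # even on a machine where PySpark is not installed yet.
    from pyspark.sql import SparkSession

    # Start (or reuse) a local Spark session with a single worker thread.
    spark = (
        SparkSession.builder
        .master("local[1]")
        .appName("missing-values-demo")
        .getOrCreate()
    )
    try:
        # In-memory data keeps the sketch self-contained. Reading a file
        # would instead use: spark.read.csv(path, header=True, inferSchema=True)
        df = spark.createDataFrame(rows, schema="name string, age int")

        # Option 1: drop every row whose "age" is null.
        dropped = df.na.drop(subset=["age"]).collect()

        # Option 2: keep every row and replace null ages with a default.
        filled = df.na.fill({"age": default_age}).collect()

        return dropped, filled
    finally:
        # Release the local Spark resources.
        spark.stop()
```

Calling `missing_values_demo()` from a Python session returns two lists of `Row` objects: one without the row that had a missing age, and one where that age is replaced by `default_age`. Whether dropping or filling is appropriate depends on the dataset.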

## Installation
```sh
pip install pyspark
```
- For installation on Windows, Mac, and Linux, see: https://www.datacamp.com/community/tutorials/installation-of-pyspark
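
To confirm that the `pip install pyspark` step worked, a small check like the one below may help. It is only a sketch: the helper name `installed_pyspark_version` is hypothetical, and it assumes Python 3.8 or newer (for `importlib.metadata`). It checks only that the package is present, not that a Java runtime is available, which PySpark also needs in order to run.

```python
from importlib import metadata
from typing import Optional


def installed_pyspark_version() -> Optional[str]:
    """Return the installed PySpark version string, or None if it is missing."""
    try:
        return metadata.version("pyspark")
    except metadata.PackageNotFoundError:
        return None


version = installed_pyspark_version()
if version is None:
    print("PySpark is not installed; run `pip install pyspark` first.")
else:
    print(f"PySpark {version} is installed.")
```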

## Credits
- Krish Naik (https://www.youtube.com/user/krishnaik06)