https://github.com/webdevcaptain/pyspark-intro
Introduction to PySpark
- Host: GitHub
- URL: https://github.com/webdevcaptain/pyspark-intro
- Owner: WebDevCaptain
- Created: 2025-02-01T10:21:45.000Z (5 months ago)
- Default Branch: main
- Last Pushed: 2025-02-01T11:23:29.000Z (5 months ago)
- Last Synced: 2025-02-01T12:25:13.105Z (5 months ago)
- Topics: pyspark
- Language: Jupyter Notebook
- Homepage:
- Size: 6.84 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Introduction to PySpark
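PySpark is the Python API for Apache Spark. Spark offers a faster, more flexible alternative to the traditional MapReduce framework, mainly because it can keep intermediate results in memory instead of writing them to disk between steps.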
## Contents
1. [RDD Notebook](./pyspark-primer.ipynb)
2. [PySpark SQL](./pyspark-intro.ipynb)
## References
- [PySpark](https://spark.apache.org/docs/latest/api/python/index.html)