https://github.com/maprihoda/learning-spark



apache-spark data-analysis data-science data-wrangling machine-learning pyspark python


Following [Learning Spark](https://github.com/databricks/LearningSparkV2), while adding my own modifications and code snippets and elaborating on the exercises.

Download the datasets from https://github.com/databricks/LearningSparkV2/subfolder, place them somewhere on your local drive, and set `DATA_DIRECTORY` in `setting.py` to that location. Mine is:

```python
import os

DATA_DIRECTORY = os.path.join(os.environ["HOME"], "data", "learning-spark")
```
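
To catch a wrong `DATA_DIRECTORY` before a Spark job starts, it can help to resolve dataset paths through one small function. The sketch below is only an illustration: `dataset_path` is a hypothetical helper that is not part of this repository, and it uses `os.path.expanduser("~")` instead of `os.environ["HOME"]` so that it also works where `HOME` is unset.

```python
import os
from pathlib import Path

# Assumed default, mirroring the setting above; "~" expands to the home directory.
DATA_DIRECTORY = os.path.join(os.path.expanduser("~"), "data", "learning-spark")


def dataset_path(*parts, base=DATA_DIRECTORY):
    """Return the path of a dataset file or folder under `base` as a string.

    Hypothetical helper for illustration only. Raises FileNotFoundError
    right away if the path does not exist, so a misconfigured
    DATA_DIRECTORY is reported before any Spark job is launched.
    """
    path = Path(base).joinpath(*parts)
    if not path.exists():
        raise FileNotFoundError(f"{path} not found; check DATA_DIRECTORY")
    return str(path)
```

The returned string can then be passed to a reader such as `spark.read.csv(...)`. The relative path you pass in depends on how you laid out the downloaded datasets.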