https://github.com/datitran/emr-bootstrap-pyspark
Quickstart PySpark with Anaconda on AWS/EMR
- Host: GitHub
- URL: https://github.com/datitran/emr-bootstrap-pyspark
- Owner: datitran
- License: mit
- Created: 2016-11-06T15:17:51.000Z (about 8 years ago)
- Default Branch: master
- Last Pushed: 2017-01-09T08:25:16.000Z (almost 8 years ago)
- Last Synced: 2024-10-10T12:36:32.870Z (about 1 month ago)
- Topics: aws, emr, python3
- Language: Python
- Homepage: https://medium.com/@datitran/quickstart-pyspark-with-anaconda-on-aws-660252b88c9a?source=user_profile---------15----------------
- Size: 7.81 KB
- Stars: 53
- Watchers: 6
- Forks: 19
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# EMR Bootstrap PySpark with Anaconda
This code helps you jump-start PySpark with Anaconda on AWS EMR.
## Getting Started
1. `conda env create -f environment.yml`
2. Fill in the required information (e.g. AWS access key, secret access key) in the `config.yml.example` file and rename it to `config.yml`
3. Run it: `python emr_loader.py`
## Requirements
- [Anaconda 3](https://www.continuum.io/downloads)
- [AWS Account](https://aws.amazon.com/)
## Copyright
See [LICENSE](LICENSE) for details.
Copyright (c) 2016 [Dat Tran](http://www.dat-tran.com/).
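
As a rough illustration of step 2 in Getting Started, the sketch below checks that `config.yml` has been filled in before `emr_loader.py` is run. It is not part of the repository: the key names in `REQUIRED_KEYS` are hypothetical placeholders (the real ones are whatever `config.yml.example` defines), the helpers `missing_keys` and `load_config` are invented for this example, and the config is assumed to be a flat YAML mapping.

```python
from pathlib import Path

# Hypothetical key names, for illustration only; the authoritative list is
# whatever config.yml.example in the repository defines.
REQUIRED_KEYS = ("aws_access_key", "aws_secret_access_key", "region")


def missing_keys(config, required=REQUIRED_KEYS):
    """Return the required keys that are absent or left empty."""
    return [key for key in required if not config.get(key)]


def load_config(path="config.yml"):
    """Parse the YAML config file (assumes the PyYAML package is installed)."""
    import yaml  # imported lazily so missing_keys works without PyYAML

    # An empty file parses to None, so fall back to an empty mapping.
    return yaml.safe_load(Path(path).read_text()) or {}
```

Calling `missing_keys(load_config())` before `python emr_loader.py` would return an empty list once every assumed key has a value, and otherwise the names that still need filling in.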