https://github.com/giulic3/data-engineering-nanodegree

Projects realized for the Data Engineering Nanodegree offered by Udacity https://www.udacity.com/course/data-engineer-nanodegree--nd027

apache-airflow apache-cassandra apache-spark aws aws-emr aws-redshift aws-s3 data-engineering postgresql

# data-engineering-nanodegree

This is a collection of the projects completed for the Data Engineering Nanodegree offered by Udacity (https://www.udacity.com/course/data-engineer-nanodegree--nd027).

## Course overview and projects
The course is divided into 4 blocks of lessons. Each block consists of a theoretical introduction to its topics, a series of demos for hands-on practice with the concepts explained, and one or two projects:

***1. Data Modeling***
* Introduction to Data Modeling
* Relational Data Models
* _[Proj1]_: Data Modeling with Postgres
* NoSQL Data Models
* _[Proj2]_: Data Modeling with Apache Cassandra

***2. Cloud Data Warehouses***
* Introduction to Data Warehouses
* Introduction to Cloud Computing and AWS
* Implementing Data Warehouses on AWS
* _[Proj3]_: Data Warehouse

***3. Data Lakes with Spark***
* The Power of Spark
* Data Wrangling with Spark
* Debugging and Optimization
* Introduction to Data Lakes
* _[Proj4]_: Data Lake

***4. Data Pipelines with Airflow***
* Data Pipelines
* Data Quality
* Production Data Pipelines
* _[Proj5]_: Data Pipelines

***5. Bonus: [CapstoneProject]***