https://github.com/ramapinnimty/udacity-dataengineering-nanodegree

Projects done as part of the Udacity Data Engineering Nanodegree program.
apache-cassandra aws data-engineering postgresql python udacity-nanodegree

Host: GitHub
URL: https://github.com/ramapinnimty/udacity-dataengineering-nanodegree
Owner: ramapinnimty
License: apache-2.0
Created: 2022-06-13T05:28:53.000Z (about 4 years ago)
Default Branch: main
Last Pushed: 2022-07-16T14:00:39.000Z (almost 4 years ago)
Last Synced: 2025-03-14T07:45:52.143Z (over 1 year ago)
Topics: apache-cassandra, aws, data-engineering, postgresql, python, udacity-nanodegree
Language: Jupyter Notebook
Homepage:
Size: 2.39 MB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

# Udacity Data Engineering Nanodegree
Projects done as part of the [Data Engineering Nanodegree program](https://www.udacity.com/course/data-engineer-nanodegree--nd027) offered by Udacity.

## Project 1: [Data Modeling with PostgreSQL](https://github.com/ramapinnimty/Udacity-DataEngineering-Nanodegree/tree/main/01-Data%20Modeling/Project_01-Relational%20Databases-Data%20Modeling%20with%20PostgreSQL)
Developed a SQL database using PostgreSQL to model user activity data for a music streaming app.
* Created a relational database using PostgreSQL locally.
* Designed a star schema with well-defined fact and dimension tables, and normalized the tables.
* Built an ETL pipeline that loads the data into these tables so that queries can show which songs users are listening to.

*Tech stack: Python, PostgreSQL, Star Schema, ETL pipelines, Normalization*
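
As a rough illustration of the star-schema idea, here is a minimal sketch rather than the project's actual code. It assumes Python's built-in `sqlite3` as a stand-in for PostgreSQL so that it runs without a server, and the table names, column names, and sample rows are all hypothetical.

```python
import sqlite3

# Assumption: sqlite3 stands in for PostgreSQL; all names and rows are hypothetical.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Two dimension tables (users, songs) and one fact table (songplays).
cur.executescript("""
CREATE TABLE users (user_id INTEGER PRIMARY KEY, first_name TEXT, level TEXT);
CREATE TABLE songs (song_id TEXT PRIMARY KEY, title TEXT, artist_name TEXT);
CREATE TABLE songplays (
    songplay_id INTEGER PRIMARY KEY,
    start_time  TEXT NOT NULL,
    user_id     INTEGER NOT NULL REFERENCES users(user_id),
    song_id     TEXT REFERENCES songs(song_id)
);
""")

# "Load" step of a toy ETL: insert dimension rows first, then fact rows.
cur.executemany("INSERT INTO users VALUES (?, ?, ?)",
                [(1, "Ada", "paid"), (2, "Lin", "free")])
cur.executemany("INSERT INTO songs VALUES (?, ?, ?)",
                [("S1", "Song A", "Artist X"), ("S2", "Song B", "Artist Y")])
cur.executemany(
    "INSERT INTO songplays (start_time, user_id, song_id) VALUES (?, ?, ?)",
    [("2022-06-13 10:00", 1, "S1"),
     ("2022-06-13 10:05", 2, "S1"),
     ("2022-06-13 10:09", 1, "S2")],
)

# Analytical query: join the fact table to a dimension to count plays per song.
play_counts = cur.execute("""
SELECT s.title, COUNT(*) AS plays
FROM songplays sp
JOIN songs s ON sp.song_id = s.song_id
GROUP BY s.title
ORDER BY plays DESC, s.title
""").fetchall()
conn.close()
```

The fact table holds only keys and event attributes, so a question such as "which songs are played most" becomes a single join against one dimension table.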

## Project 2: [Data Modeling with Apache Cassandra](https://github.com/ramapinnimty/Udacity-DataEngineering-Nanodegree/tree/main/01-Data%20Modeling/Project_02-Non-Relational%20Databases-Data%20Modeling%20with%20Apache%20Cassandra)
Designed a NoSQL database using Apache Cassandra based on the original schema outlined in `Project 1`.
* Created a NoSQL database using Apache Cassandra locally.
* Developed denormalized tables optimized for a specific set of queries and business needs.

*Tech stack: Python, Apache Cassandra, Denormalization*
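
Query-first modeling in Cassandra usually means one table per query, with the primary key chosen to match that query's filter and sort columns. The sketch below only builds the CQL text for such a table and does not connect to a cluster. The helper `create_table_cql`, the table name, and the columns are hypothetical and not taken from the project.

```python
def create_table_cql(table, columns, partition_key, clustering=()):
    """Build a CQL CREATE TABLE statement (hypothetical helper for illustration).

    columns:       mapping of column name -> CQL type
    partition_key: columns the query filters on with equality
    clustering:    columns that order rows inside a partition
    """
    cols = ", ".join(f"{name} {ctype}" for name, ctype in columns.items())
    pk = "(" + ", ".join(partition_key) + ")"
    key = ", ".join([pk, *clustering])
    return f"CREATE TABLE IF NOT EXISTS {table} ({cols}, PRIMARY KEY ({key}))"


# Assumed target query: "all songs played in a given session, in play order".
cql = create_table_cql(
    "songs_by_session",
    {"session_id": "int", "item_in_session": "int", "artist": "text", "song": "text"},
    partition_key=["session_id"],
    clustering=["item_in_session"],
)
```

Because the table is shaped around a single query, the same song data may be duplicated across several such tables, which is the denormalization trade-off described above.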

## Project 3: [Data Warehouse using AWS](https://github.com/ramapinnimty/Udacity-DataEngineering-Nanodegree/tree/main/02-Cloud%20Data%20Warehouses/Project_03-Data%20Warehouse%20using%20AWS)
Created a data warehouse using Amazon Redshift.
* Created a Redshift cluster along with the appropriate IAM role and Security group.
* Developed an ETL pipeline that loads data from S3 buckets into staging tables on Redshift and then transforms it into a star schema.
* Optimized queries to enable faster loads as required by the Data Analytics team.

*Tech stack: Python, AWS CLI, Amazon SDK, PostgreSQL, Amazon S3, Amazon Redshift*
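
The S3-to-staging step can be sketched as building a Redshift `COPY` statement. This is a minimal sketch that assumes JSON source files and only produces the SQL text. The helper name, bucket path, role ARN, and region are placeholders, not values from the project.

```python
def copy_from_s3(table, s3_path, iam_role_arn, region):
    """Return a Redshift COPY statement that loads JSON files from S3.

    Hypothetical helper: the statement would be run through a database
    connection, which is omitted here.
    """
    return (
        f"COPY {table} FROM '{s3_path}' "
        f"IAM_ROLE '{iam_role_arn}' "
        f"FORMAT AS JSON 'auto' REGION '{region}';"
    )


# Placeholder values for illustration only.
copy_sql = copy_from_s3(
    table="staging_events",
    s3_path="s3://example-bucket/log_data",
    iam_role_arn="arn:aws:iam::123456789012:role/example-redshift-role",
    region="us-west-2",
)
```

After the staging tables are loaded, `INSERT INTO ... SELECT` statements would typically move the rows into the fact and dimension tables.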
