https://github.com/shubhammohanty680/spotify_end_to_end_data_engineering

# Spotify_End_To_End_Data_Engineering Project

### Introduction
This project builds an ETL (Extract, Transform, Load) pipeline on AWS using the Spotify API. The pipeline retrieves data from the Spotify API, transforms it into the desired format, and loads it into AWS S3 for storage.
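
As a rough illustration of the transform step, the sketch below flattens a nested API response into tabular rows and serializes them as CSV. The README does not say which Spotify endpoint or fields the project uses, so the playlist-items response shape, the chosen columns, and the function names `flatten_tracks` and `rows_to_csv` are assumptions made for this example.

```python
import csv
import io


def flatten_tracks(payload):
    """Turn a playlist-items style response into one flat row per track."""
    rows = []
    for item in payload.get("items", []):
        track = item.get("track")
        if not track:  # entries without track data are skipped
            continue
        rows.append({
            "track_id": track["id"],
            "track_name": track["name"],
            "artist_names": ", ".join(a["name"] for a in track.get("artists", [])),
            "album_name": track["album"]["name"],
            "duration_ms": track["duration_ms"],
            "added_at": item.get("added_at"),
        })
    return rows


def rows_to_csv(rows):
    """Serialize rows to CSV text, suitable as the body of an S3 object."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical sample data shaped like a playlist-items response.
sample = {
    "items": [
        {
            "added_at": "2024-01-01T00:00:00Z",
            "track": {
                "id": "t1",
                "name": "Song A",
                "duration_ms": 200000,
                "album": {"name": "Album A"},
                "artists": [{"name": "Artist 1"}, {"name": "Artist 2"}],
            },
        },
        {"added_at": "2024-01-02T00:00:00Z", "track": None},
    ]
}

rows = flatten_tracks(sample)
csv_text = rows_to_csv(rows)
```

Keeping the transform as pure functions over plain dictionaries makes it easy to test locally before wiring it into a Lambda function.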

### Architecture
![Architecture Diagram](https://github.com/user-attachments/assets/504350dd-973c-4a4c-8459-d008eb0edb31)

### Services Used
1. **AWS S3 (Simple Storage Service):** AWS S3 is a highly scalable object storage service for storing and retrieving any amount of data from anywhere on the web. It is commonly used to store and distribute large files.

2. **AWS Lambda:** AWS Lambda is a serverless computing service that lets you run code without managing servers.

3. **Amazon CloudWatch:** Amazon CloudWatch is a monitoring service for AWS resources and the applications you run on them. It collects and tracks metrics, collects and monitors log files, and sets alarms.

4. **AWS Glue Data Catalog:** The AWS Glue Data Catalog is a centralized metadata repository for all your data assets across various data sources.

5. **AWS Glue Crawler:** An AWS Glue crawler scans the data sources, identifies data formats, infers schemas, and creates the corresponding table definitions in the AWS Glue Data Catalog.

6. **Amazon Athena:** Amazon Athena is an interactive query service that makes it easy to analyze data stored in S3 using standard SQL. It can also query the tables registered in the Glue Data Catalog.
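
To make the roles of these services more concrete, the sketch below shows how a Lambda function might write a raw API payload to S3 and how a query request for Athena could be assembled. It uses the keyword arguments of boto3's `put_object` and `start_query_execution`, but the bucket name, key prefix, database name, table name, and all helper names are hypothetical and do not come from the project. A small fake client stands in for `boto3.client("s3")` so the example runs without AWS credentials.

```python
import json
from datetime import datetime, timezone


def build_raw_key(prefix, now):
    """Build a timestamped object key so each run writes a new file."""
    return f"{prefix}/spotify_raw_{now.strftime('%Y%m%dT%H%M%SZ')}.json"


def store_raw(s3_client, bucket, payload, now=None):
    """Upload the raw payload to S3 as JSON and return the object key."""
    now = now or datetime.now(timezone.utc)
    key = build_raw_key("raw", now)  # "raw" is an assumed prefix
    body = json.dumps(payload).encode("utf-8")
    s3_client.put_object(Bucket=bucket, Key=key, Body=body)
    return key


def athena_request(database, table, output_location):
    """Keyword arguments for the Athena client's start_query_execution."""
    return {
        "QueryString": f'SELECT * FROM "{table}" LIMIT 10',
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_location},
    }


class FakeS3:
    """Stand-in for boto3.client("s3"); records uploads in memory."""

    def __init__(self):
        self.objects = {}

    def put_object(self, Bucket, Key, Body):
        self.objects[(Bucket, Key)] = Body


fake = FakeS3()
key = store_raw(
    fake,
    "example-bucket",  # hypothetical bucket name
    {"items": []},
    now=datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc),
)
request = athena_request(
    "spotify_db",  # hypothetical Glue database
    "songs",  # hypothetical table created by a crawler
    "s3://example-bucket/athena-results/",
)
```

Inside a real Lambda function, `boto3.client("s3")` would replace `FakeS3()`, and the request could be submitted with `boto3.client("athena").start_query_execution(**request)`.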