Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/prekshivyas/livebeats
LiveBeats: A live dashboard for real-time music streaming insights.
airflow cloud data-analytics data-engineering dbt gcp kafka spark-streaming
- Host: GitHub
- URL: https://github.com/prekshivyas/livebeats
- Owner: prekshivyas
- Created: 2024-06-21T09:44:15.000Z (5 months ago)
- Default Branch: main
- Last Pushed: 2024-06-24T11:15:04.000Z (5 months ago)
- Last Synced: 2024-06-25T02:58:57.065Z (5 months ago)
- Topics: airflow, cloud, data-analytics, data-engineering, dbt, gcp, kafka, spark-streaming
- Homepage:
- Size: 396 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# LiveBeats (in progress)
I have always been an ardent lover of the cloud and Apache Airflow.
In this project I dig into GCP and data analytics to build LiveBeats: an end-to-end data pipeline that uses Apache Kafka, Apache Spark Streaming, dbt, Docker, Airflow, Terraform, and GCP to visualize real-time streaming music data.
## Architecture
![Architecture](images/SpotifyDataStreamingAnalytics.png)

- Dataset:
1. [Eventsim](http://millionsongdataset.com/pages/getting-dataset/#subset)
- Tools & Technologies:
1. Cloud - Google Cloud Platform
2. Infrastructure as Code software - Terraform
3. Containerization - Docker, Docker Compose
4. Stream Processing - Kafka, Spark Streaming
5. Orchestration - Airflow
6. Transformation - dbt
7. Data Lake - Google Cloud Storage
8. Data Warehouse - BigQuery
9. Data Visualization - Data Studio
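
To make the stream-processing step in the list above a little more concrete, here is a minimal pure-Python sketch of a tumbling-window play count. It is not the project's code. The event fields (`ts`, `user_id`, `song`), the function names, and the 60-second window are all assumptions made for illustration. In the actual pipeline this kind of aggregation would presumably run in Spark Streaming over events read from Kafka.

```python
from collections import Counter
from datetime import datetime, timezone

# Hypothetical listen events; the field names are assumptions, not the real schema.
EVENTS = [
    {"ts": "2024-06-21T09:00:05+00:00", "user_id": 1, "song": "Song A"},
    {"ts": "2024-06-21T09:00:40+00:00", "user_id": 2, "song": "Song B"},
    {"ts": "2024-06-21T09:01:10+00:00", "user_id": 1, "song": "Song A"},
    {"ts": "2024-06-21T09:01:30+00:00", "user_id": 3, "song": "Song A"},
]


def window_start(ts: str, size_seconds: int = 60) -> datetime:
    """Floor an ISO-8601 timestamp to the start of its tumbling window."""
    epoch = int(datetime.fromisoformat(ts).timestamp())
    return datetime.fromtimestamp(epoch - epoch % size_seconds, tz=timezone.utc)


def plays_per_window(events, size_seconds: int = 60):
    """Count plays per (window, song), as a streaming job might before loading a warehouse table."""
    counts = Counter(
        (window_start(e["ts"], size_seconds), e["song"]) for e in events
    )
    return [
        {"window_start": w.isoformat(), "song": s, "plays": n}
        for (w, s), n in sorted(counts.items())
    ]


if __name__ == "__main__":
    for row in plays_per_window(EVENTS):
        print(row)
```

Each output row has the shape a dashboard query could read directly: one window, one song, one play count.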