https://github.com/yukta026/tokyo-olympics-2021-analytics

An end-to-end ETL pipeline for analyzing and visualizing Tokyo Olympics 2021 data using Azure tools and Power BI.
azure data-engineering etl hadoop powerbi python3 spark sql

## Tokyo Olympics 2021 Analytics Dashboard

This project builds an ETL (Extract, Transform, Load) pipeline for analyzing data from the Tokyo Olympics 2021.
Each phase of the pipeline runs on a dedicated Azure service, and the processed data is visualized in Power BI.

### Pipeline Phases and Azure Tools Used:

1. **Data Extraction**
   - **Azure Data Lake Storage Gen2**: Stores the raw data.
   - **Azure Data Factory**: Orchestrates the data extraction process.

2. **Data Transformation**
   - **Azure Databricks**: Processes the data with Spark and Hadoop.
   - **Python**: Implements custom transformation logic.

3. **Data Loading**
   - **Azure Data Lake Storage Gen2**: Stores the processed data.
   - **Azure Synapse Analytics**: Provides large-scale data warehousing and analytics.

4. **Data Visualization**
   - **Power BI**: Provides interactive dashboards and reports that visualize insights from the data.
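
The extract, transform, and load phases listed above can be illustrated on a small scale with plain Python. This is only a sketch of the pattern, not the project's actual Databricks code: the column names (`Team`, `Gold`, `Silver`, `Bronze`), the sample rows, and the helper functions `extract`, `transform`, and `load` are all hypothetical, and in-memory CSV text stands in for Data Lake storage.

```python
import csv
import io

# Illustrative sample only: team names, medal counts, and column names are
# made up for this sketch and are not taken from the real dataset.
RAW_CSV = """Team,Gold,Silver,Bronze
Team A,5,7,3
Team B,12,4,9
Team C,8,8,1
"""

FIELDS = ["Team", "Gold", "Silver", "Bronze", "Total"]


def extract(text):
    """Extract: parse raw CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: cast medal counts to int, add a Total column, sort by gold."""
    cleaned = []
    for row in rows:
        gold, silver, bronze = (int(row[k]) for k in ("Gold", "Silver", "Bronze"))
        cleaned.append({
            "Team": row["Team"].strip(),
            "Gold": gold,
            "Silver": silver,
            "Bronze": bronze,
            "Total": gold + silver + bronze,
        })
    return sorted(cleaned, key=lambda r: r["Gold"], reverse=True)


def load(rows):
    """Load: serialize processed rows to CSV text (stand-in for a storage write)."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


processed = transform(extract(RAW_CSV))
print(load(processed))
```

In a Spark-based pipeline the same three steps would typically map to reading from the raw storage location, applying DataFrame transformations, and writing to the processed location; keeping each phase in its own function makes that substitution straightforward.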

## Dashboard
*Dashboard screenshot (captured 2024-09-09).*

## References
1) https://www.kaggle.com/datasets/arjunprasadsarkhel/2021-olympics-in-tokyo
2) https://www.youtube.com/watch?v=IaA9YNlg5hM
3) https://www.youtube.com/watch?v=nW0ffUW2vw4&t=0s
4) https://www.numerro.io/design-challenges/olympic-games