https://github.com/swapnitagoyal/cricket-match-data-analysis
ETL on the raw cricket json data using snowflake
- Host: GitHub
- URL: https://github.com/swapnitagoyal/cricket-match-data-analysis
- Owner: SwapnitaGoyal
- Created: 2025-02-01T05:44:20.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2025-02-01T05:45:53.000Z (over 1 year ago)
- Last Synced: 2025-04-05T19:36:00.154Z (about 1 year ago)
- Topics: json, snowflake, sql
- Homepage:
- Size: 14.6 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# Cricket-Match-Data-Analysis
This project demonstrates how to transform raw cricket-related JSON data into structured dimension and transactional tables using Snowflake. The focus is on extracting details about players, teams, venues, matches, and innings to populate the tables listed under "Processed Data Includes" below.
Data Source: https://cricsheet.org/downloads/
Technologies Used:
-> Snowflake (Data Warehousing & Processing)
-> SQL (Data Transformation)
-> JSON Handling (Parsing & Structuring Data)
This project follows the ETL (Extract, Transform, Load) process to clean, transform, and load the data into Snowflake's structured tables.
Processed Data Includes:
DimPlayer : Player details
DimTeam : Team information
DimVenue : Venue details
Match_Details : Match-level data
Innings_Details : Ball-by-ball or innings-level information
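To illustrate the shape of the transformation, here is a minimal Python sketch that flattens one match document into rows for the five tables above. It is not the repository's Snowflake SQL. The inline sample only imitates the layout of a Cricsheet JSON file (`info`, `innings`, `overs`, `deliveries`), its values are made up, and every column name (`team_name`, `ball_no`, and so on) is a hypothetical choice for illustration.

```python
import json

# Made-up sample that imitates the layout of a Cricsheet match file.
SAMPLE = json.loads("""
{
  "info": {
    "teams": ["Team A", "Team B"],
    "venue": "Sample Ground",
    "city": "Sample City",
    "match_type": "T20",
    "dates": ["2024-01-01"],
    "players": {"Team A": ["P1", "P2"], "Team B": ["P3", "P4"]}
  },
  "innings": [
    {"team": "Team A",
     "overs": [{"over": 0, "deliveries": [
       {"batter": "P1", "bowler": "P3",
        "runs": {"batter": 4, "extras": 0, "total": 4}},
       {"batter": "P1", "bowler": "P3",
        "runs": {"batter": 0, "extras": 1, "total": 1}}
     ]}]}
  ]
}
""")


def flatten_match(match, match_id):
    """Split one nested match document into flat rows per target table."""
    info = match["info"]

    # Dimension rows: one per team, player, and venue (column names are hypothetical).
    dim_team = [{"team_name": team} for team in info["teams"]]
    dim_player = [
        {"player_name": player, "team_name": team}
        for team, players in info["players"].items()
        for player in players
    ]
    dim_venue = [{"venue_name": info["venue"], "city": info.get("city")}]

    # One match-level row.
    match_details = [{
        "match_id": match_id,
        "match_type": info["match_type"],
        "match_date": info["dates"][0],
        "team_1": info["teams"][0],
        "team_2": info["teams"][1],
        "venue_name": info["venue"],
    }]

    # One row per delivery, found by walking innings -> overs -> deliveries.
    innings_details = []
    for innings_no, innings in enumerate(match["innings"], start=1):
        for over in innings["overs"]:
            for ball_no, delivery in enumerate(over["deliveries"], start=1):
                innings_details.append({
                    "match_id": match_id,
                    "innings_no": innings_no,
                    "batting_team": innings["team"],
                    "over_no": over["over"],
                    "ball_no": ball_no,
                    "batter": delivery["batter"],
                    "bowler": delivery["bowler"],
                    "total_runs": delivery["runs"]["total"],
                })

    return {
        "DimTeam": dim_team,
        "DimPlayer": dim_player,
        "DimVenue": dim_venue,
        "Match_Details": match_details,
        "Innings_Details": innings_details,
    }


tables = flatten_match(SAMPLE, match_id=1)
for name, rows in tables.items():
    print(name, len(rows))
```

In Snowflake the same nesting is typically unpacked in SQL rather than in application code; the sketch is only meant to show which level of the JSON feeds which table.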
Note: To reduce Snowflake credit consumption, only one JSON file is used as input. Feel free to add more JSON input files as needed for your use case.
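One thing to watch when adding more input files is that the same team appears in many matches, so dimension rows need deduplicating before they are loaded. The Python sketch below shows the idea on two made-up documents. The `collect_teams` helper and the `team_id` surrogate key are hypothetical, and this is not how the repository itself handles the case.

```python
import json


def collect_teams(raw_documents):
    """Return one row per distinct team across several match JSON strings."""
    teams = set()
    for raw in raw_documents:
        teams.update(json.loads(raw)["info"]["teams"])
    # Sort for a stable order, then assign a hypothetical surrogate key.
    return [
        {"team_id": team_id, "team_name": name}
        for team_id, name in enumerate(sorted(teams), start=1)
    ]


# Two made-up match documents that share "Team B".
docs = [
    '{"info": {"teams": ["Team A", "Team B"]}}',
    '{"info": {"teams": ["Team B", "Team C"]}}',
]
dim_team = collect_teams(docs)
print(dim_team)
```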