https://github.com/rhejos/ipl_data_analysis
This project explores data analysis of the Indian Premier League using AWS S3, Apache Spark, Python, and SQL.
- Host: GitHub
- URL: https://github.com/rhejos/ipl_data_analysis
- Owner: rhejos
- Created: 2024-05-13T04:45:17.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2024-05-16T06:39:01.000Z (about 2 years ago)
- Last Synced: 2025-03-06T19:43:49.761Z (over 1 year ago)
- Topics: apache-spark, aws-s3, databricks-notebooks, pyspark, sql
- Language: Python
- Homepage:
- Size: 276 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Indian Premier League Data Analysis
Ball-by-ball data for all IPL seasons through 2017 (637 matches).
## SUMMARY
This data set contains ball-by-ball data for all Indian Premier League (IPL) matches up to and including the 2017 season. It is made up of six different datasets.
Source: http://cricsheet.org/ (the data is available on this website in YAML format and was converted to CSV using an R script, SQL, and SSIS).
Project Design: https://www.youtube.com/watch?v=0iNJPKheQqM
### Description
This project uses Apache Spark, Python, and SQL.
The Indian Premier League is a cricket league. The data is made up of six separate datasets, including:
- Ball by ball
- Match
- Player
- Player match
- Team
The data dictionary for these datasets can be found here: https://data.world/raghu543/ipl-data-till-2017/workspace/data-dictionary
#### AWS S3 Bucket
The data for this project is stored in the `rhea-github` AWS S3 bucket.
#### Databricks platform
The Databricks platform and Jupyter notebooks were used for this project.
### Visualizations
#### Top 10 Economical Players in the Indian Premier League
This shows the ten most economical players. An economical bowler/player is one who concedes relatively few runs per over while bowling.
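
To make the definition concrete, here is a minimal sketch of an economy-rate calculation in plain Python. The row layout, the sample values, and the function names are assumptions for illustration and are not taken from the project's notebooks; which extras count against the bowler is also simplified.

```python
from collections import defaultdict

# Hypothetical ball-by-ball rows: (bowler, runs_conceded, is_legal_delivery).
# The real dataset's column names and extras handling may differ.
deliveries = [
    ("Bowler A", 1, True), ("Bowler A", 0, True), ("Bowler A", 4, True),
    ("Bowler A", 0, True), ("Bowler A", 1, True), ("Bowler A", 0, True),
    ("Bowler B", 6, True), ("Bowler B", 1, False), ("Bowler B", 2, True),
    ("Bowler B", 4, True), ("Bowler B", 0, True), ("Bowler B", 1, True),
    ("Bowler B", 4, True),
]


def economy_rates(rows):
    """Return {bowler: runs conceded per over}, with six legal balls per over."""
    runs = defaultdict(int)
    balls = defaultdict(int)
    for bowler, conceded, legal in rows:
        runs[bowler] += conceded
        if legal:  # illegal deliveries add runs but do not count as a ball bowled
            balls[bowler] += 1
    return {b: runs[b] / (balls[b] / 6) for b in runs if balls[b] > 0}


def most_economical(rows, n=10):
    """Return the n bowlers with the lowest economy rate, best first."""
    rates = economy_rates(rows)
    return sorted(rates.items(), key=lambda item: item[1])[:n]
```

Calling `most_economical(deliveries)` on the sample rows ranks "Bowler A" first. In practice a minimum number of balls bowled is often required so that a bowler with a single over does not dominate the list.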

#### Impact of winning toss on match outcomes
This shows how many matches were won or lost by the team that won the coin toss.
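
A count like this can be expressed as a single SQL aggregation. The sketch below runs the query with Python's built-in `sqlite3` module so it works without a Spark cluster; the table name `matches`, its columns, and the sample rows are hypothetical, and the project itself may phrase the query differently.

```python
import sqlite3

# In-memory database with a hypothetical, simplified match table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE matches (match_id INTEGER, toss_winner TEXT, match_winner TEXT)"
)
conn.executemany(
    "INSERT INTO matches VALUES (?, ?, ?)",
    [
        (1, "Team A", "Team A"),
        (2, "Team B", "Team A"),
        (3, "Team A", "Team B"),
        (4, "Team C", "Team C"),
    ],
)

# Label each match by whether the toss winner also won, then count per label.
toss_outcomes = dict(
    conn.execute(
        """
        SELECT CASE WHEN toss_winner = match_winner THEN 'won' ELSE 'lost' END AS outcome,
               COUNT(*) AS matches
        FROM matches
        GROUP BY outcome
        """
    ).fetchall()
)
conn.close()
```

Here `toss_outcomes` maps each outcome label to its match count.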

#### The Top 10 scorers' average runs in winning matches
This shows the average runs scored by the top scorers in matches their team won.
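
One possible reading of this metric is: keep only the matches a player's team won, average that player's runs across them, and take the ten highest averages. The sketch below follows that reading; the row layout, sample values, and function name are assumptions, not the project's actual code.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical per-match batting totals: (player, runs_in_match, team_won).
innings = [
    ("Player A", 60, True), ("Player A", 40, True), ("Player A", 90, False),
    ("Player B", 30, True), ("Player B", 10, True),
    ("Player C", 75, False),
]


def top_average_in_wins(rows, n=10):
    """Return the n players with the highest average runs in matches their team won."""
    runs_in_wins = defaultdict(list)
    for player, runs, team_won in rows:
        if team_won:  # scores from lost matches are ignored
            runs_in_wins[player].append(runs)
    averages = {player: mean(scores) for player, scores in runs_in_wins.items()}
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)[:n]
```

A player who never appears in a winning match, such as "Player C" in the sample rows, is left out of the result entirely.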

#### Distribution of Scores by Venue
This explores the scoring trends based on match venues.
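
As a rough illustration of what a per-venue distribution summarizes, the sketch below groups innings totals by venue and reports the minimum, median, and maximum. The venue names, totals, and function name are made up for the example.

```python
from collections import defaultdict
from statistics import median

# Hypothetical innings totals: (venue, total_runs).
innings_totals = [
    ("Venue X", 150), ("Venue X", 180), ("Venue X", 165),
    ("Venue Y", 120), ("Venue Y", 200),
]


def score_summary_by_venue(rows):
    """Return {venue: {"min", "median", "max"}} for the innings totals at each venue."""
    by_venue = defaultdict(list)
    for venue, total in rows:
        by_venue[venue].append(total)
    return {
        venue: {"min": min(totals), "median": median(totals), "max": max(totals)}
        for venue, totals in by_venue.items()
    }
```

These three numbers per venue are the same quantities a box plot would show at a glance.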

#### Team performance
This ranks teams by how many matches they won after winning the toss.
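
The ranking can be sketched with a counter keyed by team. The tuple layout, team names, and function name below are hypothetical and only illustrate the idea.

```python
from collections import Counter

# Hypothetical match results: (toss_winner, match_winner).
results = [
    ("Team A", "Team A"),
    ("Team A", "Team A"),
    ("Team B", "Team A"),
    ("Team B", "Team B"),
    ("Team C", "Team A"),
]


def rank_by_wins_after_toss(rows):
    """Return (team, wins) pairs, counting only matches where the toss winner also won."""
    wins = Counter(winner for toss, winner in rows if toss == winner)
    return wins.most_common()  # highest count first
```

Teams that never won a match after winning the toss, such as "Team C" here, do not appear in the ranking.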
