https://github.com/kayannr/sportstats
Historical data analysis using SQL, Databricks, Python, PandaSQL, Pandas, and SQL window functions.
pandasql scala spark-sql sql
- Host: GitHub
- URL: https://github.com/kayannr/sportstats
- Owner: kayannr
- Created: 2022-09-09T18:32:31.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2022-09-23T19:27:04.000Z (over 2 years ago)
- Last Synced: 2025-01-23T03:14:42.565Z (4 months ago)
- Topics: pandasql, scala, spark-sql, sql
- Language: Jupyter Notebook
- Homepage:
- Size: 34.2 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# sportstats
This project analyzes a historical Olympics dataset using SQL. Tasks performed:
* Data collection and aggregation
* Data cleaning and deduplication
* Data quality and validity assessment
* Exploratory data analysis using SQL
* Hypothesis development and testing using Python and Pandas
* Recommendation development based on results obtained
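
As a rough illustration of the deduplication and SQL window-function steps listed above, here is a minimal sketch. It is not taken from the repository's notebooks: the `results` table, its columns, the sample rows, and the `medal_totals` helper are all hypothetical. It also uses Python's built-in `sqlite3` module in place of Databricks or PandaSQL so that it runs without extra dependencies (window functions need SQLite 3.25 or newer).

```python
import sqlite3

# Hypothetical sample rows (athlete, country, year, medal); not real Olympics data.
rows = [
    ("A. Runner", "USA", 2008, "Gold"),
    ("A. Runner", "USA", 2008, "Gold"),    # exact duplicate, removed by DISTINCT
    ("B. Swimmer", "USA", 2012, "Silver"),
    ("C. Jumper", "KEN", 2008, "Gold"),
    ("C. Jumper", "KEN", 2012, "Gold"),
    ("D. Rower", "KEN", 2012, None),       # competed, no medal
]


def medal_totals(rows):
    """Return (country, year, medals, running_medals) tuples, ordered by country and year."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE results (athlete TEXT, country TEXT, year INTEGER, medal TEXT)"
    )
    conn.executemany("INSERT INTO results VALUES (?, ?, ?, ?)", rows)

    query = """
    WITH deduped AS (              -- data cleaning: drop exact duplicate rows
        SELECT DISTINCT athlete, country, year, medal
        FROM results
    ),
    per_games AS (                 -- aggregation: COUNT(medal) ignores NULLs
        SELECT country, year, COUNT(medal) AS medals
        FROM deduped
        GROUP BY country, year
    )
    SELECT country,
           year,
           medals,
           -- window function: cumulative medals per country over time
           SUM(medals) OVER (PARTITION BY country ORDER BY year) AS running_medals
    FROM per_games
    ORDER BY country, year
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


for row in medal_totals(rows):
    print(row)
```

The same query text should carry over to Spark SQL or PandaSQL with little or no change, since it relies only on standard `WITH`, `GROUP BY`, and `OVER (PARTITION BY ... ORDER BY ...)` syntax.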