https://github.com/k-l-16/marathon-data-analysis


# Marathon-Data-Analysis
Cleaning, enriching, and analyzing marathon race data.

---
- **Python**
  - `pandas` (data cleaning and transformation)
  - `geopy` (latitude and longitude retrieval based on city/state)
- **SQL**
  - Aggregation queries
  - Window functions (RANK)
  - CTEs (Common Table Expressions)
- **Tools**
  - DataGrip (SQL query management)
  - GitHub (version control)
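
As a rough illustration of how the `pandas` and `geopy` pieces might fit together, here is a minimal sketch. It is not the repository's actual code: the column names (`first_name`, `last_name`, `city`, `state`, `finish_time`), the function names, and the file names are all assumptions made for the example. The geocoder is passed in as a callable so the logic can be exercised without network access.

```python
import pandas as pd


def clean_results(raw: pd.DataFrame) -> pd.DataFrame:
    """Drop incomplete rows, combine the name columns, and add total minutes."""
    df = raw.dropna().copy()  # assumption: any missing field disqualifies a row
    df["name"] = df["first_name"].str.strip() + " " + df["last_name"].str.strip()
    # "H:MM:SS" strings -> elapsed minutes as a float
    df["total_minutes"] = pd.to_timedelta(df["finish_time"]).dt.total_seconds() / 60
    return df.drop(columns=["first_name", "last_name"])


def add_coordinates(df: pd.DataFrame, geocode) -> pd.DataFrame:
    """Add latitude/longitude columns by looking up each distinct "City, State"."""
    place = df["city"] + ", " + df["state"]
    coords = {}
    for query in place.unique():  # geocode each place once, not once per runner
        location = geocode(query)
        if location is None:  # no match found
            coords[query] = (float("nan"), float("nan"))
        else:
            coords[query] = (location.latitude, location.longitude)
    out = df.copy()
    out["latitude"] = place.map(lambda q: coords[q][0])
    out["longitude"] = place.map(lambda q: coords[q][1])
    return out


# Hypothetical usage (needs geopy installed and network access):
#   from geopy.geocoders import Nominatim
#   geocode = Nominatim(user_agent="marathon-data-analysis").geocode
#   df = add_coordinates(clean_results(pd.read_csv("raw_results.csv")), geocode)
#   df.to_csv("marathon_clean.csv", index=False)
```

Looking up each distinct place once keeps the number of geocoding requests small, which matters with rate-limited services such as Nominatim.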

---

## Key Features
- Cleaned raw marathon race data (removed missing values, combined name fields, computed total race time in minutes).
- Enriched data with geographic coordinates (latitude, longitude).
- Saved processed data to CSV for further analysis.
- Wrote SQL queries for:
  - Counting distinct states
  - Calculating average race times by gender
  - Finding age range by gender
  - Grouping average times by age bucket
  - Ranking top 3 finishers per gender
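
The five queries listed above could look roughly like the sketch below. It runs them through Python's built-in `sqlite3` module against an in-memory table so it is self-contained. The project's actual database engine is not stated, so SQLite is a stand-in here. The table name `results`, its columns, the age-bucket boundaries, and the sample rows are assumptions made for illustration, not the project's real schema or data. `RANK()` needs SQLite 3.25 or newer.

```python
import sqlite3

QUERIES = {
    "distinct_states": "SELECT COUNT(DISTINCT state) FROM results",
    "avg_time_by_gender": """
        SELECT gender, AVG(total_minutes) AS avg_minutes
        FROM results GROUP BY gender ORDER BY gender""",
    "age_range_by_gender": """
        SELECT gender, MIN(age) AS youngest, MAX(age) AS oldest
        FROM results GROUP BY gender ORDER BY gender""",
    # CTE assigns each runner to a bucket, then the outer query aggregates
    "avg_time_by_age_bucket": """
        WITH bucketed AS (
            SELECT CASE WHEN age < 30 THEN 'under 30'
                        WHEN age < 40 THEN '30-39'
                        WHEN age < 50 THEN '40-49'
                        ELSE '50+' END AS age_bucket,
                   total_minutes
            FROM results)
        SELECT age_bucket, AVG(total_minutes) AS avg_minutes
        FROM bucketed GROUP BY age_bucket ORDER BY age_bucket""",
    # RANK() restarts at 1 for each gender; ties share a place
    "top3_per_gender": """
        WITH ranked AS (
            SELECT name, gender, total_minutes,
                   RANK() OVER (PARTITION BY gender
                                ORDER BY total_minutes) AS place
            FROM results)
        SELECT gender, place, name, total_minutes
        FROM ranked WHERE place <= 3 ORDER BY gender, place, name""",
}

# Made-up rows: (name, gender, age, state, total_minutes)
SAMPLE_ROWS = [
    ("Ann Lee", "F", 28, "MA", 185.5),
    ("Bea Cho", "F", 35, "TX", 200.0),
    ("Cat Diaz", "F", 41, "MA", 210.0),
    ("Dee Park", "F", 52, "NY", 230.0),
    ("Bob Ray", "M", 33, "TX", 175.0),
    ("Cal Fox", "M", 45, "NY", 190.0),
]


def run_queries(rows):
    """Load rows into an in-memory table and return each query's result rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE results "
        "(name TEXT, gender TEXT, age INTEGER, state TEXT, total_minutes REAL)"
    )
    conn.executemany("INSERT INTO results VALUES (?, ?, ?, ?, ?)", rows)
    out = {key: conn.execute(sql).fetchall() for key, sql in QUERIES.items()}
    conn.close()
    return out


if __name__ == "__main__":
    for key, result in run_queries(SAMPLE_ROWS).items():
        print(key, result)
```

Because `RANK()` gives tied finishers the same place, the top-3 query can return more than three rows per gender when times are equal. `ROW_NUMBER()` would cap it at exactly three.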

---