https://github.com/benhg/homerun-distance-tests
- Host: GitHub
- URL: https://github.com/benhg/homerun-distance-tests
- Owner: benhg
- Created: 2019-07-26T01:38:00.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2019-07-26T02:33:58.000Z (over 5 years ago)
- Last Synced: 2024-11-01T22:43:33.105Z (2 months ago)
- Language: Python
- Size: 930 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Baseball distance comparison
## What is this?
I saw a tweet from Chuck Garfien [here](https://twitter.com/ChuckGarfien/status/1154546316153970688) (follow him if you're not already) which read:
"Nelson Cruz just hit the baseball on the left field concourse. Ball has been jumping off his bat this year, especially here in Chicago."
I was intrigued by this. I wanted to know whether there really was a significant difference between homerun distances this year in Chicago and everywhere else. I decided that the appropriate tool was probably the two-sample Kolmogorov-Smirnov (KS) test, a statistical test used to determine whether there's a "significant" difference between the distributions of two samples. This code helps answer that question.
## Data source
I exported the data from [Baseball Savant](https://baseballsavant.mlb.com/statcast_search?hfPT=&hfAB=home%5C.%5C.run%7C&hfBBT=&hfPR=hit%5C.%5C.into%5C.%5C.play%5C.%5C.score%7C&hfZ=&stadium=17&hfBBL=&hfNewZones=&hfGT=R%7C&hfC=&hfSea=2019%7C&hfSit=&player_type=batter&hfOuts=&opponent=&pitcher_throws=&batter_stands=&hfSA=&game_date_gt=&game_date_lt=&hfInfield=&team=&position=&hfOutfield=&hfRO=&home_road=&hfFlag=&hfPull=&metric_1=&hfInn=&min_pitches=0&min_results=0&group_by=name&sort_col=pitches&player_event_sort=h_launch_speed&sort_order=desc&min_pas=0#results). I asked for all homeruns, then all homeruns at The Cell (I call it 'downward arrow park' after the rebrand), then finally all homeruns from Wrigley Field.
## Results
I computed the mean distances of homeruns in Chicago versus elsewhere, and found no major differences. Here are the results: ![here](means.png). Homeruns hit in Chicago traveled 401 ft on average, compared to the MLB-wide average of 400 ft.
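For illustration, here is a minimal sketch of how a mean distance could be computed from one exported CSV. It is not necessarily the script in this repo. It assumes the export has a distance column named `hit_distance_sc` (my assumption about the Statcast export format; adjust it to whatever your file contains), and the rows below are made up rather than real data.

```python
import csv
import io
from statistics import mean


def load_distances(csv_file, column="hit_distance_sc"):
    """Return the homerun distances (in feet) found in one CSV export.

    The column name is an assumption about the export format.
    """
    distances = []
    for row in csv.DictReader(csv_file):
        try:
            distances.append(float(row[column]))
        except (KeyError, TypeError, ValueError):
            continue  # no usable distance recorded for this row
    return distances


# Made-up rows standing in for a real export; not actual Statcast data.
sample_csv = io.StringIO(
    "player_name,hit_distance_sc\n"
    "Hypothetical A,410\n"
    "Hypothetical B,\n"
    "Hypothetical C,390\n"
)
distances = load_distances(sample_csv)
print(f"mean distance: {mean(distances):.1f} ft over {len(distances)} homeruns")
```

With real files, you would pass `open("some_export.csv", newline="")` (a hypothetical filename) instead of the in-memory sample, once per export.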
Then, I made a relative-frequency histogram to visually compare. The distributions look pretty similar. Here's the graph: ![here](hr_distributions.png)
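Relative frequencies matter here because the Chicago sample is much smaller than the league-wide one, so raw counts would not be comparable. The sketch below shows one way to compute the bar heights using only the standard library. The 20 ft bin width and the distances are made up for illustration, and the actual plot in this repo may have been produced differently.

```python
from collections import Counter

BIN_WIDTH = 20  # feet; an arbitrary choice for this sketch


def relative_frequencies(distances, bin_width=BIN_WIDTH):
    """Map each bin's lower edge (ft) to the share of homeruns in that bin."""
    counts = Counter(int(d // bin_width) * bin_width for d in distances)
    total = len(distances)
    return {edge: counts[edge] / total for edge in sorted(counts)}


# Made-up distances, not real data.
chicago = [385, 402, 415, 398, 431, 377]
league = [390, 405, 412, 399, 428, 380, 441, 366, 409, 395]

for label, sample in (("Chicago", chicago), ("MLB", league)):
    print(label)
    for edge, share in relative_frequencies(sample).items():
        print(f"  {edge}-{edge + BIN_WIDTH} ft: {share:.2f}")
```

Because each sample's shares sum to 1, the two sets of bars can be overlaid on the same axes regardless of sample size.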
Finally, I did a KS test to check whether the two distributions were statistically significantly different. The usual threshold for tests like this is p <= 0.05. In this case, the null hypothesis is that both samples come from the same distribution. With a p-value of 0.8889765283383031, we fail to reject the null hypothesis: the data give no evidence that the two distributions differ.
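To make the test concrete, here is a rough standard-library sketch of a two-sample KS test. The statistic is the largest gap between the two empirical CDFs, and the p-value uses the asymptotic Kolmogorov series, which is only an approximation for small samples. In practice `scipy.stats.ks_2samp` is the usual routine for this; I am not claiming this sketch matches what produced the number above, and the data below is made up.

```python
import math
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Largest vertical gap between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    gap = 0.0
    for x in a + b:  # the gap can only change at observed values
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap


def ks_p_value(d, n_a, n_b, terms=100):
    """Approximate two-sided p-value from the asymptotic Kolmogorov series."""
    lam = d * math.sqrt(n_a * n_b / (n_a + n_b))
    if lam < 0.2:
        return 1.0  # the series converges slowly here; the true value is ~1
    total = sum(
        (-1) ** (j - 1) * math.exp(-2 * j * j * lam * lam)
        for j in range(1, terms + 1)
    )
    return min(1.0, max(0.0, 2 * total))


# Made-up distances, not real data.
chicago = [385, 402, 415, 398, 431, 377]
league = [390, 405, 412, 399, 428, 380, 441, 366, 409, 395]

d = ks_statistic(chicago, league)
p = ks_p_value(d, len(chicago), len(league))
verdict = "reject" if p <= 0.05 else "fail to reject"
print(f"D = {d:.3f}, p = {p:.3f}: {verdict} the null hypothesis")
```

A large p-value like the one reported above corresponds to a small gap between the two CDFs relative to the sample sizes.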
## Addendum
I will probably test between 2018 and 2019 as well. Will add a note here when I do that.