https://github.com/llimllib/bostonmarathon
Almost complete data dumps from the Boston Marathon, 2001-2014
- Host: GitHub
- URL: https://github.com/llimllib/bostonmarathon
- Owner: llimllib
- License: mit
- Created: 2014-04-26T18:07:38.000Z (over 10 years ago)
- Default Branch: master
- Last Pushed: 2022-09-29T12:14:09.000Z (over 2 years ago)
- Last Synced: 2024-11-30T02:50:26.295Z (about 1 month ago)
- Language: Jupyter Notebook
- Homepage:
- Size: 54.3 MB
- Stars: 30
- Watchers: 7
- Forks: 14
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
Boston Marathon Raw Data
==================================================

This repository contains, as close as I can manage, all of the data on the Boston Marathon
available from baa.org. It also contains a python notebook for exploration of that data.

The Data
--------------------------------------

Look in the results/{year}/results.csv files for the data. Do something interesting with
it, and make sure you [tell me about it]([email protected])!

Format
--------------------------------------

There are (unfortunately) two different data formats. 2013 and 2014 have more detailed
timing data, with splits at 10k, 20k, 25k, half, 30k, 35k, and 40k.

Pre-2013, the data has only the finishing time, but adds the person's standing in their
division, gender, and overall.

Caveats
--------------------------------------

* The data includes wheelchair racers but not hand cyclists or other special groups...
if you're interested in that data please submit a pull request!
* The data does not include runners who did not finish. There's nothing I can
do about that; as far as I can tell, that data is unavailable from baa.org.
* The data is certainly missing a few people, but it ought to contain the large
majority of runners who finished from each year.
* The code is ugly. This is just about grinding the results out!

Visualizations
--------------------------------------

* @tmcw
  * [heatmap of finish times](http://bl.ocks.org/tmcw/11376778/d39142fc73e14097fad33d50e75366d197b6c2a3)
  * [which years did people PR?](http://bl.ocks.org/tmcw/raw/11385055/)
* me
  * ![Violin plot of finish times 2001-2014](https://raw.githubusercontent.com/llimllib/bostonmarathon/master/images/times_violin.png)
  * [histogram of finishers by gender and age](https://pbs.twimg.com/media/BmH86ZHCQAEay54.png:large)

License
--------------------------------------

MIT License. Use it as you want to, don't feel obligated to give me credit. It's the BAA's
data anyway. (Thanks for organizing, BAA)

Downloading The Data
--------------------------------------

I... already did that for you. Why do you want to do that?
Anyway, if you do, you'll want to run `python multidl.py {year}`
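
If you really do want every year, here is a minimal sketch that loops the command above over 2001-2014. It assumes `multidl.py` sits in the current directory and takes a single year argument, as the command above suggests; the helper names `download_command` and `download_years` are made up for illustration and are not part of this repository. It defaults to a dry run that only prints the commands, so nothing hits baa.org unless you pass `dry_run=False`.

```python
import subprocess
import sys


def download_command(year, script="multidl.py"):
    """Build the argv list for one year (hypothetical helper, not in the repo)."""
    # sys.executable is the Python interpreter currently running this script.
    return [sys.executable, script, str(year)]


def download_years(years, dry_run=True):
    """Print, or actually run, the download command for each year in turn."""
    commands = [download_command(year) for year in years]
    for cmd in commands:
        if dry_run:
            # Only show what would be executed.
            print(" ".join(cmd))
        else:
            # check=True raises CalledProcessError if the script exits non-zero.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # 2001-2014 is the range of years this repository covers.
    download_years(range(2001, 2015))
```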

Viewing the notebook
--------------------------------------

1. Install the prerequisites: `pip install -r requirements.txt`
2. Start the notebook: `make notebook`
3. Play!
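
If you'd rather poke at the CSVs outside the notebook, here is a minimal sketch using only the standard library. It assumes each `results/{year}/results.csv` starts with a header row and is UTF-8 encoded; neither is confirmed here, and the function names are made up for illustration. It only prints the row count and column names per year, which is a quick way to see which of the two formats a given year uses.

```python
import csv
from pathlib import Path


def results_path(year, root="."):
    """Path to one year's CSV, following the results/{year}/results.csv layout."""
    return Path(root) / "results" / str(year) / "results.csv"


def load_results(year, root="."):
    """Return (column_names, rows) for one year; each row is a dict of strings.

    Assumes the first line of the file is a header row.
    """
    with results_path(year, root).open(newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        rows = list(reader)
        # fieldnames is None for an empty file, so fall back to an empty list.
        columns = list(reader.fieldnames or [])
    return columns, rows


if __name__ == "__main__":
    for year in range(2001, 2015):
        path = results_path(year)
        if not path.exists():
            print(f"{year}: {path} not found, skipping")
            continue
        columns, rows = load_results(year)
        print(f"{year}: {len(rows)} rows, columns: {', '.join(columns)}")
```

Run it from the repository root so the relative `results/` path resolves.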