https://github.com/papagorgio23/rufus_analyst
This repo contains the data challenge and answer for Rufus Peabody's Analyst position from 2019
- Host: GitHub
- URL: https://github.com/papagorgio23/rufus_analyst
- Owner: papagorgio23
- License: mit
- Created: 2020-12-23T06:09:21.000Z (almost 5 years ago)
- Default Branch: main
- Last Pushed: 2021-05-01T06:37:59.000Z (over 4 years ago)
- Last Synced: 2025-01-29T12:12:09.246Z (9 months ago)
- Language: HTML
- Size: 9.24 MB
- Stars: 2
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Rufus_Analyst
This repo contains the 2019 data challenge for Rufus Peabody's Analyst position and the answers to it.
1. Using data from this webpage alone (https://www.pro-football-reference.com/years/2019/) - do not
use any data that can be found only through links on this webpage - build a model which you'll use to
determine the probability that the home team wins, and to predict the final score of the game, for each
game for week 15 (http://www.nfl.com/schedules/2019/REG15). Send us model details and prediction
results. How confident are you in your predictions? If you wanted to get a more quantitative
measure of your confidence in your predictions, what might you do?
2. Using the data in the attached file (‘cfb_games_for_ml_task’), and no other data, you are tasked with
coming up with win probability forecasts for future college football games based on the point spread
(and total, if you so desire).
3. You need to forecast NFL quarterback (QB) performance. Using the data in the attached file
('qb_by_game' file), and no other data, forecast a number for yards_adj using previous yards_adj and
any other variables in the dataset you think are relevant. Think about how to properly regress the QB
to a relevant “mean” without introducing survivorship bias.
4. You are interested in building profitable gambling models in soccer, despite having no experience
with soccer data and little knowledge of the sport. A friend recommends this paper to you, which you
read: https://arxiv.org/pdf/1802.07127.pdf
Now it's time to create your own team and player ratings, predict game outcomes, etc., with an eye to
making profitable bets. Do you use the framework used in this paper? Why or why not? If not,
what framework do you use?
5. Give an example of a project you've done where you've had to clean, organize, and combine multiple
datasets. Think about something you’ve done that would be relevant if you were putting together a
database of college, G League, international and NBA players, recognizing that some players end up
spending some time in all of those four places, have many seasons of data associated with them, and
also have non game playing metrics that might be of interest.
6. What experience, if any, do you have working with player location (coordinates) data?
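
Questions 2 and 3 each point at a standard technique, and a small sketch may help make them concrete. The code below is not the answer submitted in this repo; it is a minimal illustration under stated assumptions. For question 2 it treats the final margin as roughly normal around the spread-implied margin, and for question 3 it shrinks a QB's observed average toward a reference mean with a weight that grows with games played. The function names, the sign convention for the spread, the `margin_sd` default of 14.0, and the `prior_weight` parameter are all hypothetical placeholders that would have to be estimated from `cfb_games_for_ml_task` and `qb_by_game`.

```python
import math


def home_win_probability(home_spread: float, margin_sd: float = 14.0) -> float:
    """Approximate P(home team wins) from the point spread.

    home_spread: spread from the home side (ASSUMED convention:
        negative means the home team is favored).
    margin_sd: ASSUMED standard deviation of the final margin around
        the spread-implied margin; 14.0 is a placeholder, not a value
        estimated from the challenge data.
    """
    expected_margin = -home_spread  # a -7 spread implies a 7-point edge
    z = expected_margin / margin_sd
    # Standard normal CDF expressed with math.erf (ties are ignored).
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def shrink_to_mean(player_mean: float, n_games: int,
                   prior_mean: float, prior_weight: float) -> float:
    """Regress a player's observed average toward a reference mean.

    prior_weight acts like a number of "phantom" games played at
    prior_mean; both values are hypothetical inputs here. Which
    population defines prior_mean is the analyst's choice and is
    where survivorship bias can enter.
    """
    total_weight = n_games + prior_weight
    return (n_games * player_mean + prior_weight * prior_mean) / total_weight


if __name__ == "__main__":
    # Illustrative inputs only; these are not values from the datasets.
    print(home_win_probability(-7.0))
    print(shrink_to_mean(player_mean=300.0, n_games=8,
                         prior_mean=250.0, prior_weight=8.0))
```

With few games the shrunk estimate stays close to `prior_mean`, and it moves toward the player's own average as `n_games` grows. A fitted alternative for question 2, such as a logistic regression of the game outcome on spread and total, would replace the fixed `margin_sd` with parameters learned from the data.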
TODO: Run instructions