{"id":19689999,"url":"https://github.com/nflverse/nflfastr","last_synced_at":"2025-04-13T01:59:32.145Z","repository":{"id":38025809,"uuid":"258836530","full_name":"nflverse/nflfastR","owner":"nflverse","description":"A Set of Functions to Efficiently Scrape NFL Play by Play Data","archived":false,"fork":false,"pushed_at":"2025-03-31T15:13:23.000Z","size":879668,"stargazers_count":444,"open_issues_count":5,"forks_count":53,"subscribers_count":23,"default_branch":"master","last_synced_at":"2025-04-13T01:59:25.429Z","etag":null,"topics":["american-football","cran","cran-r","football-data","nfl","nflstats","nflverse","r","r-package","sports-analytics"],"latest_commit_sha":null,"homepage":"https://www.nflfastr.com/","language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/nflverse.png","metadata":{"files":{"readme":"README.Rmd","changelog":"NEWS.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-04-25T17:40:45.000Z","updated_at":"2025-04-11T13:44:59.000Z","dependencies_parsed_at":"2023-10-10T19:01:35.267Z","dependency_job_id":"cd503ad9-c71e-49d9-9282-1ae4c798d5a8","html_url":"https://github.com/nflverse/nflfastR","commit_stats":{"total_commits":977,"total_committers":8,"mean_commits":122.125,"dds":"0.26407369498464683","last_synced_commit":"1741e871f4104362e0d96c182d22e02fa7ea31c2"},"previous_names":[],"tags_count":29,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nflverse%2FnflfastR","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nflverse%2FnflfastR/tags","releases_url":"https://repos.ecos
yste.ms/api/v1/hosts/GitHub/repositories/nflverse%2FnflfastR/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nflverse%2FnflfastR/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/nflverse","download_url":"https://codeload.github.com/nflverse/nflfastR/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248654046,"owners_count":21140235,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["american-football","cran","cran-r","football-data","nfl","nflstats","nflverse","r","r-package","sports-analytics"],"created_at":"2024-11-11T19:04:04.203Z","updated_at":"2025-04-13T01:59:32.122Z","avatar_url":"https://github.com/nflverse.png","language":"R","funding_links":[],"categories":[],"sub_categories":[],"readme":"---\noutput: github_document\n---\n\n\u003c!-- README.md is generated from README.Rmd. 
Please edit that file --\u003e\n\n```{r, include = FALSE}\nknitr::opts_chunk$set(\n  collapse = TRUE,\n  comment = \"#\u003e\",\n  fig.path = \"man/figures/readme-\"\n)\n```\n\n# **nflfastR** \u003cimg src=\"man/figures/logo.png\" align=\"right\" width=\"25%\" min-width=\"120px\"/\u003e\n\n\n\u003c!-- badges: start --\u003e\n[![CRAN status](https://www.r-pkg.org/badges/version-last-release/nflfastR)](https://CRAN.R-project.org/package=nflfastR)\n[![CRAN downloads](https://cranlogs.r-pkg.org/badges/grand-total/nflfastR)](https://CRAN.R-project.org/package=nflfastR)\n[![Dev status](https://img.shields.io/github/r-package/v/nflverse/nflfastR/master?label=dev%20version\u0026style=flat-square\u0026logo=github)](https://www.nflfastr.com/)\n[![R build status](https://img.shields.io/github/actions/workflow/status/nflverse/nflfastR/R-CMD-check.yaml?label=R%20check\u0026style=flat-square\u0026logo=github)](https://github.com/nflverse/nflfastR/actions)\n[![Lifecycle: stable](https://img.shields.io/badge/lifecycle-stable-brightgreen.svg)](https://lifecycle.r-lib.org/articles/stages.html#stable)\n[![nflverse support](https://img.shields.io/discord/789805604076126219?color=7289da\u0026label=nflverse%20support\u0026logo=discord\u0026logoColor=fff\u0026style=flat-square)](https://discord.com/invite/5Er2FBnnQa)\n\u003c!-- [![Twitter Follow](https://img.shields.io/twitter/follow/nflfastR.svg?style=social)](https://twitter.com/nflfastR) --\u003e\n\u003c!-- badges: end --\u003e\n\n`nflfastR` is a set of functions to efficiently scrape NFL play-by-play data. 
`nflfastR` expands upon the features of nflscrapR:\n  \n* The package contains NFL play-by-play data back to 1999\n* As suggested by the package name, it obtains games **much** faster\n* Includes completion probability (`cp`), completion percentage over expected (`cpoe`), and expected yards after the catch (`xyac_epa` and `xyac_mean_yardage`) in play-by-play going back to 2006\n* Includes drive information, including drive starting position and drive result\n* Includes series information, including series number and series success\n* Hosts [a release of play-by-play data going back to 1999](https://github.com/nflverse/nflverse-data/releases/tag/pbp) for very quick access\n* Features models for Expected Points, Win Probability, Completion Probability, and Yards After the Catch (see section below)\n* Includes a function `update_db()` that creates and updates a database\n\nWe owe a debt of gratitude to the original [`nflscrapR`](https://github.com/maksimhorowitz/nflscrapR) team, Maksim Horowitz, Ronald Yurko, and Samuel Ventura, without whose contributions and inspiration this package would not exist.\n\n\n## Installation\n\nThe easiest way to get nflfastR is to install it from [CRAN](https://cran.r-project.org/package=nflfastR) with:\n\n```{r, eval=FALSE}\ninstall.packages(\"nflfastR\")\n```\n\nTo get a bug fix or to use a feature from the development version, you can install the development version of nflfastR either from [GitHub](https://github.com/nflverse/nflfastR/) with:\n\n``` {r eval = FALSE}\nif (!require(\"pak\")) install.packages(\"pak\")\npak::pak(\"nflverse/nflfastR\")\n```\n\nor prebuilt from the [development repo](https://nflverse.r-universe.dev) with:\n\n```{r eval = FALSE}\ninstall.packages(\"nflfastR\", repos = c(\"https://nflverse.r-universe.dev\", getOption(\"repos\")))\n```\n\n## Usage\n\nWe have provided some application examples in the **[Getting Started](https://www.nflfastr.com/articles/nflfastR.html)** article. 
However, these require a basic knowledge of R. For this reason we have the **[nflfastR beginner's guide](https://www.nflfastr.com/articles/beginners_guide.html)**, which we recommend to all those who are looking for an introduction to nflfastR with R.\n\nYou can find column names and descriptions in the **[Field Descriptions](https://www.nflfastr.com/articles/field_descriptions.html)** article, or by accessing the `field_descriptions` dataframe from the package.\n\n## Data access\n\nEven though `nflfastR` is very fast, **we recommend downloading the data from [here](https://github.com/nflverse/nflverse-data/releases/tag/pbp) or using the `nflreadr` package**. These data sets include play-by-play data of complete seasons going back to 1999 and are updated nightly during the season. The files contain both regular season and postseason data, and one can use game_type or week to figure out which games occurred in the postseason.\n\n## nflfastR models\n\n`nflfastR` uses its own models for Expected Points, Win Probability, Completion Probability, and Expected Yards After the Catch. To read about the models, please see [this post on Open Source Football](https://opensourcefootball.com/posts/2020-09-28-nflfastr-ep-wp-and-cp-models/). For a more detailed description of the motivation for Expected Points models, we highly recommend this paper [from the nflscrapR team located here](http://arxiv.org/pdf/1802.00998). \n\nHere is a visualization of the Expected Points model by down and yardline.\n\n``` {r epa-model, warning = FALSE, message = FALSE, results = 'hide', fig.keep = 'all', dpi = 600, echo=FALSE, eval = FALSE}\n\n# This code was used to create the ep model image. 
Since we don't want to include \n# the resulting png file in the package for file size reasons it was uploaded to\n# the nflfastR repo and embedded remotely with the next chunk\n\nlibrary(tidyverse)\n\ndf \u003c- nflreadr::load_pbp(2014:2019) |\u003e\n        filter(!is.na(posteam) \u0026 !is.na(ep), !is.na(down)) |\u003e\n        select(ep, down, yardline_100, air_yards, pass_location, cp)\n\ndf |\u003e\n  ggplot(aes(x = yardline_100, y = ep, color = as.factor(down))) + \n  geom_smooth(size = 2) + \n  labs(x = \"Yards from opponent's end zone\",\n       y = \"Expected points value\",\n       color = \"Down\",\n       title = \"Expected Points by Yardline and Down\") +\n  theme_bw() + \n  scale_y_continuous(expand=c(0,0), breaks = scales::pretty_breaks(10)) + \n  scale_x_continuous(expand=c(0,0), breaks = seq(from = 5, to = 95, by = 10)) +\n  theme(\n    plot.title = element_text(size = 18, hjust = 0.5),\n    plot.subtitle = element_text(size = 16, hjust = 0.5),\n    axis.title = element_text(size = 18),\n    axis.text = element_text(size = 16),\n    legend.text = element_text(size = 16),\n    legend.title = element_text(size = 16),\n    legend.position = c(.90, .80)) +\n    annotate(\"text\", x = 14, y = -2.2, size = 3, label = \"2014-2019 | Model: @nflfastR\")\n```\n\n```{r echo=FALSE, fig.align='center', fig.cap='', out.width='100%'}\nknitr::include_graphics('man/figures/readme-epa-model-1.png')\n```\n\nHere is a visualization of the Completion Probability model by air yards and pass direction.\n\n``` {r cp-model, warning = FALSE, message = FALSE, results = 'hide', fig.keep = 'all', dpi = 600, echo=FALSE, eval = FALSE}\n\n# This code was used to create the cp model image. 
Since we don't want to include \n# the resulting png file in the package for file size reasons it was uploaded to\n# the nflfastR repo and embedded remotely with the next chunk\n\ndf |\u003e\n  filter(!is.na(cp), between(air_yards, -5, 45)) |\u003e\n  mutate(pass_middle = if_else(pass_location == \"middle\", \"Yes\", \"No\")) |\u003e\n  ggplot(aes(x = air_yards, y = cp, color = as.factor(pass_middle))) + \n  geom_smooth(size = 2) + \n  labs(x = \"Air yards\",\n       y = \"Expected completion %\",\n       color = \"Pass middle\",\n       title = \"Expected Completion % by Air Yards and Pass Direction\") +\n  theme_bw() + \n  scale_y_continuous(expand=c(0,0), breaks = scales::pretty_breaks(5)) + \n  scale_x_continuous(expand=c(0,0)) +\n  theme(\n    plot.title = element_text(size = 18, hjust = 0.5),\n    plot.subtitle = element_text(size = 16, hjust = 0.5),\n    axis.title = element_text(size = 18),\n    axis.text = element_text(size = 16),\n    legend.text = element_text(size = 16),\n    legend.title = element_text(size = 16),\n    legend.position = c(.80, .80)) +\n    annotate(\"text\", x = 2, y = .32, size = 3, label = \"2014-2019 | Model: @nflfastR\")\n```\n\n```{r echo=FALSE, fig.align='center', fig.cap='', out.width='100%'}\nknitr::include_graphics('man/figures/readme-cp-model-1.png')\n```\n\n`nflfastR` includes two win probability models: one with and one without incorporating the pre-game spread.\n\n## Special thanks\n\n* To Nick Shoemaker for [finding and making available JSON-formatted NFL play-by-play back to 1999](https://github.com/CroppedClamp/nfl_pbps) (`nflfastR` uses this source for 1999 and 2000 and previously also used it for 2001-2010)\n* To Lau Sze Yui for developing a scraping function to access JSON-formatted NFL play-by-play beginning in 2001\n* To Aaron Schatz and [FTN Fantasy](https://ftnfantasy.com/dvoa/nfl) for providing charting data to correctly mark scrambles in the 1999-2005 seasons\n* To Lee Sharpe for curating a resource for game 
information\n* To Timo Riske, Lau Sze Yui, Sean Clement, and Daniel Houston for many helpful discussions regarding the development of the new `nflfastR` models\n* To Zach Feldman and Josh Hermsmeyer for many helpful discussions about CPOE models, and to Peter Owen for many helpful suggestions for the CP model\n* To Florian Schmitt for the logo design\n* To the many users who found and reported bugs in `nflfastR` 1.0\n* And of course, to the original [`nflscrapR`](https://github.com/maksimhorowitz/nflscrapR) team, Maksim Horowitz, Ronald Yurko, and Samuel Ventura, whose work represented a dramatic step forward for the state of public NFL research\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnflverse%2Fnflfastr","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnflverse%2Fnflfastr","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnflverse%2Fnflfastr/lists"}