{"id":20942474,"url":"https://github.com/bdilday/marcelr","last_synced_at":"2025-07-26T10:35:07.805Z","repository":{"id":78136186,"uuid":"84767473","full_name":"bdilday/marcelR","owner":"bdilday","description":"Marcel projections","archived":false,"fork":false,"pushed_at":"2020-03-02T01:50:27.000Z","size":50840,"stargazers_count":17,"open_issues_count":2,"forks_count":4,"subscribers_count":5,"default_branch":"master","last_synced_at":"2025-05-14T00:10:37.414Z","etag":null,"topics":["baseball","baseball-analysis-packages","marcel-projections","r","sabermetrics"],"latest_commit_sha":null,"homepage":null,"language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/bdilday.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-03-13T00:21:54.000Z","updated_at":"2024-03-19T20:34:21.000Z","dependencies_parsed_at":"2023-02-26T11:30:17.023Z","dependency_job_id":null,"html_url":"https://github.com/bdilday/marcelR","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/bdilday/marcelR","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bdilday%2FmarcelR","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bdilday%2FmarcelR/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bdilday%2FmarcelR/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bdilday%2FmarcelR/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/bdilday","download_url":"https://codelo
ad.github.com/bdilday/marcelR/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bdilday%2FmarcelR/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":267150480,"owners_count":24043473,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-07-26T02:00:08.937Z","response_time":62,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["baseball","baseball-analysis-packages","marcel-projections","r","sabermetrics"],"created_at":"2024-11-18T23:27:19.762Z","updated_at":"2025-07-26T10:35:07.797Z","avatar_url":"https://github.com/bdilday.png","language":"R","funding_links":[],"categories":[],"sub_categories":[],"readme":"# marcelR\n\nThis package generates Marcel projections, using data from the `Lahman` package. \n\n## Brief introduction to Marcels\n\nMarcel is a projection system for baseball, first developed by Tom Tango. It is often described as the most basic projection system. It weights a player's last three seasons, regresses this to the mean, and applies an age adjustment.\n\n## Installing \n\nSince this is not on CRAN, it needs to be installed from github,\n\n``` {r}\n\u003e library(devtools)\n\u003e install_github('bdilday/marcelR')\n\u003e library(marcelR)\n```\n\nAs of this writing the `Lahman` package has not been updated to include the 2017+ seasons. 
I created an updated version, however, that can be used for generating projections for the 2018-2020 seasons. It can be installed from github as well,\n\n``` {r}\n\u003e install_github('bdilday/Lahman')\n\u003e max(Lahman::Batting$yearID)\n[1] 2019\n```\n\n## Marcel data\n\nThis package includes the marcels as a set of data frames.\n\n``` {r}\n\u003e data(marcels)\n\u003e names(marcels)\n[1] \"Pitching\" \"Batting\"  \"Teams\"   \n\n\u003e nrow(marcels$Batting)\n[1] 56622\n\n\u003e nrow(marcels$Pitching)\n[1] 42777\n\n\u003e nrow(marcels$Teams)\n[1] 2877\n\n```\n\n### Batting\n\nHere's an example of a projection, illustrated with Carlos Beltran for 2004.\n\n``` {r}\n\u003e library(dplyr)\n\u003e marcels$Batting %\u003e% \n     filter(yearID==2004, playerID=='beltrca01') %\u003e% \n     print.data.frame()\n   playerID yearID proj_pa    X1B      X2B      X3B       HR       BB      HBP       SB\n1 beltrca01   2004   573.2 93.154 25.37286 7.483096 22.70545 58.44974 3.351604 29.40546\n        CS      SO       SH       SF\n1 3.721516 92.3098 1.338349 5.604223\n```\n\nThe highest projected HR,\n\n``` {r}\n\u003e marcels$Batting %\u003e% arrange(-HR) %\u003e% select(playerID, yearID, HR)\n# A tibble: 54,730 × 3\n    playerID yearID       HR\n      \u003cfctr\u003e  \u003cdbl\u003e    \u003cdbl\u003e\n1  mcgwima01   1999 52.27053\n2  mcgwima01   2000 51.93934\n3  bondsba01   2002 49.70076\n4   sosasa01   2002 48.08998\n5  mcgwima01   1998 47.17453\n6   sosasa01   2001 46.87696\n7   sosasa01   2000 46.38852\n8  griffke02   1999 45.72925\n9  bondsba01   2003 45.71398\n10  sosasa01   2003 43.79848\n# ... 
with 54,720 more rows\n```\n\n### Pitching\n\nLowest projected RA9 since 1950.\n\n``` {r}\n\u003e marcels$Pitching %\u003e% \n   mutate(RA9=27*R/proj_pt) %\u003e% \n   arrange(RA9) %\u003e% \n   filter(yearID\u003e=1950) %\u003e% \n   select(playerID, yearID, RA9)\n   # A tibble: 29,054 × 3\n    playerID yearID      RA9\n      \u003cfctr\u003e  \u003cdbl\u003e    \u003cdbl\u003e\n1  gibsobo01   1970 2.409111\n2  goodedw01   1986 2.417237\n3  gibsobo01   1969 2.439491\n4  koufasa01   1965 2.451923\n5  koufasa01   1967 2.473709\n6  kershcl01   2015 2.480347\n7  koufasa01   1966 2.525069\n8  kershcl01   2016 2.525205\n9  kimbrcr01   2014 2.529273\n10 maddugr01   1996 2.530020\n# ... with 28,830 more rows\n```\n\n### Teams\n\nHighest projected winning percentage since 1913,\n``` {r}\n\u003e marcels$Teams %\u003e% filter(yearID\u003e=1913) %\u003e% arrange(-wpct) %\u003e% select(yearID, teamID, wpct)\n# A tibble: 2,290 × 3\n   yearID teamID      wpct\n    \u003cdbl\u003e \u003cfctr\u003e     \u003cdbl\u003e\n1    1940    NYA 0.6175461\n2    1928    NYA 0.6119175\n3    1952    NY1 0.6096779\n4    1913    NY1 0.6089000\n5    1953    BRO 0.6082399\n6    1934    CHN 0.6039637\n7    2017    CHN 0.6039189\n8    2004    BOS 0.6038871\n9    1921    NYA 0.6037286\n10   1941    NYA 0.6033883\n# ... with 2,280 more rows\n```\n\nAs of this writing, the Batting and Pitching stats have been updated to 2019, but the team projections have not (waiting on creation of updated rosters).\n\n## Marcel computations\n\n### Data exporting\n\nThe marcel data is exported in the `marcel_data_exporter.R` script. The low-level functions to compute the marcels are also included, however. Examples are given below.\n\n### Batting\n\nFor batting stats, the weights given to the previous three seasons are 5, 4, and 3, and the amount of regression is 100 PA. 
\n\nAn example of computing marcels for batting stats,\n\n\n``` {r}\n\u003e a \u003c- get_batting_stats()\n\u003e b \u003c- dplyr::tbl_df(marcelR:::append_previous_years(a %\u003e% filter(POS!=\"P\"), \n                                           get_seasonal_averages_batting, \n                                           previous_years = 3))\n\u003e mcl \u003c- dplyr::tbl_df(apply_marcel_batting(b, \"HR\", marcelR:::age_adjustment))\n\u003e mcl %\u003e% filter(projectedYearID==2004, playerID=='beltrca01') %\u003e% print.data.frame()\n   playerID yearID projectedYearID age_adj x_metric x_pa       x_av proj_pa metric_target\n1 beltrca01   2003            2004   1.012      318 7938 0.02867754   573.2    0.02868739\n      num denom proj_rate_raw  proj_rate proj_value metric_agg proj_value_floating\n1 352.413  9138    0.03856566 0.03902845   22.70545 0.02826497            22.37111\n  metric_multiplier\n1          1.014945\n```\n\n### Pitching\n\nFor pitching stats, the weights given to the previous three seasons are 3, 2, and 1, and the amount of regression is 134 Outs, or about 44.2 innings.\n\nAn example of computing marcels for pitching stats,\n\n``` {r}\n\u003e a \u003c- get_pitching_stats()\n\u003e b \u003c- dplyr::tbl_df(marcelR:::append_previous_years(a %\u003e% filter(POS==\"P\"), \n              get_seasonal_averages_pitching, \n              previous_years=3))\n\u003e mcl \u003c- dplyr::tbl_df(apply_marcel_pitching(b, \"R\", marcelR:::age_adjustment_reciprocal))\n\u003e mcl %\u003e% filter(projectedYearID==2017) %\u003e% mutate(RA9=27*proj_value/proj_pt) %\u003e% arrange(RA9) %\u003e% head(4) %\u003e% print.data.frame()\n   playerID yearID projectedYearID age_adj x_metric x_pt    x_lgav proj_pt metric_target\n1 kershcl01   2016            2017   1.000      259 3332 0.1607082   473.3     0.1616625\n2 brittza01   2016            2017   1.003       70 1226 0.1614394   195.2     0.1616625\n3 daviswa01   2016            2017   1.009       51 1010 0.1602990   160.2   
  0.1616625\n4 millean01   2016            2017   1.009       87 1226 0.1621735   205.0     0.1616625\n       num denom proj_rate_raw  proj_rate proj_value metric_agg proj_value_floating\n1 388.2094  4136    0.09386108 0.09386108   44.42786  0.1616501            44.42445\n2 199.7973  2030    0.09842229 0.09871756   19.27115  0.1616501            19.26967\n3 179.8804  1814    0.09916231 0.10005477   16.03000  0.1616501            16.02877\n4 217.3875  2030    0.10708744 0.10805123   22.15220  0.1616501            22.15050\n  metric_multiplier      RA9\n1          1.000077 2.534444\n2          1.000077 2.665579\n3          1.000077 2.701686\n4          1.000077 2.917607\n```\n  \n### Teams\n\nTeam win projections aren't strictly a part of the marcel specification. In this package, marcels are used in the following way to project wins.\n\n* Specify a roster of batters and pitchers. In practice this comes from the players that actually played in the subsequent season, based on `Lahman` data.\n\n* Given the assumed roster, aggregate batting and pitching stats based on projected playing time.\n\n* Apply Base Runs to estimate the number of runs scored on offense.\n\n* Use estimated RA9 from the pitching projections directly for estimating runs allowed.\n\n* Adjust the estimated runs and runs-allowed to a common number of PA.\n\n* Apply the Pythagorean win formula to these adjusted runs estimates.\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbdilday%2Fmarcelr","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbdilday%2Fmarcelr","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbdilday%2Fmarcelr/lists"}