{"id":13616977,"url":"https://github.com/christopherjenness/NBA-prediction","last_synced_at":"2025-04-14T03:32:14.111Z","repository":{"id":82958417,"uuid":"73628742","full_name":"christopherjenness/NBA-prediction","owner":"christopherjenness","description":"Predict scores of NBA games using regularized matrix completion","archived":false,"fork":false,"pushed_at":"2019-03-01T09:53:32.000Z","size":68,"stargazers_count":152,"open_issues_count":6,"forks_count":43,"subscribers_count":19,"default_branch":"master","last_synced_at":"2024-11-08T01:39:53.741Z","etag":null,"topics":["matrix-completion","nba","nba-games","nba-prediction"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/christopherjenness.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2016-11-13T16:39:25.000Z","updated_at":"2024-07-19T19:38:31.000Z","dependencies_parsed_at":null,"dependency_job_id":"529d617e-d906-4335-8b57-f0334348b432","html_url":"https://github.com/christopherjenness/NBA-prediction","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/christopherjenness%2FNBA-prediction","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/christopherjenness%2FNBA-prediction/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/christopherjenness%2FNBA-prediction/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/christopherjenness%2FNBA-prediction/manifests","owner_ur
l":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/christopherjenness","download_url":"https://codeload.github.com/christopherjenness/NBA-prediction/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248815629,"owners_count":21165957,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["matrix-completion","nba","nba-games","nba-prediction"],"created_at":"2024-08-01T20:01:35.491Z","updated_at":"2025-04-14T03:32:09.095Z","avatar_url":"https://github.com/christopherjenness.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"# NBA-prediction\n[![Coverage Status](https://coveralls.io/repos/github/christopherjenness/NBA-prediction/badge.svg?branch=master)](https://coveralls.io/github/christopherjenness/NBA-prediction?branch=master)\n\nPredicts scores of NBA games using matrix completion\n\n## The Model\nFor a given NBA game, if you could accurately predict each team's offensive rating (points per 100 possessions) and the pace of the game (possessions per game), you could estimate the final score of the game.\n\nPredicting a team's offensive rating against another team is tricky.  It depends on how good the offensive team is at scoring and how good the defending team is at defending.  Most importantly though, it depends on the specific matchups between the two teams.  This is reminiscent of recommendation systems where the recommendation depends on the type of user, the type of product, and the affinity between those two.  
Furthermore, for a given season, only some offensive ratings between teams are available (those for teams that have already played each other).  The strategy in this model is to use matrix completion techniques to estimate unseen offensive ratings.  These will be combined with pace estimations to predict final scores.\n\n## Matrix completion\n\nHere, we look at two methods for matrix completion: Maximum Margin Matrix Factorization (MMMF) and Singular Value Decomposition (SVD).\n\nHastie, Trevor, Robert Tibshirani, and Martin Wainwright. Statistical learning with sparsity: the lasso and generalizations. CRC Press, 2015.\n\n### Maximum Margin Matrix Factorization (MMMF)\n\nThe objective of MMMF is to approximate an _m_ x _n_ matrix **Z** by factoring it into\n\n![1](equations/(1).gif)\n\nwhere **A** is an _m_ x _r_ matrix and **B** is an _n_ x _r_ matrix.  Effectively, this puts a rank constraint _r_ on the approximation **M**.\n\nThis can be estimated by solving the following\n\n![2](equations/(2).gif)\n\nwhere Omega indicates that only the known values in **Z** should be taken into consideration.  Any unknown value is treated as zero.\n\nWhile intuitive, this approach has two problems.  First, this is a two-dimensional family of models indexed by _r_ (the rank of the factorization) and _lambda_ (the magnitude of regularization), which requires a lot of tuning.  Second, this optimization problem is non-convex and in practice did not find global minima when used to predict NBA offensive ratings.  Because of this, we turned to SVD.\n\n### Singular Value Decomposition Using Nuclear Norm\n\n[SVD](https://en.wikipedia.org/wiki/Singular_value_decomposition), not explained here, can be used to provide a rank-q approximation of a matrix (**Z**) by constraining the rank of the SVD (**M**).  
This amounts to the following optimization\n\n![3](equations/(3).gif)\n\nIf values are missing from **Z**, then you can constrain **M** to reproduce the known values exactly while imputing the unknown values\n\n![4](equations/(4).gif)\n\nwhere Omega is the set of known values.  However, this problem is NP-hard and also leads to overfitting since the known values are required to be predicted exactly.  Instead, you can simultaneously predict unknown values and approximate known values by solving the following optimization\n\n![5](equations/(5).gif)\n\nLike MMMF, this problem is non-convex.  However, it can be relaxed to the following convex optimization problem\n\n![6](equations/(6).gif)\n\nwhere a nuclear norm on **M**, ||**M**||\u003csub\u003e*\u003c/sub\u003e, is used. This algorithm, called soft-impute, is studied extensively in:\n\nMazumder, Rahul, Trevor Hastie, and Robert Tibshirani. \"Spectral regularization algorithms for learning large incomplete matrices.\" Journal of machine learning research 11.Aug (2010): 2287-2322.\n\n##  Example Code\n\nTo make predictions, use the following code:\n\n```python\n\u003e\u003e\u003e model = NBAModel(update=True)\n\u003e\u003e\u003e model.get_scores('PHO', 'WAS')\nPHO WAS\n92.9092883132 97.1806398788\n```\n\nwhich predicts the Suns will lose to the Wizards 93-97.  \n\nNote that scraping all the data required to run the algorithm is slow.  This only needs to be done the first time.  For subsequent models, you can use ```update=False``` to use the cached data.\n\n## Model Tuning and Test Error\n\nThe optimization strategy above is parameterized by lambda, the extent of regularization.  Using a validation set (10% of sample), we determined 25 to be the optimal value of lambda.\n\n![Imgur](http://i.imgur.com/bT7XUCJ.png)\n\nUsing lambda = 25 on a held-out test set, our model estimates a team's final score with an MSE of 6.7.  
Not bad.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fchristopherjenness%2FNBA-prediction","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fchristopherjenness%2FNBA-prediction","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fchristopherjenness%2FNBA-prediction/lists"}