{"id":21300111,"url":"https://github.com/roclark/clarktech-ncaab-predictor","last_synced_at":"2025-06-10T17:04:58.101Z","repository":{"id":40978991,"uuid":"74707929","full_name":"roclark/clarktech-ncaab-predictor","owner":"roclark","description":"A machine learning project to predict NCAA Men's Basketball outcomes","archived":false,"fork":false,"pushed_at":"2022-12-08T06:38:37.000Z","size":7202,"stargazers_count":33,"open_issues_count":7,"forks_count":8,"subscribers_count":9,"default_branch":"master","last_synced_at":"2024-05-01T15:06:49.497Z","etag":null,"topics":["basketball","basketball-stats","machine-learning","prediction","python","randomforest"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/roclark.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":".github/FUNDING.yml","license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null},"funding":{"github":["roclark"]}},"created_at":"2016-11-24T22:08:15.000Z","updated_at":"2023-11-17T15:45:49.000Z","dependencies_parsed_at":"2023-01-24T16:45:14.358Z","dependency_job_id":null,"html_url":"https://github.com/roclark/clarktech-ncaab-predictor","commit_stats":null,"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/roclark%2Fclarktech-ncaab-predictor","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/roclark%2Fclarktech-ncaab-predictor/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/roclark%2Fclarktech-ncaab-predictor/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/roclark%2Fclarktech-ncaab-predictor/manifests","own
er_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/roclark","download_url":"https://codeload.github.com/roclark/clarktech-ncaab-predictor/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":225750111,"owners_count":17518315,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["basketball","basketball-stats","machine-learning","prediction","python","randomforest"],"created_at":"2024-11-21T15:07:31.501Z","updated_at":"2024-11-21T15:07:32.352Z","avatar_url":"https://github.com/roclark.png","language":"Python","funding_links":["https://github.com/sponsors/roclark"],"categories":[],"sub_categories":[],"readme":"\u003cp align=\"center\"\u003e\n  \u003cimg src=\"assets/BasketballLogo.png\" height=\"200\" width=\"200\"\u003e\n\u003c/p\u003e\n\n# NCAAB Basketball Predictor\n![Docker Pulls](https://img.shields.io/docker/pulls/roclark/clarktech-ncaab-predictor?style=flat-square)\n\nThis tool uses machine learning to predict the outcomes of NCAA Men's\nDivision-I basketball games. Included are several algorithms which can forecast\ndifferent events, such as a daily matchup simulator, conference tournament\npredictor, and a preview of the NCAA tournament field.\n\n## Setup\nIt is _highly_ recommended to pull the latest Docker image from Docker Hub,\nas this image contains a pre-populated dataset with multiple years' worth\nof data as well as optimizations to the data which cannot be reproduced\nretroactively. 
A new image is pushed to the registry daily, so it is\nrecommended to set up a workflow which checks for newer images prior to running\none of the provided algorithms. To pull the latest image, first ensure Docker is\ninstalled on your system by following the [documentation](https://docs.docker.com/install/).\nNext, pull the latest image with:\n\n```\ndocker pull roclark/clarktech-ncaab-predictor\n```\n\nThis will download and extract the most recent image to your local machine, which\ncan be viewed with:\n\n```\n$ docker images\nREPOSITORY                          TAG      IMAGE ID       CREATED       SIZE\nroclark/clarktech-ncaab-predictor   latest   0cfaab9aa82a   4 hours ago   525MB\n```\n\n### MongoDB\nIn addition to pulling the predictor image from Docker Hub, it is\nrecommended to use MongoDB as a database to save and retrieve results for future\nuse. While this isn't a strict requirement, many of the algorithms provide\nbetter handling and verbosity when saving results into a Mongo database.\nLuckily, if a Mongo database isn't already installed and configured on your\nsystem, it is straightforward to set one up with a Docker container. Simply pull the\nlatest image from Docker Hub, then run a container in detached mode so it will\nrun persistently on the host:\n\n```\ndocker pull mongo\ndocker run -it -d mongo\n```\n\nYou now have a MongoDB instance running inside a container which can be accessed\nanywhere on the host using the default `mongodb` URL.\n\nIf you choose to skip MongoDB, you will need to add `--skip-save-to-mongodb` to\nall commands while running the application (more on usage below).\n\n## Usage\nOnce setup is complete, the tool is ready to predict NCAAB\noutcomes. 
The general usage of the application with Docker is as follows:\n\n```\ndocker run --rm -it roclark/clarktech-ncaab-predictor [options] algorithm [algorithm-specific options]\n```\n\nMore information on the usage can be retrieved with the following:\n\n```\ndocker run --rm -it roclark/clarktech-ncaab-predictor --help\n```\n\n### Daily Simulator\nThe daily simulator is designed to simulate the outcome of all games scheduled\nfor the current day. It is suggested to run this algorithm in the morning to\nretrieve a list of the scheduled games and determine which team is expected to\nwin. Sample text output is as follows:\n\n```\n$ docker run --rm -it roclark/clarktech-ncaab-predictor daily-simulation\nArmy at (4) Duke  =\u003e  (4) Duke\nGeorge Washington at (5) Virginia  =\u003e  (5) Virginia\nFlorida Gulf Coast at (10) Michigan State  =\u003e  (10) Michigan State\n```\n\nAdditional information, such as the predicted spread and further details on each\nteam, is included in the database.\n\n### Conference Simulator\nThe conference simulator will forecast the remaining schedule for a conference\nand, based on the existing conference standings, determine the final projected\nstandings as well as the likelihood that each team will earn its projected\nposition and the overall probability that it will finish first. The\nalgorithm also displays the projected number of games each team will win in the\nconference by the end of the season. This can be triggered as follows:\n\n```\ndocker run --rm -it roclark/clarktech-ncaab-predictor monte-carlo-simulation\n```\n\nThe output generated from this command is saved to a database which is required\nas a baseline for several algorithms listed below.\n\n### Conference Tournament Simulator\nThis simulator runs through each conference's post-season tournament and\npredicts the overall winner and the potential route each team takes to the\nfinals. 
To generate the initial seeds, a forecast of the final\nconference standings needs to be run prior to this algorithm using the\nConference Simulator above. Each conference has its own unique tournament format\nand is handled differently, as specified in the brackets library. Run this\nsimulation with the following:\n\n```\ndocker run --rm -it roclark/clarktech-ncaab-predictor conference-tourney-simulator\n```\n\nPrior to running the algorithm, ensure a `simulation.json` file has been\ngenerated using the Conference Simulator above.\n\n### Matchup\nA matchup between two specific teams can be simulated with the matchup\nalgorithm. This will run several games between the requested teams and determine\nthe overall winner and the expected difference in score. Due to the difference\nbetween playing at home and on the road, the results could vary depending on\nwhich team is specified as the home team. For example, the following will test a\nmatchup between Purdue and Indiana with Purdue designated as the home team:\n\n```\ndocker run --rm -it roclark/clarktech-ncaab-predictor matchup purdue indiana\n```\n\n### Power Rankings\nPower rankings can be generated for all NCAA Men's Division-I basketball teams\nto compare their performance relative to one another. This algorithm\nruns a home-and-home matchup between each pair of teams in the division and tallies the\ncollective spread for each team. After all simulations are complete, the team\nwith the highest positive spread will be the number one team overall, the\nteam with the second-highest spread will be number two, and so on. This system\nworks under the philosophy that the team which can beat the highest number of\nteams by the highest margin is the strongest team in the league. This does not\nlook specifically at what a team has accomplished so far in the season, but\ninstead at how strong they are at this point in time. 
The rankings can be generated\nwith the following:\n\n```\ndocker run --rm -it roclark/clarktech-ncaab-predictor power-rankings\n```\n\n### NCAA Field Filler\nThe NCAA Field Filler will populate the 68-team NCAA Tournament field based on\nboth automatic and at-large bids. The automatic bids are identified by\nsimulating every conference tournament and determining the winner. These winners\nwill receive automatic bids to the tournament. The remaining spots will be\nawarded on a priority basis based on the power rankings. The rankings need to be\ngenerated prior to running this algorithm. Each team is listed with its\nexpected seed. After generating power rankings using the command above, run this\nalgorithm with the following:\n\n```\ndocker run --rm -it roclark/clarktech-ncaab-predictor fill-ncaa-field\n```\n\n### NCAA Tournament Simulator\nLastly, the NCAA Tournament Simulator runs a simulation of the NCAA tournament.\nThis requires a CSV file of the expected teams and seeds in the tournament to be\nused as a baseline for the bracket. An example of this CSV file is provided in\nthe repository. To simulate the tournament, run the following:\n\n```\ndocker run --rm -it roclark/clarktech-ncaab-predictor tournament-simulator 2019-ncaa.csv\n```\n\n### Other options\nIn addition to the algorithms listed above, some additional options are\navailable.\n\n#### Num Sims\nGiven the unpredictability of sports, especially in men's college basketball,\nsome randomness is injected into the algorithms. The randomness is generated by\napplying a random variance within the league's standard deviation for every\ncategory for each team tested on a per-simulation basis. For example, in a\nsingle simulation, one team could have a +0.7 * STDEV improvement to their\nshooting percentage, and a -0.3 * STDEV penalty to their rebounds. 
As this is\ndone on a per-simulation basis, it is recommended to increase the number of\nsimulations run to broaden the range of data tested and get a more accurate\nview of the overall trend for each team instead of relying solely on a limited\nnumber of varied results. Please note that while increasing the number of\nsimulations is recommended, every additional pass will increase the time to\ncompletion.\n\n#### Skip Saving to MongoDB\nBy default, all results will be saved to a Mongo database at a specified URL.\nThe results in the database provide additional context and can easily be\narchived for future use as needed. If desired, this can be avoided by requesting\nto skip saving to MongoDB, in which case results will be saved in the local directory as\napplicable.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Froclark%2Fclarktech-ncaab-predictor","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Froclark%2Fclarktech-ncaab-predictor","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Froclark%2Fclarktech-ncaab-predictor/lists"}