Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/topfunky/r-tennis-win-probability
An experiment in building an in-game win probability predictor for tennis matches
https://github.com/topfunky/r-tennis-win-probability
Last synced: 11 days ago
JSON representation
An experiment in building an in-game win probability predictor for tennis matches
- Host: GitHub
- URL: https://github.com/topfunky/r-tennis-win-probability
- Owner: topfunky
- License: mit
- Created: 2021-02-15T02:03:22.000Z (almost 4 years ago)
- Default Branch: main
- Last Pushed: 2021-06-14T04:43:23.000Z (over 3 years ago)
- Last Synced: 2024-04-15T14:13:25.785Z (7 months ago)
- Language: R
- Size: 7.11 MB
- Stars: 2
- Watchers: 3
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# Tennis In-Game Win Probability with R
An experiment in building an in-game win probability model for tennis matches. Uses [XGBoost](https://xgboost.ai).
## Sample match
- [Point by point](http://www.tennisabstract.com/charting/20080705-W-Wimbledon-F-Venus_Williams-Serena_Williams.html)
![Venus v Sabrina](out/w/20080705-W-Wimbledon-F-Venus_Williams-Serena_Williams.png)
- [Point by point](http://www.tennisabstract.com/charting/20050403-M-Miami_Masters-F-Roger_Federer-Rafael_Nadal.html)
- [Video highlights](https://www.youtube.com/watch?v=QKlXGgbwwJI)![Federer v Nadal](out/m/20050403-M-Miami_Masters-F-Roger_Federer-Rafael_Nadal.png)
## Accuracy
Women's model
![Accuracy Women](out/w/accuracy.png)
Men's model
![Accuracy Men](out/m/accuracy.png)
## Feature Importance
Women's model
![Feature Importance Women](out/w/importance.png)
Men's model
![Feature Importance Men](out/m/importance.png)
## Reference
Data is from the [Match Charting Project](https://github.com/JeffSackmann/tennis_MatchChartingProject).
## Development notes
Run on the command line:
```shell
$ R --no-save < tennis-win-probability.R
```Data cleanup tasks to do:
- Records are numbered by point `Pts` (approximately 100 per match)
- `Set1` and `Set2` are sets won by player 1 or two
- Same for `Gm1` and `Gm2`
- The model uses points, games, and sets
- The identity of the player serving the ball is not currently included in the model
- Add estimated points (EPA) for potentially even greater accuracy