Rugby Data Analysis and Sports Betting
https://github.com/gmalbert/rugby

data-analysis rugby sports-betting

# ScrumBet 🏉

European rugby analytics and DraftKings betting intelligence, built with Streamlit.

## Overview

ScrumBet is a multi-page Streamlit app that combines live scraped data from five sources with
three predictive models (Elo, Dixon-Coles, Try Scorer) to surface match previews, player
stats, and value-bet opportunities across six major European and international rugby
competitions.

## Pages

### Home (`predictions.py`)
- **This Week's Fixtures**: upcoming matches grouped by day with kickoff time, venue, and DK moneylines
- **Live Ticker**: real-time scores via SofaScore; falls back to pipeline data when unavailable
- **DK Odds Snapshot**: compact moneyline + total table for next 7 days
- **Top Try Scorers**: league-filtered season leaderboard across all competitions
- **Team Form Strip**: last-5 result badges (W/D/L) for every team in the current fixture list
- Global sidebar: logo, theme selector, cache-refresh button, data-source credits

### 1 – League Overview (`pages/1_League_Overview.py`)
- **Standings table**: P/W/L/D, points for/against, try diff, bonus points, league points, form arrow
- **Attack Efficiency scatter**: Points per Game vs Tries per Game with quadrant lines at the mean
- **Home vs Away bar charts**: average points scored at home and away per team
- **Round-by-Round Results grid**: pivot table of scores by round
- **Upcoming Fixtures**: date, teams, venue, round, and DK odds (ML + O/U) when available
- **Season Snapshot**: auto-generated narrative (leader, gap at top, top attacker, best defence)

### 2 – Team Deep Dive (`pages/2_Team_Deep_Dive.py`)
- **Form Strip**: last 10 results with colour-coded badges and opponent/score labels
- **Attack vs Defence Radar**: five-axis radar: tries scored, tries conceded, metres, linebreaks, tackles
- **Elo Rating History**: line chart of the team's Elo trajectory across all recorded matches
- **Scoring Breakdown**: bar chart splitting season points into try-derived vs penalties/drops
- **Key Players**: top 5 try scorers, metres carriers, and tacklers in three side-by-side tables
- **Head-to-Head**: last 10 meetings vs any selected opponent with W/D/L result column
- **Venue Stats**: home/away split: games, wins, average points for/against

### 3 – Player Stats (`pages/3_Player_Stats.py`)
- **Try Scorer Rankings**: sortable table: tries, games, tries-per-game, last-3 tries; filterable by league and position
- **Player Profile**: metrics card (team, position, season tries, T/game, consistency %, avg minutes); tries-by-round bar chart; minutes-played trend
- **Prop Bet Analyzer**: enter a DraftKings American-odds line; model computes historical scoring rate, implied probability, expected value, and Kelly Criterion stake recommendation
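
The Prop Bet Analyzer arithmetic can be sketched as follows. This is a minimal illustration and not the app's actual code: the function names are hypothetical, and the +250 line and 0.35 model probability are made-up inputs.

```python
def american_to_decimal(odds: int) -> float:
    """Convert American odds (e.g. +250 or -150) to decimal odds."""
    if odds > 0:
        return 1 + odds / 100
    return 1 + 100 / abs(odds)


def implied_probability(odds: int) -> float:
    """Bookmaker-implied probability (the vig is still included)."""
    return 1 / american_to_decimal(odds)


def expected_value(p_model: float, odds: int) -> float:
    """Expected profit per 1 unit staked, given the model's probability."""
    profit_if_win = american_to_decimal(odds) - 1
    return p_model * profit_if_win - (1 - p_model)


def kelly_fraction(p_model: float, odds: int) -> float:
    """Full-Kelly share of bankroll; 0 when the bet has no positive edge."""
    b = american_to_decimal(odds) - 1  # net profit per unit staked
    return max(0.0, (b * p_model - (1 - p_model)) / b)


line, p_model = 250, 0.35  # illustrative inputs only
print(f"implied probability: {implied_probability(line):.3f}")
print(f"EV per unit staked:  {expected_value(p_model, line):+.3f}")
print(f"Kelly fraction:      {kelly_fraction(p_model, line):.3f}")
```

Full Kelly is aggressive when the probability estimate is noisy, so a fractional stake (for example half of the value returned) is a common, more conservative choice.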

### 4 – Match Analysis (`pages/4_Match_Analysis.py`)
- **Win Probabilities**: horizontal probability bar chart from Dixon-Coles (primary) or Elo (fallback)
- **Predicted Scoreline**: expected points for each team and top-5 most likely scorelines with probabilities
- **Scoreline Distribution Heatmap**: Dixon-Coles joint Poisson matrix up to 40 pts per side
- **Head-to-Head**: last 5 meetings between the two selected teams
- **Key Stats Comparison**: 8-metric side-by-side table: win%, pts/game, tries/game, tries conceded/game, metres, linebreaks, tackles, missed tackles
- **Predicted Try Scorers**: top-5 anytime try scorer probabilities from the Try Scorer model
- **DK Odds vs Model**: edge table for home ML, away ML, and total line with signal (✅ Back / 🔴 Under / 🔴 Fade)
- **Venue Weather**: current conditions at kickoff venue via OpenWeatherMap (optional, requires `OPENWEATHER_API_KEY`)
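
The scoreline heatmap and win probabilities above can be derived from one joint matrix. The sketch below assumes the standard Dixon-Coles low-score adjustment applied to two Poisson distributions; the function names and the example expected points (24 and 19) and ρ value are hypothetical, and the fitted model in the app may be parameterised differently.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Poisson probability of exactly k, computed in log space for stability."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))


def dc_tau(x: int, y: int, lam: float, mu: float, rho: float) -> float:
    """Dixon-Coles adjustment; only the four lowest scorelines are changed."""
    if x == 0 and y == 0:
        return 1 - lam * mu * rho
    if x == 0 and y == 1:
        return 1 + lam * rho
    if x == 1 and y == 0:
        return 1 + mu * rho
    if x == 1 and y == 1:
        return 1 - rho
    return 1.0


def scoreline_matrix(lam: float, mu: float, rho: float = 0.0, max_pts: int = 40):
    """matrix[x][y] = P(home scores x, away scores y), renormalised after truncation."""
    raw = [
        [dc_tau(x, y, lam, mu, rho) * poisson_pmf(x, lam) * poisson_pmf(y, mu)
         for y in range(max_pts + 1)]
        for x in range(max_pts + 1)
    ]
    total = sum(map(sum, raw))
    return [[p / total for p in row] for row in raw]


def outcome_probs(matrix):
    """Sum the cells below, on, and above the diagonal: home win, draw, away win."""
    n = len(matrix)
    home = sum(matrix[x][y] for x in range(n) for y in range(x))
    draw = sum(matrix[x][x] for x in range(n))
    away = sum(matrix[x][y] for x in range(n) for y in range(x + 1, n))
    return home, draw, away


def top_scorelines(matrix, n: int = 5):
    """Return the n most likely ((home, away), probability) pairs."""
    cells = [((x, y), p) for x, row in enumerate(matrix) for y, p in enumerate(row)]
    return sorted(cells, key=lambda cell: cell[1], reverse=True)[:n]


matrix = scoreline_matrix(lam=24.0, mu=19.0, rho=-0.001)  # illustrative values
print(outcome_probs(matrix))
print(top_scorelines(matrix))
```

Truncating at 40 points per side drops a small amount of tail probability, which is why the matrix is renormalised before the outcome probabilities are summed.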

### 5 – Betting Edge (`pages/5_Betting_Edge.py`)
- **Value Bets**: Elo-implied vs DK-implied probability for every upcoming fixture; colour-coded rows (green = back, red = fade); configurable minimum edge threshold
- **Try Scorer Value**: model probability and fair-value American odds for anytime try scorer across all upcoming games
- **Totals Analysis**: Dixon-Coles expected total vs DK O/U line with Over/Under signal
- **Parlay Builder**: select any combination of Elo-ranked match-winner legs; displays combined model probability and fair-value American parlay odds
- **Historical Edge Tracking**: placeholder section; will show ROI, strike rate, and calibration by market type as results accumulate
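
The parlay arithmetic can be sketched as below, assuming the legs are independent so that the combined probability is the product of the leg probabilities. The function names and leg probabilities are hypothetical and not taken from the app.

```python
import math


def prob_to_american(p: float) -> int:
    """Fair (no-vig) American odds for a probability p, where 0 < p < 1."""
    if p >= 0.5:
        return -round(100 * p / (1 - p))
    return round(100 * (1 - p) / p)


def parlay(leg_probs):
    """Combined probability and fair American odds, assuming independent legs."""
    combined = math.prod(leg_probs)
    return combined, prob_to_american(combined)


combined, fair_odds = parlay([0.6, 0.5])  # two illustrative match-winner legs
print(f"combined probability: {combined:.3f}, fair odds: {fair_odds:+d}")
```

Legs from the same match or the same team are usually correlated, in which case the simple product misstates the true parlay probability.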

### 6 – Model Lab (`pages/6_Model_Lab.py`)
- **Elo Leaderboard**: current ratings for all teams with league, 3-match trend arrow, and sortable table; per-team Elo history line chart
- **Dixon-Coles Parameters**: home advantage and ρ (low-scoring correction) values; attack/defence rating table; attack-vs-defence scatter plot
- **Backtesting**: walk-forward evaluation: % correct winner, Brier score, number of test matches, calibration scatter (predicted vs actual home win rate)
- **Manual Override**: adjust any two teams' Elo ratings by ±200 points to simulate injury news; re-runs win probability instantly
- **Monte Carlo Simulation**: Poisson draws (1 000–10 000); outputs win/draw/loss %, total-points histogram, and winning-margin histogram
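
Two of the Model Lab calculations, the Monte Carlo simulation and the Brier score, can be sketched with the standard library alone. The function names, the expected points (27 and 20), and the use of Knuth's sampling method are assumptions made for illustration; the app may draw its samples differently.

```python
import math
import random


def poisson_draw(rng: random.Random, lam: float) -> int:
    """Draw one Poisson(lam) variate using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, product = 0, rng.random()
    while product > threshold:
        k += 1
        product *= rng.random()
    return k


def simulate_match(exp_home: float, exp_away: float, n_sims: int = 5000, seed: int = 0):
    """Simulate n_sims matches from independent Poisson scores for each side."""
    rng = random.Random(seed)
    home_wins = draws = 0
    totals, margins = [], []
    for _ in range(n_sims):
        h = poisson_draw(rng, exp_home)
        a = poisson_draw(rng, exp_away)
        home_wins += h > a
        draws += h == a
        totals.append(h + a)   # input for a total-points histogram
        margins.append(h - a)  # input for a winning-margin histogram
    return {
        "home_win": home_wins / n_sims,
        "draw": draws / n_sims,
        "away_win": (n_sims - home_wins - draws) / n_sims,
        "mean_total": sum(totals) / n_sims,
        "totals": totals,
        "margins": margins,
    }


def brier_score(probs, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


result = simulate_match(27.0, 20.0, n_sims=2000, seed=1)
print({k: round(v, 3) for k, v in result.items() if isinstance(v, float)})
print(brier_score([0.7, 0.4], [1, 0]))
```

A lower Brier score is better: 0 means every prediction was certain and correct, while always predicting 0.5 scores 0.25.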

## Data Sources

| Source | Used For |
|---|---|
| ESPN API | Fixtures, results, scores, league standings |
| RugbyPass | Supplementary match metadata |
| SofaScore | Live scores, player statistics (tries, metres, tackles, linebreaks, minutes) |
| World Rugby | Official rankings |
| The Odds API | DraftKings moneylines, totals, and try scorer props |

## Models

| Model | Details |
|---|---|
| **Elo** | K = 32, home advantage = +50, margin-weighted updates. Ratings stored per match in `data_files/`. Used for win probability, backtesting, and parlay builder. |
| **Dixon-Coles** | Bivariate Poisson with low-scoring correction (ρ). Fitted via `scipy.optimize.minimize`. Requires ≥ 15 completed matches. Outputs attack/defence ratings, win/draw/loss probabilities, and full scoreline matrix. |
| **Try Scorer** | Random Forest on player features: tries-per-game, position, home/away, minutes. Trained on all completed match data; requires ≥ 20 matches. |
| **Value Finder** | Compares Elo/DC model probabilities to DK implied probabilities; identifies edges above a configurable threshold. |
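
A margin-weighted Elo update with the constants from the table (K = 32, home advantage = +50) might look like the sketch below. The README does not state the margin formula, so `margin_multiplier` is a placeholder assumption, and all function names are hypothetical.

```python
import math

K = 32          # update factor, from the Models table
HOME_ADV = 50   # rating points added to the home side, from the Models table


def expected_home(r_home: float, r_away: float) -> float:
    """Standard Elo expectation for the home team, including home advantage."""
    return 1 / (1 + 10 ** ((r_away - (r_home + HOME_ADV)) / 400))


def margin_multiplier(margin: int) -> float:
    """Placeholder weighting (an assumption): grows with the log of the margin."""
    return max(1.0, math.log1p(abs(margin)))


def update_elo(r_home: float, r_away: float, home_pts: int, away_pts: int):
    """Return the new (home, away) ratings after one match; the update is zero-sum."""
    if home_pts > away_pts:
        actual = 1.0
    elif home_pts == away_pts:
        actual = 0.5
    else:
        actual = 0.0
    delta = K * margin_multiplier(home_pts - away_pts) * (actual - expected_home(r_home, r_away))
    return r_home + delta, r_away - delta


print(expected_home(1500, 1500))
print(update_elo(1500, 1500, 30, 10))
```

Because whatever one side gains the other loses, the average rating across all teams stays constant over time.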

## Leagues Covered

Six Nations · Premiership Rugby · Top 14 · Super Rugby Pacific · United Rugby Championship · European Champions Cup

## Tech Stack

| Layer | Library |
|---|---|
| Frontend | Streamlit (multi-page, wide layout, dark/light themes) |
| Data | pandas, pyarrow (Parquet + CSV dual-write) |
| Models | scipy, scikit-learn, numpy |
| Charts | Plotly Express |
| Scrapers | requests, ESPN API, SofaScore API, The Odds API |
| Pipeline | Python script + GitHub Actions cron (daily 03:00 UTC) |
| Config | python-dotenv (`.env` for API keys) |

## Disclaimer

ScrumBet is an informational analytics tool only. Nothing here constitutes financial or gambling advice.