https://github.com/dkealvaro/ufc-scraper-ml
Scraping UFC data and building predictive models
- Host: GitHub
- URL: https://github.com/dkealvaro/ufc-scraper-ml
- Owner: DKeAlvaro
- License: other
- Created: 2025-07-02T23:15:38.000Z (3 months ago)
- Default Branch: main
- Last Pushed: 2025-08-02T17:21:43.000Z (2 months ago)
- Last Synced: 2025-08-02T19:35:58.133Z (2 months ago)
- Topics: machine-learning, scraping, sports-betting, ufc
- Language: Python
- Homepage:
- Size: 1.03 MB
- Stars: 2
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
---
title: UFC Fight Predictor
emoji: 🥊
colorFrom: red
colorTo: blue
sdk: gradio
sdk_version: "4.28.3"
app_file: app.py
pinned: false
---
# UFC Scraper & ML

[Hugging Face Space](https://huggingface.co/spaces/AlvaroMros/ufc-predictor)

## Setup
1. Clone the repo (or download it), then open a terminal in the root folder
2. Install the required Python packages using pip:
```bash
pip install -r requirements.txt
```

## Usage
### 1. Data Scraping
**Initial Setup (First Time):**
```bash
python -m src.main --pipeline scrape --scrape-mode full
```
Scrapes all historical fight data from ufcstats.com.

**Update Data (Regular Use):**
```bash
python -m src.main --pipeline scrape --scrape-mode update
```
Adds only the latest events to existing data.

### 2. Fight Prediction
**Use Existing Models (Fast):**
```bash
python -m src.main --pipeline predict
```
Loads saved models if available and retrains only if new data is available.

**Force Retrain Models:**
```bash
python -m src.main --pipeline predict --force-retrain
```
Always retrains all models from scratch with the latest data.

### 3. Complete Pipeline
```bash
python -m src.main --pipeline all --scrape-mode update
```
Runs scraping (update mode), analysis, and prediction in sequence.

### 4. Model Updates Only
```bash
python -m src.main --pipeline update
```
Checks for new data and retrains models only if needed (perfect for automation).

## Model Performance
The system evaluates models on the latest UFC event to produce realistic accuracy scores, which typically fall between 50% and 70% for fight prediction.
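
To illustrate the idea of testing on the latest event, here is a minimal sketch of a chronological holdout split. It is not the repository's actual code: the function names, the record fields (`event_date`, `fighter_1`, `fighter_2`, `winner`), and the toy data are all assumptions made for the example, and the "model" is a placeholder that always picks the first listed fighter.

```python
from datetime import date


def split_latest_event(fights):
    """Split fights chronologically: the most recent event becomes the test set."""
    if not fights:
        raise ValueError("no fights to split")
    latest = max(fight["event_date"] for fight in fights)
    train = [f for f in fights if f["event_date"] < latest]
    test = [f for f in fights if f["event_date"] == latest]
    return train, test


def accuracy(predicted, actual):
    """Fraction of predictions that match the actual winners."""
    if not actual or len(predicted) != len(actual):
        raise ValueError("predicted and actual must be non-empty and equal in length")
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Invented toy records; the field names are assumptions, not the repo's CSV schema.
fights = [
    {"event_date": date(2025, 5, 3), "fighter_1": "A", "fighter_2": "B", "winner": "A"},
    {"event_date": date(2025, 6, 7), "fighter_1": "C", "fighter_2": "D", "winner": "D"},
    {"event_date": date(2025, 7, 12), "fighter_1": "E", "fighter_2": "F", "winner": "E"},
    {"event_date": date(2025, 7, 12), "fighter_1": "G", "fighter_2": "H", "winner": "H"},
]

train, test = split_latest_event(fights)

# Placeholder "model": always pick fighter_1. A real model would be fit on `train` only.
predictions = [fight["fighter_1"] for fight in test]
holdout_accuracy = accuracy(predictions, [fight["winner"] for fight in test])
print(f"train={len(train)} test={len(test)} accuracy={holdout_accuracy:.2f}")
```

Splitting by date rather than at random keeps future fights out of the training data, which is presumably why scores measured this way are described as realistic.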
## Output
- **Data:** `output/ufc_fights.csv`, `output/ufc_fighters.csv`
- **Models:** `output/models/*.joblib`
- **Results:** `output/model_results.json`

## License
This project is licensed under the GNU Affero General Public License v3 (AGPL-3.0) - see the [LICENSE](LICENSE) file for details.
**What this means:**
- Free for personal, research, and educational use
- Can be modified and redistributed (with source code)
- **Network Copyleft**: If you run a modified version as a web service or API, you must make the corresponding source code available to its users
- **Strong Copyleft**: Any modifications or derivative works must also be AGPL-3.0 licensed
- Commercial use is allowed but requires compliance with copyleft terms

This license specifically prevents companies from using this code in proprietary betting platforms or closed-source prediction services without contributing back to the open source community.