https://github.com/hectorzamoranogarcia/sql-gym-split

SQL-based system for strength-training analysis. It includes predictive models for fatigue management, deload (rest-week) optimization, and plateau detection using Window Functions and business logic.

# Gym Performance Intelligence System

**A data engineering and advanced sports-performance analysis project built on business logic, relational logic, and SQL (SQLite).**

![SQLite 3.45+](https://img.shields.io/badge/SQLite-v3.45+-003B57?style=for-the-badge&logo=sqlite&logoColor=white)
![Advanced SQL](https://img.shields.io/badge/SQL-Window_Functions-4479A1?style=for-the-badge&logo=sqlite&logoColor=white)
![Relational Logic](https://img.shields.io/badge/Logic-Relational_Modeling-232F3E?style=for-the-badge)
![Git](https://img.shields.io/badge/Git-F05032?style=for-the-badge&logo=git&logoColor=white)

The routine is based on my personal 7-day hypertrophy-focused gym split. The synthetic data was generated using randomized fatigue logic to validate the algorithms.

## Description

This project aims to digitize and optimize decision-making in strength training through a relational database engine. The main goal was not only to store logs, but to design a data architecture capable of **interpreting** performance in real time.

Algorithms were implemented to detect plateaus, predict intra-session fatigue, and calculate the need for deload weeks based on biological variables (sleep, diet) and load metrics.

The underlying dataset was synthetically generated by replicating my personal hypertrophy routine (frequency 2, i.e. each muscle group trained twice per week; split: *Chest/Back – Arms – Legs – Rest – Upper – Lower – Rest*), with random variability (drops of 1 to 3 reps) added to stress-test the algorithms.
During development, AI tools (mainly GitHub Copilot) provided minor assistance in generating synthetic data (`INSERT` seeds), but all business logic, relational logic, schema design, and analytical decisions were made manually.

## Objectives

* **Plateau Detection:** Automatically identify when a lift fails to surpass the historical record (*Running Max*) within a given period.
* **Auto-Regulated Load Adjustment:** Predict when an athlete should reduce the weight for the next set, based on the performance drop observed in the previous set (a sign of accumulated fatigue).
* **Fatigue Management (Deload):** Calculate the optimal “Deload Week” by combining accumulated weeks, sleep hours, and nutritional phase (Bulking vs Recomposition vs Cutting).
* **Data Auditing:** Ensure data integrity through a dedicated quality-control layer.

## Data

* **Main Source:** Self-generated data based on my real 7-day routine.
* **Volume:** Simulations of 5–10 training-week cycles.
* **Simulated Profiles:**
    * *Juan:* Optimal profile (Bulking, sleep > 7h).
    * *Antonio:* High-risk profile (Cutting, sleep < 6h).

### Main Variables
* **Training:** `reps`, `weight`, `rpe`, `session_type` (Chest/Back, Lower, etc.).
* **Context:** `sleep_hours` (neural recovery), `training_goal` (bulk/cut/recomp).
* **Catalog:** Exercises such as *Machine Flat Press*, *Neutral Pulldown*, *Gironda Row*, *Overhead Triceps Extension*.

> To stress-test the system, a randomness factor was added to Antonio’s data, simulating rep drops of 1–3 across consecutive sets to trigger fatigue alerts.

## Methodology

### 1. Schema Design (OLTP)
Normalized SQLite database with 5 main tables (`Users`, `Exercises`, `Workouts`, `Sets`, `BodyMetrics`) ensuring referential integrity and proper data types.
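
As a rough illustration of what such a schema could look like, the sketch below creates the five tables in an in-memory SQLite database through Python's standard `sqlite3` module. Only the table names and the variables listed under "Main Variables" come from this README; every other column name, key, and `CHECK` constraint is an assumption and may differ from the actual `01_schema.sql`.

```python
import sqlite3

# Hypothetical schema sketch: table names come from the README,
# column names, keys and constraints below are assumptions.
SCHEMA = """
PRAGMA foreign_keys = ON;   -- SQLite does not enforce foreign keys by default

CREATE TABLE Users (
    user_id       INTEGER PRIMARY KEY,
    name          TEXT NOT NULL,
    training_goal TEXT NOT NULL CHECK (training_goal IN ('bulk', 'cut', 'recomp'))
);

CREATE TABLE Exercises (
    exercise_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL UNIQUE
);

CREATE TABLE Workouts (
    workout_id   INTEGER PRIMARY KEY,
    user_id      INTEGER NOT NULL REFERENCES Users(user_id),
    workout_date TEXT NOT NULL,             -- ISO 8601 date string
    session_type TEXT NOT NULL
);

CREATE TABLE Sets (
    set_id      INTEGER PRIMARY KEY,
    workout_id  INTEGER NOT NULL REFERENCES Workouts(workout_id),
    exercise_id INTEGER NOT NULL REFERENCES Exercises(exercise_id),
    set_number  INTEGER NOT NULL CHECK (set_number > 0),
    reps        INTEGER NOT NULL CHECK (reps >= 0),
    weight      REAL    NOT NULL CHECK (weight >= 0),
    rpe         REAL    CHECK (rpe BETWEEN 1 AND 10)
);

CREATE TABLE BodyMetrics (
    metric_id   INTEGER PRIMARY KEY,
    user_id     INTEGER NOT NULL REFERENCES Users(user_id),
    metric_date TEXT NOT NULL,
    sleep_hours REAL NOT NULL CHECK (sleep_hours BETWEEN 0 AND 24)
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# A set that points at a non-existent workout should be rejected.
try:
    conn.execute(
        "INSERT INTO Sets (workout_id, exercise_id, set_number, reps, weight) "
        "VALUES (999, 999, 1, 10, 60.0)"
    )
    fk_enforced = False
except sqlite3.IntegrityError:
    fk_enforced = True

print("foreign keys enforced:", fk_enforced)
```

The `PRAGMA foreign_keys = ON` line matters because SQLite applies it per connection; without it the `REFERENCES` clauses are parsed but not enforced.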

### 2. Data Generation (Scenarios)
Creation of SQL (`seed`) scripts that simulate full training histories:
* **Scenario A (Progression):** Linear load increases.
* **Scenario B (Acute Fatigue):** Sudden intra-session rep drops.
* **Scenario C (Systemic Risk):** Multiple weeks with sleep deficit and restrictive dieting.
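
One way the "Acute Fatigue" scenario could be generated is sketched below: each consecutive set loses a random 1 to 3 reps, and the result is rendered as an `INSERT` statement. The helper names (`fatigued_sets`, `to_insert`), the `Sets` column list, and the fixed random seed are illustrative assumptions, not the contents of the project's `02_seeds.sql`.

```python
import random

def fatigued_sets(start_reps, n_sets, rng):
    """Return reps per set, dropping 1-3 reps on each consecutive set."""
    reps = [start_reps]
    for _ in range(n_sets - 1):
        drop = rng.randint(1, 3)               # inclusive on both ends
        reps.append(max(reps[-1] - drop, 0))   # never go below zero reps
    return reps

def to_insert(workout_id, exercise_id, weight, reps_per_set):
    """Render one multi-row INSERT for a hypothetical Sets table."""
    rows = ",\n".join(
        f"    ({workout_id}, {exercise_id}, {n}, {reps}, {weight})"
        for n, reps in enumerate(reps_per_set, start=1)
    )
    return (
        "INSERT INTO Sets (workout_id, exercise_id, set_number, reps, weight)\n"
        "VALUES\n"
        f"{rows};"
    )

rng = random.Random(42)   # fixed seed so the generated script is reproducible
reps = fatigued_sets(start_reps=12, n_sets=4, rng=rng)
seed_sql = to_insert(workout_id=1, exercise_id=1, weight=60.0, reps_per_set=reps)
print(seed_sql)
```

Fixing the seed keeps the synthetic history stable between runs, which makes it easier to compare the output of the analysis queries after a change.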

### 3. Plateau Detection Algorithm
Implementation of **Window Functions** (`MAX() OVER ... ROWS UNBOUNDED PRECEDING`) to compare current performance against the athlete’s lifetime max, avoiding false positives caused by naïve weekly comparisons.
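
The running-max idea can be sketched as follows, using Python's `sqlite3` module to run the query against a few made-up rows. The `lifts` table, its columns, and the status labels are hypothetical. Note one deliberate variation: the frame ends at `1 PRECEDING` instead of the plain `ROWS UNBOUNDED PRECEDING` mentioned above, so that each session is compared against earlier sessions only and not against itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE lifts (          -- hypothetical flattened view of Sets + Workouts
    user_name    TEXT,
    exercise     TEXT,
    workout_date TEXT,
    weight       REAL
);
INSERT INTO lifts VALUES
    ('Juan', 'Machine Flat Press', '2024-01-01', 60.0),
    ('Juan', 'Machine Flat Press', '2024-01-08', 62.5),
    ('Juan', 'Machine Flat Press', '2024-01-15', 62.5),
    ('Juan', 'Machine Flat Press', '2024-01-22', 60.0),
    ('Juan', 'Machine Flat Press', '2024-01-29', 65.0);
""")

PLATEAU_SQL = """
WITH history AS (
    SELECT
        workout_date,
        weight,
        -- Best weight across all earlier sessions (current row excluded).
        MAX(weight) OVER (
            PARTITION BY user_name, exercise
            ORDER BY workout_date
            ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING
        ) AS previous_best
    FROM lifts
)
SELECT
    workout_date,
    weight,
    previous_best,
    CASE
        WHEN previous_best IS NULL  THEN 'FIRST SESSION'
        WHEN weight > previous_best THEN 'NEW PR'
        ELSE 'NO PROGRESS'
    END AS status
FROM history
ORDER BY workout_date;
"""

rows = conn.execute(PLATEAU_SQL).fetchall()
for row in rows:
    print(row)
```

With this sample data, the sessions on 2024-01-15 and 2024-01-22 are flagged as `NO PROGRESS` because neither beats the earlier best of 62.5, even though a week-over-week comparison would treat them differently.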

### 4. Auto-Regulation Predictor
Uses the `LAG` offset window function to compare the current set (N) with the previous set (N-1).

* **Rule:** If reps fall below a critical threshold or fatigue is evident → `🔻 ALERT: REDUCE WEIGHT`.
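
A minimal sketch of the set-to-set comparison is shown below. The `session_sets` table and its columns are hypothetical, the emoji prefix of the alert label is omitted, and the threshold (a drop of more than 2 reps) is an assumption borrowed from the Conclusions section, since the rule above does not state an exact value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE session_sets (   -- hypothetical: sets of one exercise in one workout
    workout_id INTEGER,
    exercise   TEXT,
    set_number INTEGER,
    reps       INTEGER
);
INSERT INTO session_sets VALUES
    (1, 'Neutral Pulldown', 1, 12),
    (1, 'Neutral Pulldown', 2, 11),
    (1, 'Neutral Pulldown', 3, 8),
    (1, 'Neutral Pulldown', 4, 7);
""")

AUTOREG_SQL = """
WITH compared AS (
    SELECT
        set_number,
        reps,
        -- Reps of the previous set (N-1); NULL for the first set.
        LAG(reps) OVER (
            PARTITION BY workout_id, exercise
            ORDER BY set_number
        ) AS prev_reps
    FROM session_sets
)
SELECT
    set_number,
    reps,
    prev_reps - reps AS rep_drop,
    CASE
        WHEN prev_reps - reps > 2 THEN 'ALERT: REDUCE WEIGHT'  -- assumed threshold
        ELSE 'OK'
    END AS recommendation
FROM compared
ORDER BY set_number;
"""

rows = conn.execute(AUTOREG_SQL).fetchall()
for row in rows:
    print(row)
```

The first set has no predecessor, so `prev_reps` is `NULL`, the comparison evaluates to `NULL`, and the `CASE` falls through to `'OK'`. In this sample only set 3 (12, 11, 8: a drop of 3) triggers the alert.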

### 5. Deload Week Predictor

A cumulative scoring system (*Fatigue Score*) was built using conditional SQL logic.

#### A. Scoring System (Fatigue Score)
Implemented using `CASE WHEN`, where 0 = ideal state and 10 = maximum risk.

* **Sleep Factor (Neural Recovery):**
    * `< 6 hours`: **+6 Points** (Critical Risk / CNS compromised)
    * `6–7 hours`: **+3 Points** (Incomplete recovery)
    * `7–8 hours`: **+1 Point**
    * `> 8 hours`: **0 Points**

* **Nutrition Factor (Energetic Recovery):**
    * `Cutting / Deficit`: **+4 Points**
    * `Recomposition`: **+2 Points**
    * `Bulking / Surplus`: **0 Points**
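
The two factors above could be combined in a single query along these lines. The `athlete_context` table, its columns, and the sample sleep values are hypothetical. The handling of exact boundaries (exactly 6, 7, or 8 hours) is also an assumption, because the ranges above do not say which bucket they belong to.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE athlete_context (   -- hypothetical table and column names
    name          TEXT,
    sleep_hours   REAL,
    training_goal TEXT            -- 'bulk', 'recomp' or 'cut'
);
INSERT INTO athlete_context VALUES
    ('Juan',    7.5, 'bulk'),
    ('Antonio', 5.5, 'cut');
""")

FATIGUE_SQL = """
SELECT
    name,
    -- Sleep factor. Boundary handling is an assumption:
    -- [6, 7) scores 3, [7, 8] scores 1, anything above 8 scores 0.
    CASE
        WHEN sleep_hours < 6  THEN 6
        WHEN sleep_hours < 7  THEN 3
        WHEN sleep_hours <= 8 THEN 1
        ELSE 0
    END
    +
    -- Nutrition factor. An unknown goal yields NULL so bad data stays visible.
    CASE training_goal
        WHEN 'cut'    THEN 4
        WHEN 'recomp' THEN 2
        WHEN 'bulk'   THEN 0
    END AS fatigue_score
FROM athlete_context
ORDER BY name;
"""

scores = dict(conn.execute(FATIGUE_SQL).fetchall())
print(scores)
```

With these sample values Antonio scores 6 + 4 = 10, the maximum of the scale, and Juan scores 1 + 0 = 1.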

#### B. Temporal Decision Matrix
The system evaluates the total *Fatigue Score* against the current week of the mesocycle:

* **Range 1 (Weeks 0–5): Accumulation Phase.**
    * *Action:* `🟢 KEEP TRAINING`
* **Range 2 (Week 6): Critical Safety Filter.**
    * *Logic:* If Score ≥ 6
    * *Action:* `💀 IMMEDIATE DELOAD`
* **Range 3 (Weeks 7–8): Fatigue-Management Zone.**
    * *Logic:* If Score ≥ 3
    * *Action:* `⚠️ PLAN DELOAD`
* **Range 4 (Week 9+): Physiological Limit.**
    * *Logic:* Regardless of score
    * *Action:* `🔴 MANDATORY DELOAD`

**Result:** A four-tier temporal classification system that forces early deloads for high-risk athletes while letting low-risk profiles keep training until the mandatory deload at week 9.
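
The decision matrix can be expressed as one `CASE` expression, sketched below against a hypothetical `mesocycle_state` table. Two points are assumptions: the matrix does not say what happens in weeks 6 to 8 when the score is below the stated threshold, so this sketch falls back to `KEEP TRAINING`; and the emoji prefixes of the action labels are omitted.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE mesocycle_state (   -- hypothetical table and column names
    label         TEXT,
    week_number   INTEGER,
    fatigue_score INTEGER
);
INSERT INTO mesocycle_state VALUES
    ('low risk, week 5',  5, 1),
    ('low risk, week 6',  6, 1),
    ('high risk, week 6', 6, 10),
    ('mid risk, week 7',  7, 3),
    ('low risk, week 8',  8, 1),
    ('low risk, week 9',  9, 0);
""")

DECISION_SQL = """
SELECT
    label,
    CASE
        WHEN week_number >= 9                              THEN 'MANDATORY DELOAD'
        WHEN week_number IN (7, 8) AND fatigue_score >= 3  THEN 'PLAN DELOAD'
        WHEN week_number = 6 AND fatigue_score >= 6        THEN 'IMMEDIATE DELOAD'
        ELSE 'KEEP TRAINING'   -- assumed fallback when no rule fires
    END AS action
FROM mesocycle_state
ORDER BY week_number, fatigue_score;
"""

actions = dict(conn.execute(DECISION_SQL).fetchall())
for label, action in actions.items():
    print(f"{label}: {action}")
```

Putting the week 9+ branch first mirrors the "regardless of score" rule: once that week is reached, the score is never consulted.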

## Conclusions

* The engine successfully differentiated the profiles: it recommended continued training for **Juan** (Week 5) and forced a deload for **Antonio** (Week 9, high risk).
* **Window Functions** proved superior to subqueries for PR calculation.
* The intra-session load-adjustment predictor correctly reacted to synthetic fatigue, suggesting weight reductions on sets where reps dropped by more than 2.
* SQL can directly implement “Business Logic” within the database, reducing external processing.

## Repository Structure

```
sql-adaptive-strength-engine/
├── sql/
│   ├── 01_schema.sql      # Table structure (CREATE)
│   ├── 02_seeds.sql       # Synthetic data (Juan & Antonio)
│   └── 03_analysis.sql    # KPIs and decision algorithms
├── tests/
│   └── data_quality.sql   # Data integrity tests
└── README.md              # Project documentation
```

## Author

Héctor Zamorano García

## Credits

Project developed entirely by the author, with occasional support from AI-assistance tools (mainly GitHub Copilot) for minor tasks such as synthetic data generation or SQL statement autocompletion.
All business logic, schema design, analytical decisions, and validation were performed manually.

## Technical Requirements

* SQLite 3.45+
* Any SQL client (DBeaver, SQLiteStudio, VS Code with a SQL extension)
* Ability to execute `.sql` scripts for creation, seed, and analysis
* Operating system: Windows / Linux / macOS
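
For readers without a graphical SQL client, one possible way to run the scripts in order is Python's standard `sqlite3` module, as sketched below. The script paths follow the repository structure shown earlier; the `run_scripts` helper and the `gym.db` file name are hypothetical.

```python
import sqlite3
from pathlib import Path

# Script order taken from the repository structure shown in this README.
SCRIPTS = ["sql/01_schema.sql", "sql/02_seeds.sql", "sql/03_analysis.sql"]

def run_scripts(db_path, script_paths):
    """Execute each .sql file in order against one SQLite database."""
    conn = sqlite3.connect(db_path)
    for path in script_paths:
        conn.executescript(Path(path).read_text(encoding="utf-8"))
    conn.commit()
    return conn

if __name__ == "__main__":
    # Version of the SQLite library bundled with this Python build.
    print("SQLite library version:", sqlite3.sqlite_version)
    if all(Path(p).exists() for p in SCRIPTS):
        run_scripts("gym.db", SCRIPTS)   # "gym.db" is a hypothetical file name
```

`executescript` runs every statement but does not return result rows, so the output of `SELECT` queries in the analysis script would still need to be viewed in a SQL client or fetched with `conn.execute(...)`.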

## Notes

This project was conceived for educational purposes and experimentation applied to strength training.
Although it is based on a real routine and principles grounded in physiology, its purpose is technical analysis and it does not replace professional guidance.

## License

This project is distributed solely for academic, analytical, and research purposes.
Reading and personal use are permitted.
Commercial use or redistribution is prohibited without the author's explicit authorization.