https://github.com/syswe/llm-phpa

🤖 LLM-powered Predictive Horizontal Pod Autoscaling for Kubernetes with intelligent pattern recognition

# Predictive Horizontal Pod Autoscaling: A Pattern-Aware Framework with Large Language Model Integration

## 🎯 Project Overview

This repository contains the complete implementation and research framework for **"LLM Pattern Recognition for Predictive Horizontal Pod Autoscaling"**, a master's thesis research project that lays the groundwork for intelligent Kubernetes autoscaling by combining machine learning forecasting models, hyperparameter optimization, and automated pattern recognition.

### 🔬 Research Context

Traditional Kubernetes autoscaling relies on reactive heuristics that fail to capture complex temporal dependencies in modern microservices workloads. This research addresses these limitations by developing Predictive Horizontal Pod Autoscaling (PHPA), which combines forecasting models with hyperparameter optimization to anticipate future resource requirements before they manifest as performance bottlenecks.

## 🏗️ System Architecture

The research framework consists of three interconnected modules that collectively establish a comprehensive approach to intelligent autoscaling:

![phpa-diagram](phpa-diagram.png)

## 📊 Key Research Contributions

### 1. **Comprehensive Pattern Taxonomy** (Module 1)
- **Six mathematically-formulated pattern types** covering the full spectrum of production workload behaviors
- **Over 2 million data points** across 600 distinct scenarios for robust algorithm evaluation
- **Real-world calibration** validated against NASA web servers, FIFA World Cup datasets, and cloud application logs
- **Statistical rigor** with 15-minute granularity over 35-day periods with realistic Kubernetes constraints
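
The granularity and horizon figures above fix the length of each series: 15-minute sampling gives 96 points per day, so 35 days gives 3,360 points, and 600 scenarios of that length would total 2,016,000 points, consistent with the "over 2 million" figure if every scenario spans the full 35 days. The following is a minimal sketch of a generator for the seasonal formulation listed under Scientific Methodology (`P_t = B + ∑A_k sin(2πt/T_k + φ_k) + N_t`). The function name, the parameter values, and the clamp to at least one pod are illustrative assumptions, not the repository's implementation.

```python
import math
import random

POINTS_PER_DAY = 24 * 60 // 15    # 96 samples per day at 15-minute granularity
DAYS = 35
N_POINTS = POINTS_PER_DAY * DAYS  # 3360 samples per scenario


def seasonal_series(base, components, noise_std, seed=0):
    """Sample P_t = B + sum_k A_k * sin(2*pi*t / T_k + phi_k) + N_t.

    components: list of (amplitude A_k, period T_k in samples, phase phi_k).
    """
    rng = random.Random(seed)
    series = []
    for t in range(N_POINTS):
        value = base + sum(
            amplitude * math.sin(2 * math.pi * t / period + phase)
            for amplitude, period, phase in components
        )
        value += rng.gauss(0.0, noise_std)   # N_t: Gaussian noise (assumed)
        series.append(max(1, round(value)))  # assumed constraint: at least 1 pod
    return series


# Hypothetical parameters: a daily and a weekly cycle around 10 pods.
pods = seasonal_series(
    base=10,
    components=[(4, POINTS_PER_DAY, 0.0), (2, 7 * POINTS_PER_DAY, 0.0)],
    noise_std=0.5,
)
print(len(pods), 600 * len(pods))  # 3360 2016000
```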

### 2. **Advanced Model Selection Framework** (Module 2)
- **37.4% average improvement** in forecasting accuracy through pattern-specific model selection
- **Seven CPU-optimized models** with comprehensive hyperparameter optimization
- **Production-ready performance** with training times of 0.02-0.61 s and memory usage of 42-210 MB
- **Advanced optimization strategies** including temporal cross-validation and early stopping
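
Temporal cross-validation differs from ordinary k-fold in that every validation window lies strictly after its training data, so the model is never scored on points that precede what it was trained on. The sketch below shows one common variant, an expanding training window with fixed-size validation windows. The function name and fold sizes are assumptions for illustration; the scheme used in Module 2 may differ.

```python
def expanding_window_splits(n_samples, n_folds, horizon):
    """Yield (train_range, validation_range) pairs in time order.

    Each validation window has `horizon` samples and starts exactly where
    the training window ends, so no future data leaks into training.
    """
    first_train_end = n_samples - n_folds * horizon
    if first_train_end <= 0:
        raise ValueError("not enough samples for the requested folds")
    for fold in range(n_folds):
        train_end = first_train_end + fold * horizon
        yield range(0, train_end), range(train_end, train_end + horizon)


# One 35-day series at 15-minute granularity, validating on one day per fold.
for train, val in expanding_window_splits(n_samples=3360, n_folds=3, horizon=96):
    print(len(train), val.start, val.stop)
```

Hyperparameter candidates can then be compared by their average error across the folds rather than on a single hold-out window.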

### 3. **LLM-Powered Pattern Recognition** (Module 3)
- **96.7% overall accuracy** in pattern classification with Gemini 2.5 Pro
- **Multimodal analysis** supporting both text-based CSV and visual chart analysis
- **Automated model recommendation** based on detected workload patterns
- **Democratized access** to sophisticated temporal analysis capabilities

## 🔗 Module Integration and Relationships

### Data Flow Architecture
1. **Pattern Generation** → Creates comprehensive synthetic datasets representing six fundamental workload types
2. **Model Training** → Evaluates forecasting models across generated patterns with optimization
3. **LLM Recognition** → Automatically identifies patterns and recommends optimal models
4. **PHPA Framework** → Integrates all components for intelligent autoscaling decisions

### Mathematical Foundation
The research establishes pattern-driven optimization as:

```
min θᵢ E[L(yₜ, fᵢ(xₜ; θᵢ))] subject to pₜ ∈ Pᵢ
```

Where `fᵢ` represents the optimal model for pattern type `i`, `θᵢ` denotes pattern-specific optimized hyperparameters, and `Pᵢ` defines the pattern classification space.
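
In practice the expectation in this objective is approximated on held-out data: for series belonging to one pattern type, each candidate model is scored on a validation set and the one with the lowest loss is kept. The sketch below uses mean absolute error as the loss, since MAE is the metric reported in the results tables. The helper names and the toy forecasters are hypothetical stand-ins, not the repository's models.

```python
def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def select_model(candidates, x_val, y_val):
    """Return (name, mae) of the candidate with the lowest validation MAE.

    candidates: dict mapping a model name to a fitted predict function that
    takes one lag window and returns the forecast for the next step.
    """
    scores = {
        name: mean_absolute_error(y_val, [predict(x) for x in x_val])
        for name, predict in candidates.items()
    }
    best = min(scores, key=scores.get)
    return best, scores[best]


# Toy stand-ins for fitted forecasters (hypothetical).
candidates = {
    "last_value": lambda window: window[-1],
    "window_mean": lambda window: sum(window) / len(window),
}
x_val = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]  # lag windows from a growing series
y_val = [4, 5, 6]                          # the values that followed
print(select_model(candidates, x_val, y_val))  # ('last_value', 1.0)
```

Repeating this selection separately for each pattern type yields one preferred model per pattern instead of a single universal model.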

## 📁 Repository Structure

```
phpa/
├── 1-dataset-generation/          # Module 1: Pattern Generation System
│   ├── scripts/                   # Pattern generation and validation
│   │   ├── patterns/              # Six pattern implementations
│   │   ├── config/                # Configuration management
│   │   └── utils/                 # Utilities and plotting
│   └── README.md                  # Detailed module documentation
│
├── 2-ml-training/                 # Module 2: ML Model Training Framework
│   ├── scripts/                   # Training and evaluation scripts
│   │   ├── cpu-models/            # Production-ready CPU models
│   │   └── gpu-models/            # Advanced GPU-accelerated models
│   └── README.md                  # Detailed module documentation
│
├── 3-llm-pattern-recognition/     # Module 3: LLM Integration System
│   ├── scripts/                   # LLM evaluation and benchmarking
│   ├── config.yaml.example        # Configuration template
│   └── README.md                  # Detailed module documentation
│
├── sections-en/                   # Academic Paper Sections
│   ├── 1-introduction.tex         # Research introduction and context
│   ├── 7-architecture.tex         # Proposed PHPA architecture
│   ├── 8-discussion.tex           # Critical analysis and implications
│   ├── 9-conclusion.tex           # Conclusions and future directions
│   ├── 10-acknowledgment.tex      # Acknowledgments
│
└── README.md                      # This comprehensive overview
```

## 🚀 Quick Start Guide

### Prerequisites
- Python 3.8+
- Docker (optional, for containerized deployment)
- Kubernetes cluster (for production testing)
- API keys for LLM providers (Gemini, Qwen, Grok)

### 1. Dataset Generation
```bash
cd 1-dataset-generation/scripts
python generate_patterns.py --output-dir complete_dataset --days 35
```

### 2. Model Training
```bash
cd 2-ml-training/scripts
python train-models.py --models "xgboost,lightgbm,prophet"
```

### 3. LLM Pattern Recognition
```bash
cd 3-llm-pattern-recognition
cp config.yaml.example config.yaml
# Configure API keys
python scripts/enhanced_benchmark.py --llm all --method all
```

## 📈 Research Results and Validation

### Empirical Performance Metrics

| Component | Metric | Result |
|-----------|--------|--------|
| **Pattern-Specific vs Universal** | MAE Improvement | **37.4%** |
| **LLM Pattern Recognition** | Overall Accuracy | **96.7%** |
| **Model Training Time** | Range | **0.02-0.61s** |
| **Memory Usage** | Range | **42-210MB** |
| **Dataset Coverage** | Total Data Points | **2M+** |
| **Scenario Diversity** | Unique Scenarios | **600** |

### Pattern-Model Optimization Results

| Pattern Type | Optimal Model | Win Rate | MAE | Optimization Strategy |
|--------------|---------------|----------|-----|----------------------|
| **Growing** | VAR | 96% | 2.44 | BIC lag selection |
| **On/Off** | CatBoost | 62% | 0.87 | Ordered boosting |
| **Seasonal** | GBDT | 45% | 1.89 | Learning rate-depth optimization |
| **Burst** | GBDT | 42% | 2.13 | Histogram-based construction |
| **Chaotic** | GBDT | 38% | 2.45 | Advanced regularization |
| **Stepped** | GBDT | 35% | 1.97 | Depth optimization |
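
Once a workload's pattern has been classified, the table above reduces model recommendation to a lookup. The sketch below encodes the table's pattern-to-model mapping as a dictionary. The function name, the lowercase label format, and the choice to raise an error for unknown labels are assumptions for illustration, not the repository's API.

```python
# Pattern -> model mapping copied from the results table above.
RECOMMENDED_MODEL = {
    "growing": "VAR",
    "on/off": "CatBoost",
    "seasonal": "GBDT",
    "burst": "GBDT",
    "chaotic": "GBDT",
    "stepped": "GBDT",
}


def recommend_model(pattern_label):
    """Map a classified pattern label to the table's recommended model."""
    key = pattern_label.strip().lower()
    if key not in RECOMMENDED_MODEL:
        raise ValueError(f"unknown pattern label: {pattern_label!r}")
    return RECOMMENDED_MODEL[key]


print(recommend_model("Growing"))  # VAR
print(recommend_model("On/Off"))   # CatBoost
```

Note that the win rates in the table vary widely (96% for Growing down to 35% for Stepped), so for the lower-confidence patterns a lookup like this is a starting point rather than a guarantee that the recommended model wins on a given series.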

## 🔬 Scientific Methodology

### 1. Pattern Taxonomy Development
Six fundamental Kubernetes workload patterns with mathematical formulations:

- **Seasonal**: `P_t = B + ∑A_k sin(2πt/T_k + φ_k) + N_t`
- **Growing**: `P_t = B + G·f(t) + S·sin(2πh_t/24) + N_t`
- **Burst**: `P_t = B + ∑B_i·g(t-t_i,d_i)·1_{t_i≤t