
# Resilio

**High-Performance Load Testing Suite for Web Durability and Speed**

[![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](LICENSE)
[![Version](https://img.shields.io/badge/version-6.3.0-green.svg)](CHANGELOG.md)
[![SLT Engine](https://img.shields.io/badge/SLT-v2.3-blue.svg)](bin/slt.sh)
![CI](https://github.com/cakmoel/resilio/actions/workflows/ci.yml/badge.svg)

---

## Overview

Resilio is a professional-grade performance engineering toolkit designed for QA Engineers, Developers, and DevOps practitioners. It provides a structured, technology-agnostic methodology to measure the speed, endurance, and scalability of web applications and APIs.

By leveraging the reliability of ApacheBench and adding layers of statistical analysis, automated hypothesis testing, and research-based methodologies, Resilio transforms raw network data into high-fidelity performance intelligence.

### Why Resilio?

- **Research-Based Methodology**: Implements ISO 25010 standards and academic frameworks (Jain, 1991; Welch, 1947; Mann & Whitney, 1947)
- **Intelligent Statistical Testing**: Automatically selects between parametric (Welch's t-test) and non-parametric (Mann-Whitney U) methods based on the data's distribution
- **Technology-Agnostic**: Tests any web application via HTTP protocol (PHP, Node.js, Python, Go, Java, Ruby, .NET, Rust)
- **Automated Regression Detection**: Compare against baselines with statistical hypothesis testing
- **Hybrid Baseline Management**: Git-integrated for production, local-only for development
- **Comprehensive Metrics**: RPS, percentiles (P50/P95/P99), latency, stability (CV), and error rates

---

## 🆕 What's New in v6.3

### New Feature: Iteration Delay for Rate Limiting

v6.3 introduces a new configurable parameter for `slt.sh` to control the pacing of your load tests.

#### Key Benefits:
* **Controlled Test Pacing:** Prevent overwhelming target systems by introducing configurable pauses between test cycles.
* **Reduced System Load:** Space out test requests to simulate more realistic user behavior or to comply with system capacity limits.
* **Improved Stability:** Help maintain the stability of the system under test during prolonged load testing by giving it time to recover between iterations.

#### How to Use:
You can configure the delay by setting the `ITERATION_DELAY_SECONDS` environment variable before running `slt.sh`:

```bash
ITERATION_DELAY_SECONDS=5 ./bin/slt.sh
```

This will introduce a 5-second pause after all scenarios within a single iteration have completed, before the next iteration begins.

### Backward Compatibility

**✅ 100% compatible with v6.2 usage:**
- All v6.2 commands work identically
- Baseline format unchanged
- Report structure preserved
- CLI interface identical
- Only enhancement: Addition of iteration delay for SLT.

**Migration:** Simply use `v6.3` - no configuration changes needed!

---

## Core Engines

### Resilio SLT (Simple Load Testing) - `bin/slt.sh` v2.3 (Suite v6.3)

The **SLT engine** is optimized for agile development cycles and rapid feedback. Perfect for:

- Quick performance checks during development
- Smoke testing before deployments
- CI/CD pipeline integration
- Endpoint comparison and basic benchmarking

**Key Features:**
- Configurable iterations (default: 1000)
- Concurrent user simulation (default: 10)
- Percentile analysis (P50, P95, P99)
- Stability measurement (Coefficient of Variation)
- Error tracking without breaking calculations
- Comprehensive summary reports in Markdown

---

### Resilio DLT (Deep Load Testing) - `bin/dlt.sh` v6.3

The **DLT engine** is a research-grade powerhouse designed for rigorous statistical analysis. Perfect for:

- Production baseline establishment
- Statistical hypothesis testing with automatic test selection
- Regression detection with effect size analysis
- Capacity planning and SLA validation
- Performance trending over releases
- Tail latency analysis (P95/P99)

**Key Features:**

#### Statistical Testing (v6.2)
- **Python-powered backend** - Extremely fast calculations for any data volume.
- **Automatic test selection** - Chooses best method for your data.
- **Mann-Whitney U test** - Robust for non-normal distributions ($O(n \log n)$).
- **Welch's t-test** - Powerful for normal distributions.
- **Normality checking** - Skewness and kurtosis analysis.
- **Effect size calculation** - Cohen's d and rank-biserial correlation.
- **95% confidence intervals** - Statistical accuracy bounds.

#### Test Execution
- Three-phase execution (Warm-up → Ramp-up → Sustained)
- Realistic workload simulation (2-second think time)
- System resource monitoring (CPU, memory, disk I/O)
- Automated regression detection

#### Baseline Management
- Git-integrated baseline management
- Production vs development modes
- Metadata tracking with Git commits
- Automatic baseline comparison

---

## When to Use Each Engine

| Scenario | Use SLT | Use DLT |
|----------|---------|---------|
| Quick performance check | ✅ | ❌ |
| CI/CD integration | ✅ | ⚠️ (time-consuming) |
| Compare endpoints | ✅ | ❌ |
| Initial benchmarking | ✅ | ❌ |
| Production baseline | ❌ | ✅ |
| Statistical validation | ❌ | ✅ |
| **Tail latency testing (P95/P99)** | ❌ | ✅ **(v6.2 excels!)** |
| Regression detection | ❌ | ✅ |
| Capacity planning | ❌ | ✅ |
| SLA validation | ❌ | ✅ |
| Memory leak detection | ❌ | ✅ |

---

## Technology Compatibility

Resilio works with **any web technology** because it tests via HTTP protocol:

| Technology | Framework Examples | Status |
|------------|-------------------|--------|
| **PHP** | Laravel, Symfony, WordPress, Slim | ✅ Fully Supported |
| **JavaScript** | Node.js, Express, Next.js, Nest.js | ✅ Fully Supported |
| **Python** | Django, Flask, FastAPI, Pyramid | ✅ Fully Supported |
| **Go** | Gin, Echo, Fiber, Chi | ✅ Fully Supported |
| **Ruby** | Rails, Sinatra, Hanami | ✅ Fully Supported |
| **Java** | Spring Boot, Micronaut, Quarkus | ✅ Fully Supported |
| **.NET** | ASP.NET Core, Nancy | ✅ Fully Supported |
| **Rust** | Actix-web, Rocket, Axum | ✅ Fully Supported |

**Why it works:** Resilio operates at the HTTP protocol layer, measuring request/response cycles exactly as end-users experience them, regardless of backend implementation.

---

## Quick Start

### Prerequisites

- **Python 3.10+** (Mandatory for DLT math engine)
- **ApacheBench (ab)** (Standard `apache2-utils`)
- **Bash 4.4+**
- **bc** (Arbitrary precision calculator)
- **GNU Coreutils** (`awk`, `grep`, `sed`, `sort`, `uniq`)
- **Git** (For baseline version control)
- **curl** (For system metric validation)
- **iostat** (Part of `sysstat`, for system monitoring)

**Installation:**

```bash
# Ubuntu/Debian
sudo apt-get update
sudo apt-get install apache2-utils bc gawk grep coreutils sysstat

# CentOS/RHEL/Fedora
sudo yum install httpd-tools bc gawk grep coreutils sysstat

# macOS
brew install httpd   # provides ab; macOS also ships ab by default
# bc, awk, grep are pre-installed
```

**Verify Installation:**

```bash
ab -V && bc --version && awk --version && grep --version
```

### Installation

```bash
# 1. Clone or download the repository
git clone https://github.com/cakmoel/resilio.git
cd resilio

# 2. Make scripts executable
chmod +x bin/slt.sh bin/dlt.sh

# 3. Configure test scenarios (edit the SCENARIOS section)
nano bin/dlt.sh # or bin/slt.sh
```

### Basic Usage

**Simple Load Testing (SLT):**

```bash
# Default: 1000 iterations, 100 requests/test, 10 concurrent users
./bin/slt.sh

# Custom parameters
ITERATIONS=500 AB_REQUESTS=50 AB_CONCURRENCY=5 ./bin/slt.sh

# With iteration delay
ITERATION_DELAY_SECONDS=5 ITERATIONS=100 AB_REQUESTS=10 AB_CONCURRENCY=2 ./bin/slt.sh
```

**Deep Load Testing (DLT):**

```bash
# Research-based three-phase test with automatic statistical test selection
./bin/dlt.sh

# Results include hypothesis testing against baseline
cat load_test_reports_*/hypothesis_testing_*.md
```

---

## Performance Methodology

Resilio is not a basic wrapper for ApacheBench; it's a framework implementing rigorous statistical controls to ensure performance data is actionable and scientifically sound.

### 1. Tail Latency Analysis (P95/P99)

Average response times mask the "long tail" of user dissatisfaction. Resilio focuses on **P95 and P99 latencies** to identify worst-case scenarios caused by:
- Resource contention
- Garbage collection pauses
- Network jitter
- Database query variance

**Since v6.1:** The Mann-Whitney U test is specifically designed for tail latency metrics, providing more accurate detection of regressions in P95/P99 values.
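The tail effect is easy to reproduce. Below is a minimal, stdlib-only sketch of nearest-rank percentiles (an illustration only, not Resilio's actual implementation, which parses ApacheBench output):

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: smallest value covering p% of samples."""
    ordered = sorted(samples)
    rank = math.ceil(p / 100 * len(ordered))  # 1-based rank
    return ordered[max(rank, 1) - 1]

# Ten latency samples (ms) with one slow outlier dominating the tail
latencies = [12, 13, 11, 14, 12, 13, 15, 12, 14, 250]
p50, p95, p99 = (percentile(latencies, p) for p in (50, 95, 99))
print(p50, p95, p99)  # 13 250 250 -- the mean (36.6 ms) hides the 250 ms tail
```

One slow request out of ten leaves P50 untouched but drives P95/P99 to the outlier, which is exactly why Resilio reports percentiles instead of averages alone.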

### 2. Stability Measurement (Coefficient of Variation)

The **CV metric** reveals system consistency:
- **CV < 10%**: Excellent stability
- **CV < 20%**: Good stability
- **CV < 30%**: Moderate stability
- **CV ≥ 30%**: Poor stability (investigate)

A low average RPS is acceptable if CV is low (consistency), but high RPS with high CV indicates instability.
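The thresholds above can be sketched in a few lines of Python (helper names are illustrative; Resilio computes CV inside its own engine):

```python
import statistics

def coefficient_of_variation(samples):
    """CV (%) = sample standard deviation / mean * 100."""
    return statistics.stdev(samples) / statistics.mean(samples) * 100

def stability_grade(cv):
    # Thresholds mirror the table above
    if cv < 10:
        return "excellent"
    if cv < 20:
        return "good"
    if cv < 30:
        return "moderate"
    return "poor"

steady = [98, 101, 99, 102, 100]   # mean ~100 RPS, tight spread
jittery = [60, 140, 95, 180, 25]   # mean ~100 RPS, wild spread

print(stability_grade(coefficient_of_variation(steady)))   # excellent
print(stability_grade(coefficient_of_variation(jittery)))  # poor
```

Both series average roughly 100 RPS, yet only the first is a system you can plan capacity around.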

### 3. Three-Phase Execution (DLT Only)

Adheres to the **USE Method** (Utilization, Saturation, Errors):

1. **Warm-up Phase** (50 iterations): Primes JIT compilers, connection pools, and caches
2. **Ramp-up Phase** (100 iterations): Gradually increases load to observe the "Knee of the Curve"
3. **Sustained Load** (850 iterations): Collects primary dataset for statistical analysis

### 4. Statistical Hypothesis Testing (DLT Only)

**Since v6.1:** Automatic test selection between two methods:

#### Welch's t-test (Parametric)
**Used when:** Data is approximately normal (|skewness| < 1.0 AND |kurtosis| < 2.0)

**Best for:**
- Mean RPS (requests per second)
- Average response time
- Throughput metrics

**Advantages:** More statistical power (better at detecting true differences)

#### Mann-Whitney U Test (Non-Parametric)
**Used when:** Data is non-normal (|skewness| ≥ 1.0 OR |kurtosis| ≥ 2.0)

**Best for:**
- P95/P99 latencies (long tails)
- Error rates (heavily skewed)
- Cache hit rates (bimodal)

**Advantages:** Robust to outliers, no distribution assumptions
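The selection rule can be illustrated with a stdlib-only sketch (function names are hypothetical; Resilio's Python backend implements its own moment-based normality check):

```python
import statistics

def skewness(xs):
    """Third standardized moment of the sample."""
    m, s = statistics.mean(xs), statistics.pstdev(xs)
    return sum((x - m) ** 3 for x in xs) / (len(xs) * s ** 3)

def excess_kurtosis(xs):
    """Fourth standardized moment minus 3 (0 for a normal distribution)."""
    m, s = statistics.mean(xs), statistics.pstdev(xs)
    return sum((x - m) ** 4 for x in xs) / (len(xs) * s ** 4) - 3

def choose_test(baseline, candidate):
    """Apply the selection rule above to both samples."""
    for xs in (baseline, candidate):
        if abs(skewness(xs)) >= 1.0 or abs(excess_kurtosis(xs)) >= 2.0:
            return "mann-whitney-u"  # non-normal: fall back to ranks
    return "welch-t"                 # both roughly normal

symmetric = [10, 11, 12, 13, 14, 15, 16, 17, 18, 19]
long_tail = [10, 10, 11, 11, 12, 12, 13, 13, 14, 90]
print(choose_test(symmetric, symmetric))  # welch-t
print(choose_test(symmetric, long_tail))  # mann-whitney-u
```

A single tail outlier pushes the candidate's skewness well past 1.0, so the rank-based test is chosen automatically.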

#### Hypothesis Testing Framework

- **Null Hypothesis (H₀)**: No significant difference exists
- **Alternative Hypothesis (H₁)**: Significant difference detected
- **Significance Level**: α = 0.05 (95% confidence)

**Effect Size:**
- **Cohen's d** (for Welch's t-test): Standardized mean difference
- **Rank-biserial r** (for Mann-Whitney U): Analogous to Cohen's d

**Interpretation (both metrics):**
- < 0.2: Negligible
- 0.2 - 0.5: Small
- 0.5 - 0.8: Medium
- \> 0.8: Large

This ensures decisions are based on **both statistical significance and practical importance**.
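As an illustration, here is a pooled-standard-deviation Cohen's d with the interpretation bands above (a sketch, not Resilio's exact formula):

```python
import statistics

def cohens_d(a, b):
    """Standardized mean difference using the pooled standard deviation."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * statistics.variance(a) +
                  (nb - 1) * statistics.variance(b)) / (na + nb - 2)
    return (statistics.mean(a) - statistics.mean(b)) / pooled_var ** 0.5

def magnitude(d):
    # Interpretation bands from the list above
    d = abs(d)
    if d < 0.2:
        return "negligible"
    if d < 0.5:
        return "small"
    if d < 0.8:
        return "medium"
    return "large"

baseline_rps = [100, 102, 98, 101, 99, 103, 97, 100]
candidate_rps = [96, 98, 94, 97, 95, 99, 93, 96]
d = cohens_d(baseline_rps, candidate_rps)
print(round(d, 2), magnitude(d))  # 2.0 large
```

A 4-RPS drop against a 2-RPS pooled spread yields d = 2.0: statistically and practically large, so this regression would matter even with a borderline p-value.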

### 5. 95% Confidence Intervals

All Mean RPS values include confidence intervals, ensuring results represent true system capacity rather than lucky runs.
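A normal-approximation sketch of such an interval (Resilio's engine may use a t-distribution rather than the fixed z = 1.96 assumed here):

```python
import statistics

def mean_ci_95(samples):
    """Normal-approximation 95% CI: mean +/- 1.96 * standard error."""
    mean = statistics.mean(samples)
    stderr = statistics.stdev(samples) / len(samples) ** 0.5
    half = 1.96 * stderr
    return mean - half, mean + half

rps = [100, 102, 98, 101, 99, 103, 97, 100]
low, high = mean_ci_95(rps)
print(f"Mean RPS: {statistics.mean(rps):.1f} (95% CI: {low:.1f}-{high:.1f})")
```

If a candidate's mean falls outside the baseline's interval, the difference is unlikely to be a lucky run.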

---

## Understanding Results

### SLT Output Structure

```
load_test_results_YYYYMMDD_HHMMSS/
├── summary_report.md      # Main performance report
├── console_output.log     # Real-time test output
├── execution.log          # Detailed execution log
├── error.log              # Error tracking
└── raw_*.txt              # Raw ApacheBench outputs
```

**Key Metrics:**
- **Average RPS**: Mean throughput
- **Median RPS**: Less affected by outliers
- **Standard Deviation**: Consistency indicator
- **P50/P95/P99**: Percentile response times
- **CV (Coefficient of Variation)**: Stability score
- **Success/Error Rate**: Reliability metrics

---

### DLT Output Structure

```
load_test_reports_YYYYMMDD_HHMMSS/
├── research_report_*.md      # Comprehensive analysis
├── hypothesis_testing_*.md   # Statistical comparison (enhanced in v6.1)
├── system_metrics.csv        # CPU, memory, disk I/O
├── error_log.txt             # Error tracking
├── execution.log             # Phase-by-phase log
├── raw_data/                 # All ApacheBench outputs
└── charts/                   # Reserved for visualizations
```

**Key Metrics:**
- **Mean with 95% CI**: Statistical accuracy bounds
- **Statistical Test Used**: Shows which test was automatically selected (v6.2)
- **Test Statistic**: t-value (Welch's) or U-value (Mann-Whitney)
- **p-value**: Statistical significance
- **Effect Size**: Cohen's d or rank-biserial r
- **Verdict**: Regression/Improvement/No Change
- **Distribution Characteristics**: Skewness and kurtosis (v6.2)

---

### Example: Enhanced v6.2 Report

```markdown
### API_Endpoint

**Test Used**: Mann-Whitney U test (non-parametric)
**Reason**: Non-normal distribution detected

| Metric | Value | Interpretation |
|--------|-------|----------------|
| **Test Statistic** | 1247 | U-value |
| **p-value** | 0.032 | Statistically significant ★ |
| **Effect Size** | -0.34 | Rank-biserial r |
| **Effect Magnitude** | small | - |
| **Verdict** | ⚠️ SIGNIFICANT REGRESSION | - |

#### Distribution Characteristics

- **Baseline**: non_normal|skew=2.34|kurt=8.91
- **Candidate**: non_normal|skew=1.87|kurt=6.23

Mann-Whitney U test was used because at least one sample showed
non-normal distribution. This test is more robust to outliers and
skewed data, making it ideal for tail latency metrics (P95/P99).

- **Strong evidence** against H₀ (95% confidence)
- Effect size is **small** (Rank-biserial r = -0.34)
- **Practical significance**: Change is statistically detectable but may not be practically important
```

---

## Configuration

### Configuring Test Scenarios

Both scripts use a `SCENARIOS` associative array:

```bash
# Edit bin/slt.sh or bin/dlt.sh
declare -A SCENARIOS=(
  ["Homepage"]="http://localhost:8000/"
  ["API_Users"]="http://localhost:8000/api/users"
  ["Product_Page"]="http://localhost:8000/products/123"
)
```

### Environment Variables (SLT)

```bash
ITERATIONS=1000 # Number of test iterations
AB_REQUESTS=100 # Requests per test
AB_CONCURRENCY=10 # Concurrent users
AB_TIMEOUT=30 # Timeout in seconds
```

**Example:**

```bash
ITERATIONS=500 AB_CONCURRENCY=20 ./bin/slt.sh
```

### Environment Configuration (DLT)

**Production Mode** (Git-tracked baselines):

```bash
# Create .env file
echo "APP_ENV=production" > .env

# Configure URLs
echo 'STATIC_PAGE=https://prod.example.com/' >> .env
echo 'DYNAMIC_PAGE=https://prod.example.com/api/users' >> .env

./bin/dlt.sh
```

Baselines saved to: `./baselines/` (Git-tracked)

**Local Development Mode** (local-only baselines):

```bash
echo "APP_ENV=local" > .env
./bin/dlt.sh
```

Baselines saved to: `./.dlt_local/` (not Git-tracked)

---

## CI/CD Integration

### GitHub Actions Example

```yaml
name: Performance Regression Check

on:
  pull_request:
    branches: [main]

jobs:
  load-test:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v3
        with:
          fetch-depth: 0  # Need baselines from history

      - name: Install Dependencies
        run: |
          sudo apt-get update
          sudo apt-get install -y apache2-utils bc sysstat

      - name: Run Load Test (v6.3 with automatic test selection)
        run: |
          chmod +x bin/dlt.sh
          ./bin/dlt.sh

      - name: Check for Regressions
        run: |
          REPORT=$(cat load_test_reports_*/hypothesis_testing_*.md)

          # Check for significant regressions
          if echo "$REPORT" | grep -q "SIGNIFICANT REGRESSION"; then
            echo "⚠️ Performance regression detected!"
            echo "$REPORT"
            exit 1
          fi

          # v6.2: Also check which test was used
          echo "Statistical Test Summary:"
          echo "$REPORT" | grep "Test Used:"

      - name: Upload Reports
        if: always()
        uses: actions/upload-artifact@v3
        with:
          name: performance-reports
          path: load_test_reports_*/**
```

---

## Best Practices

### Before Testing

1. **Never test production** without authorization
2. **Warm up your application** before recording metrics
3. **Check resource limits**: `ulimit -n 10000`
4. **Disable rate limiting** temporarily during tests
5. **Monitor application logs** during test execution

### Interpreting Results (Updated for v6.2)

1. **Focus on percentiles**: P95/P99 matter more than averages
2. **Check CV first**: High CV = unstable system
3. **Compare against baselines**: Use DLT for trend analysis
4. **Consider both p-value AND effect size**: Statistical significance ≠ practical importance
5. **Review test selection** (v6.1): Check if Mann-Whitney U was used for tail latencies
6. **Inspect distribution characteristics** (v6.1): High skewness/kurtosis indicates need for non-parametric tests
7. **Document test conditions**: Note system state, data volume, background jobs

### When to Trust Mann-Whitney U Results (v6.2)

Mann-Whitney U test is **more reliable** than Welch's t-test when:
- Testing P95/P99 latencies (almost always non-normal)
- Data contains outliers (e.g., occasional 5-second response times)
- Comparing error rates (many zeros, few spikes)
- Comparing cache performance (bimodal distribution: hit vs. miss)

**Check your report:** Look for `"Test Used: Mann-Whitney U test"` in the hypothesis testing report.

### Production Baseline Management

```bash
# 1. Establish baseline during stable period
echo "APP_ENV=production" > .env
./bin/dlt.sh

# 2. Commit baselines to Git
git add baselines/
git commit -m "chore: establish performance baseline for release v2.0"
git push

# 3. Future tests automatically compare against this baseline
./bin/dlt.sh
# v6.3 automatically selects best statistical test!

# 4. Check results
cat load_test_reports_*/hypothesis_testing_*.md
```

---

## Troubleshooting

### Common Issues

**1. "bc incompatible with current locale"**

```bash
# Solution A: Use C locale
LC_NUMERIC=C ./bin/dlt.sh

# Solution B: Install en_US.UTF-8
sudo locale-gen en_US.UTF-8
```

**2. Connection Refused**

```bash
# Verify application is running
curl http://localhost:8000/

# Check firewall
sudo ufw status
```

**3. Timeout Errors**

```bash
# Increase timeout or reduce concurrency
AB_TIMEOUT=60 AB_CONCURRENCY=5 ./bin/slt.sh
```

**4. Too Many Open Files**

```bash
# Increase file descriptor limit
ulimit -n 10000
```

**5. Unexpected Test Selection (v6.1)**

```bash
# If Mann-Whitney U is used when you expect Welch's t-test:
# Check the distribution characteristics in the report

# Example:
# Distribution: non_normal|skew=2.34|kurt=8.91
# ^^^^^^^^^^
# High skewness (2.34 > 1.0) triggered Mann-Whitney U

# This is CORRECT behavior - your data is skewed!
```

---

## Upgrading from v6.2 to v6.3

### Migration Guide

- **Zero-Risk Upgrade - 100% Backward Compatible**

```bash
# 1. Backup v6.2 (optional - recommended)
cp bin/slt.sh bin/slt_v6.2_backup.sh

# 2. Replace with v6.3
# Download new slt.sh from repository
chmod +x bin/slt.sh

# 3. Test (works identically to v6.2)
./bin/slt.sh

# 4. Try new iteration delay feature
ITERATION_DELAY_SECONDS=5 ./bin/slt.sh
```

### What Changed

**Same (100% compatible):**
- All CLI commands for both SLT and DLT
- Baseline file format
- Environment variables
- Report locations
- All v6.2 functionality

**Enhanced (SLT only):**
- Iteration delay support for rate limiting
- Configurable pacing between test cycles
- Better control for system under test stability
- Improved simulation of realistic user behavior

**No configuration changes needed!**

---

## Documentation

- **[USAGE_GUIDE.md](docs/USAGE_GUIDE.md)** - Comprehensive usage guide with real-world scenarios
- **[REFERENCES.md](docs/REFERENCES.md)** - Academic and research references (updated for v6.2)
- **[CHANGELOG.md](CHANGELOG.md)** - Version history and release notes
- **[Performance Methodology](docs/methodology.md)** - Mathematical formulas and ISO 25010 compliance

---

## Research Foundations

Resilio v6.3 implements methodologies from:

### Original Foundations (v6.0 & v6.1)
- **Jain, R. (1991)** - Statistical methods for performance measurement
- **Welch, B. L. (1947)** - Unequal variance t-test
- **Cohen, J. (1988)** - Effect size interpretation
- **ISO/IEC 25010:2011** - Performance efficiency metrics
- **Barford & Crovella (1998)** - Workload characterization
- **Gunther, N. J. (2007)** - Queueing theory and capacity planning
- **Mann, H. B., & Whitney, D. R. (1947)** - Non-parametric rank-based comparison
- **Wilcoxon, F. (1945)** - Rank-sum test theoretical foundation
- **D'Agostino, R. B. (1971)** - Normality testing via skewness and kurtosis
- **Kerby, D. S. (2014)** - Rank-biserial correlation for effect size

### New in v6.2
- **Ruxton, G. D. (2006)** - The unequal variance t-test is an underused substitution for Student's t-test and the Mann-Whitney U test.

---

## Version Comparison

| Feature | v6.0 | v6.1 | v6.2 | v6.3 |
|---------|------|------|------|------|
| Welch's t-test | ✅ | ✅ | ✅ | ✅ |
| Mann-Whitney U | ❌ | ✅ | ✅ | ✅ |
| Automatic test selection | ❌ | ✅ | ✅ | ✅ |
| Normality checking | ❌ | ✅ | ✅ | ✅ |
| Cohen's d | ✅ | ✅ | ✅ | ✅ |
| Rank-biserial r | ❌ | ✅ | ✅ | ✅ |
| Baseline management | ✅ | ✅ | ✅ | ✅ |
| Smart locale detection | ✅ | ✅ | ✅ | ✅ |
| Python Math Engine (40x) | ❌ | ❌ | ✅ | ✅ |
| Iteration Delay (Rate Limiting) | ❌ | ❌ | ❌ | ✅ |
| Best for tail latencies | ⚠️ | ✅ | ✅ | ✅ |
| Handles outliers | ⚠️ | ✅ | ✅ | ✅ |

---

## Contributing

Contributions are welcome! Please:

1. Fork the repository
2. Create a feature branch
3. Include tests for new functionality
4. Update documentation (including REFERENCES.md for new methods)
5. Submit a pull request

### Areas for Contribution

- Multiple comparison correction (Bonferroni/Holm)
- Sequential Probability Ratio Test (SPRT) for early stopping
- Bayesian A/B testing as an alternative approach
- Visualization dashboards for trends
- Integration with monitoring tools (Prometheus, Grafana)

---

## License

This project is licensed under the MIT License.

Copyright © 2025 M.Noermoehammad

---

## Support

- **Issues**: [GitHub Issues](https://github.com/cakmoel/resilio/issues)
- **Discussions**: [GitHub Discussions](https://github.com/cakmoel/resilio/discussions)
- **Email**: alanmoehammad@gmail.com

---

## Citation

If you use Resilio in academic research, please cite:

```bibtex
@software{resilio2026,
author = {Noermoehammad, M.},
title = {Resilio: Research-Based Performance Testing Suite},
year = {2026},
version = {6.3.0},
url = {https://github.com/cakmoel/resilio}
}
```

---

**Resilio v6.3: Built for Speed, Tested for Durability, Proven by Science**

*Now with iteration delay control for realistic load testing.*