{"id":50299414,"url":"https://github.com/kernelci/pullab_cloud","last_synced_at":"2026-05-28T11:30:44.763Z","repository":{"id":358040793,"uuid":"1224408980","full_name":"kernelci/pullab_cloud","owner":"kernelci","description":null,"archived":false,"fork":false,"pushed_at":"2026-05-22T22:15:55.000Z","size":451,"stargazers_count":0,"open_issues_count":3,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-22T23:36:43.155Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/kernelci.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-04-29T08:52:47.000Z","updated_at":"2026-05-22T22:16:00.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/kernelci/pullab_cloud","commit_stats":null,"previous_names":["kernelci/pullab_cloud"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/kernelci/pullab_cloud","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kernelci%2Fpullab_cloud","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kernelci%2Fpullab_cloud/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kernelci%2Fpullab_cloud/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kernelci%2Fpullab_cloud/manifests","owner_url":"https://repos.ecosyste.ms/
api/v1/hosts/GitHub/owners/kernelci","download_url":"https://codeload.github.com/kernelci/pullab_cloud/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kernelci%2Fpullab_cloud/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33607334,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-05-28T02:00:06.440Z","response_time":99,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-05-28T11:30:44.080Z","updated_at":"2026-05-28T11:30:44.746Z","avatar_url":"https://github.com/kernelci.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Kernel CI Cloud Labs\n\n## Overview\n\nKernel CI Cloud Labs is an automated testing framework designed to validate Linux kernel builds across cloud infrastructure. 
The system orchestrates parallel kernel testing by providing scripts to run tests from a Fargate container on EC2 VMs:\n\n- **Spawning multiple EC2 instances** to run different test suites simultaneously\n- **Managing test execution** through containerized ECS Fargate tasks that coordinate VM operations\n- **Supporting diverse test types** including kernel installation, reboots, performance benchmarks (UnixBench), and comprehensive test suites (LTP, kselftest)\n- **Collecting and storing results** in S3 with detailed logs for each VM and test run\n\n**Future Goal:** This pipeline is designed to integrate with the existing KernelCI testing architecture, receiving test triggers from KernelCI and pushing results back to the KernelCI Database (KCIDB) for centralized reporting and analysis.\n\nThis package provides the kernel-ci-cloud-runner python application as an entry point to configure and run kernel testing in AWS EC2 VMs.\n\n## Installation\n\n**Python 3.11 is required.** Older interpreters (including the `python3` shipped\nby Amazon Linux 2023, which is 3.9) are not supported — `pip install -e .` will\nrefuse to run on them. On AL2023, install `python3.11 python3.11-pip\npython3.11-devel`; see [AWS_INSTALL.md](AWS_INSTALL.md) for the full host setup.\n\nRun the package in a virtual environment. 
We also provide the script \"tests/test-in-venv.sh\" to wrap these steps in a virtual environment.\n\nIf you do not have the code already, get your copy (git URL to be defined).\n```bash\n# Clone the repository\ngit clone \u003crepository-url\u003e\ncd kernel-ci-cloud-labs\n```\n\nSet up the virtual environment:\n\n```bash\n# Create virtual environment (Python 3.11 required)\npython3.11 -m venv .venv\nsource .venv/bin/activate  # On Windows: .venv\\Scripts\\activate\n\n# Install package (runtime only - boto3)\npython3.11 -m pip install -e .\n\n# Optional: Install with dev dependencies (pytest, black, pylint, pre-commit, pytest-cov)\npython3.11 -m pip install -e \".[dev]\"\n\n# Optional: Install with analysis dependencies (pandas, matplotlib, seaborn)\npython3.11 -m pip install -e \".[analysis]\"\n\n# Recommended: Install everything (dev + analysis)\npython3.11 -m pip install -e \".[dev,analysis]\"\n\n# Optional: Install pre-commit hooks (only if you plan to commit code)\npre-commit install\n```\n\n## Quick Start\n\nThe steps below guide you through setting up the project and running the integration test.\n\n### 1. Configure AWS Credentials\n\nUse one of the following methods:\n\n- **AWS default credentials** (recommended): Configure via `aws configure`, IAM role, or environment variables. If you already have working AWS CLI access, skip to Step 2.\n- **Explicit credentials file**: Create `examples/aws/credentials.json`:\n  ```json\n  {\n    \"access_key_id\": \"YOUR_ACCESS_KEY_ID\",\n    \"secret_access_key\": \"YOUR_SECRET_ACCESS_KEY\"\n  }\n  ```\n\n### 2. Configure the project\n\nCustomize resource names with your own prefix and region, using the default configuration file (used for integration testing):\n\n```bash\nkernel-ci-cloud-runner aws setup configure --prefix kernel-ci-$USER- --region us-west-2\n```\n\nThis sets unique names for all AWS resources and avoids conflicts with other users. Run with `--dry-run` to preview changes. 
Use `--test-filter` to limit which tests are included (e.g., `--test-filter unixbench`).\n\nWith `--prefix kernel-ci-$USER-`, the following resources will be created:\n- S3: `kernel-ci-$USER-results-\u003cACCOUNT_ID\u003e` (test results), `kernel-ci-$USER-storage` (kernel RPMs)\n- IAM: `kernel-ci-$USER-ecs-role`\n- ECS: cluster `kernel-ci-$USER-cluster`, task `kernel-ci-$USER-task`\n- ECR: `kernel-ci-$USER-ecr`\n- CloudWatch: `/ecs/kernel-ci-$USER-task`, `/ec2/kernel-ci-$USER-vms`\n\n\nTo write the configuration for a given setup to a separate file, pass `--output`:\n\n```bash\nkernel-ci-cloud-runner aws setup configure --prefix kernel-ci-$USER- --region us-west-2 --output my-config.json\n```\n\n### Validate setup (optional)\n\nBefore launching real VMs, run a pre-flight check of AWS permissions, IAM resources, the results bucket, and KernelCI/KCIDB tokens. The command is read-only by default — pass `--fix` to create the S3 bucket if it doesn't exist yet.\n\n```bash\nkernel-ci-cloud-runner aws setup validate \\\n  --bucket kernel-ci-$USER-results \\\n  --role kernel-ci-$USER-vm-role \\\n  --region us-west-2\n```\n\nWhat it checks:\n\n| Check | What it does |\n| --- | --- |\n| `aws_credentials` | `sts:GetCallerIdentity` — prints account + principal ARN |\n| `ec2_describe` | confirms `ec2:DescribeInstances` works |\n| `ec2_console_output` | probes `ec2:GetConsoleOutput` (needed to capture kernel boot logs) |\n| `ssm` | `ssm:DescribeInstanceInformation` — needed to drive the test client |\n| `iam_role` / `instance_profile` | only when `--role` is given — verifies trust policy and attached managed policies |\n| `s3_bucket` | `head_bucket`; with `--fix`, creates the bucket (region-aware) and enables Block Public Access |\n| `kernelci_api_token` | `GET \u003capi_base_uri\u003e/whoami` with `Bearer` from `KERNELCI_API_TOKEN` or `UNIFIED_TOKEN` |\n| `kcidb_jwt` | decodes the JWT payload (no signature verification) and reports `exp`, `iss`, `sub`; 
sources the token from `KCIDB_JWT`, `KCIDB_REST=https://\u003cjwt\u003e@host/path`, or `UNIFIED_TOKEN` |\n\nExits non-zero if any check fails. Useful when iterating on IAM policies, rotating tokens, or onboarding a new AWS account.\n\n### 3. Run integration test to verify setup\n\nThe integration test uses only `basic-test` and `example-reboot-test` — no kernel RPMs needed. This is the fastest way to verify everything works. The test will fail if you do not provide your configuration.\n\n```bash\npytest tests/integration/ -v -m integration\n```\n\n- Completes in ~2-5 minutes, spawns 2 EC2 VMs (1 x86_64 + 1 ARM64)\n- Check status by pipeline log message: \"VMs: 2/2 spawned, 2 successful, 0 failed, 0 missing\"\n- **Logs:** `tests/integration/logs/`\n\n### 4. Upload kernel RPMs (required for kernel-install tests)\n\nTests that install custom kernels (`example-kernel-reboot-test`, `unixbench-kernel-regression`) need kernel RPMs in an S3 bucket. Tests like `basic-test`, `example-reboot-test`, and `simple-unixbench` do **not** need this step.\n\n```bash\n# Upload local RPMs to the external storage bucket\nkernel-ci-cloud-runner aws setup upload-rpms \\\n  --bucket kernel-ci-$USER-storage \\\n  --local-rpms /path/to/rpms/x86_64/ /path/to/rpms/aarch64/\n```\n\nRPMs are classified by filename suffix (`.x86_64.rpm`, `.aarch64.rpm`, `.src.rpm`) and uploaded to:\n```\ns3://bucket-name/kernel-rpms/\n├── src/              # *.src.rpm\n└── binary/\n    ├── x86_64/       # *.x86_64.rpm\n    └── aarch64/      # *.aarch64.rpm\n```\n\nThe bucket name must match `external_storage.bucket` in `examples/aws/config.json` (set automatically by `setup configure`). If this is your first run or you changed IAM policies, set `\"force_recreate_roles\": true` in `config.json` to apply the updated policies.\n\n### 5. 
Run the Pipeline\n\nUsing the default configuration:\n\n```bash\nkernel-ci-cloud-runner aws run\n```\n\nUsing an explicit configuration file (recommended):\n```bash\n# Run with a custom config file\nkernel-ci-cloud-runner aws run --config my-config.json\n```\n\n- Check the status in the pipeline log message: \"VMs: X/X spawned, Y successful, 0 failed, 0 missing\"\n- **Logs:** `logs/`\n\n### 6. Check results\n\nOpen AWS console → S3 → bucket `kernel-ci-$USER-results`, or use the CLI:\n\n```bash\n# List all test runs\naws s3 ls s3://kernel-ci-$USER-results/ --region us-west-2\n\n# List output files for a specific run and test\naws s3 ls s3://kernel-ci-$USER-results/run_\u003cTEST_ID\u003e_\u003cDATETIME\u003e/test_\u003cTEST_NAME\u003e/output/ \\\n  --recursive --region us-west-2\n\n# Download a test result\naws s3 cp s3://kernel-ci-$USER-results/run_\u003cTEST_ID\u003e_\u003cDATETIME\u003e/test_\u003cTEST_NAME\u003e/output/\u003cINSTANCE_ID\u003e/result.txt - \\\n  --region us-west-2\n```\n\n### 7. 
Clean up resources\n\nList all AWS resources created by the pipeline:\n```bash\nkernel-ci-cloud-runner aws setup cleanup --prefix kernel-ci-$USER- --region us-west-2\n```\n\nDelete the resources related to the configured infrastructure:\n```bash\nkernel-ci-cloud-runner aws setup cleanup --prefix kernel-ci-$USER- --region us-west-2 --delete\n```\n\nThis finds and removes: EC2 instances, ECS clusters/tasks/task definitions, IAM roles, ECR repositories, CloudWatch log groups, and S3 buckets matching the prefix.\n\n## Example: Upstream Kernel Performance Regression Test\n\nThis walkthrough builds two upstream Linux kernel versions as RPMs on an x86_64 machine, uploads them, and runs a performance regression test.\nThe example uses `v6.1.141` (base) and `v6.1.150` (tip) from the stable kernel tree.\n\n### Prerequisites\n\n- An x86_64 Linux machine with at least 16 GB RAM and 20 GB free disk space\n- Build tools: `gcc`, `make`, `flex`, `bison`, `elfutils-libelf-devel`, `openssl-devel`, `rpm-build`\n\nInstall the build dependencies, e.g. on AL2023 or Fedora:\n```bash\nsudo dnf install -y gcc make flex bison elfutils-libelf-devel openssl-devel \\\n  rpm-build perl-devel bc\n```\n\n### Project Setup\n\nAll `kernel-ci-cloud-runner` commands below require the package to be installed in a virtual environment.\nIf not done already, set this up once:\n\n```bash\ncd kernel-ci-cloud-labs\npython3.11 -m venv .venv\nsource .venv/bin/activate\npython3.11 -m pip install -e .\n```\n\nActivate the virtual environment. 
After this, `kernel-ci-cloud-runner` is available in your shell.\nIf you open a new terminal, re-activate the venv first:\n\n```bash\nsource .venv/bin/activate\n```\n\n### Build Kernel RPMs\n\nThe steps below build two versions of the 6.1 kernel, configured to run in AWS EC2 VMs.\n\nCreate a directory for the RPMs:\n```bash\nmkdir -p kernel-rpms\n```\n\n\nClone the stable kernel tree (one-time, ~3 GB) and select the 6.1 branch:\n```bash\ngit clone --branch linux-6.1.y \\\n  https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git linux-stable\n```\n\nBuild the base kernel (v6.1.141):\n```bash\ncd linux-stable\ngit checkout v6.1.141\ngit clean -xdf\nmake olddefconfig\n# Enable AWS networking support (required for EC2 instances)\n./scripts/config --enable CONFIG_NET_VENDOR_AMAZON --enable CONFIG_ENA_ETHERNET\n# Build kernel with multiple processes\nmake binrpm-pkg -j$(nproc) 2\u003e\u00261 | tee build-6.1.141.log\n\n# Collect the built RPM\nfind ~/rpmbuild/RPMS/x86_64/ -name \"kernel-*.rpm\" ! -name \"*headers*\" \\\n  -exec cp {} ../kernel-rpms/ \\;\n\ncd ..\n\n# Show current RPMs\nls kernel-rpms\n```\n\n\nBuild the tip kernel (v6.1.150):\n```bash\ncd linux-stable\ngit checkout v6.1.150\ngit clean -xdf\nmake olddefconfig\n# Enable AWS networking support (required for EC2 instances)\n./scripts/config --enable CONFIG_NET_VENDOR_AMAZON --enable CONFIG_ENA_ETHERNET\n# Build kernel with multiple processes\nmake binrpm-pkg -j$(nproc) 2\u003e\u00261 | tee build-6.1.150.log\n\n# Collect the built RPM\nfind ~/rpmbuild/RPMS/x86_64/ -name \"kernel-*.rpm\" ! -name \"*headers*\" \\\n  -exec cp {} ../kernel-rpms/ \\;\ncd ..\n```\n\nCheck the resulting RPM files:\n```bash\nls kernel-rpms/\n# Expected: kernel-6.1.141-1.x86_64.rpm  kernel-6.1.150-1.x86_64.rpm\n```\n\n### Upload and Run\n\nIn this step, we trigger the actual testing. 
The pipeline will spawn EC2 VMs, install each kernel in sequence, run UnixBench after each, and compare the results.\nAfter completion, the benchmark regression analysis prints which metrics regressed between v6.1.141 and v6.1.150 (see [Benchmark Regression Detection](#benchmark-regression-detection) for output format).\n\n**Note:** S3 bucket names are globally unique. If another user already created buckets with the same prefix, `setup configure` will succeed but the pipeline will fail when creating buckets.\nChoose a unique `$USER` prefix or add a random suffix to avoid collisions.\n\nConfigure for regression testing only (x86_64)\n```bash\nkernel-ci-cloud-runner aws setup configure \\\n  --prefix kernel-ci-$USER-demo- --region us-west-2 \\\n  --test-filter unixbench-kernel-regression \\\n  --output demo-config.json\n```\n\nUpload the built RPMs\n```bash\nkernel-ci-cloud-runner aws setup upload-rpms \\\n  --bucket kernel-ci-$USER-demo-storage \\\n  --local-rpms kernel-rpms/\n```\n\nRun the pipeline\n```bash\nkernel-ci-cloud-runner aws run --config demo-config.json\n```\n\nBased on the output you should see that testing passed, and no performance regressions have been detected.\n\n## Configuration\n\nThe project is configured via JSON files. By default, the file `examples/aws/config.json` is used. 
Explicitly specifying a config file to use is recommended.\n\nKey sections of the configuration file are described below.\n\n### Test Configuration\n```json\n\"test_config\": {\n  \"test_id\": \"test-001\",\n  \"role_name\": \"ecsTaskExecutionRole\",\n  \"vms\": [\n    {\n      \"ami_id\": \"resolve:ssm:/aws/service/ami-amazon-linux-latest/al2023-ami-kernel-default-x86_64\",\n      \"instance_type\": \"t3.micro\",\n      \"max_runtime\": 3600,\n      \"test\": [\"basic-test\", \"example-reboot-test\"],\n      \"min_count\": 1\n    }\n  ]\n}\n```\n\n**Test parameters:**\n- `test_id` - Test id used for results folder path (will be expanded by date+time stamp)\n- `role_name` - IAM role name to attach to EC2 instances (must match a role defined in the `roles` section)\n- `ami_id` - AMI to use (supports SSM parameter resolution for x86_64 and arm64)\n- `instance_type` - EC2 instance type\n- `max_runtime` - Maximum duration in seconds per VM. The VM has a safety timeout that triggers automatic shutdown if the test hangs. SSM command timeout is `max_runtime + 3600` seconds to allow for multi-stage tests with reboots\n- `test` - List of test directories from `vm-tests/`\n- `min_count` - Number of VMs to spawn\n\n### Storage Configuration\n```json\n\"storage\": {\n  \"type\": \"s3\",\n  \"bucket\": \"kernel-ci-results\",\n  \"results_prefix\": \"results\"\n}\n```\n\nS3 bucket names must be globally unique. If the name is taken, the system appends your AWS account number (e.g., `kernel-ci-results-123456789012`).\n\n### ECS and CloudWatch Configuration\n\nThe `ecs` section defines the Fargate container that orchestrates tests (cpu, memory, cluster name, task definition). The `cloudwatch` section defines log retention for container logs (`/ecs/...`, default 7 days) and VM logs (`/ec2/...`, default 3 days). 
These are set automatically by `setup configure` and rarely need manual changes.\n\n### Available Tests\n\nTests are located in `vm-tests/`:\n- `basic-test` - Simple connectivity test\n- `example-reboot-test` - Multi-stage reboot test\n- `example-kernel-reboot-test` - Kernel installation and reboot\n- `simple-unixbench` - UnixBench performance test\n- `unixbench-kernel-regression` - Kernel regression testing with two kernel versions\n\nEach test directory contains:\n- `run.sh` or `run-*.sh` - Test scripts (executed in version-sorted order)\n- `dependencies.txt` - Required packages\n- `README.md` - Test documentation\n- `external_requirements.json` - Declares external artifacts this test needs from the external storage bucket\n\nOptional files: `common_lib.sh` (shared functions), test-specific data files.\n\n#### Test Design\n\nTests are executed by the VM client script (`vm-tests/test-vm-client.sh`) which is downloaded and run on each EC2 instance via SSM. The client script manages the full lifecycle:\n\n1. Downloads and extracts the test payload zip from S3 (first run only)\n2. Discovers all `run*.sh` scripts in the test directory and sorts them by version (`sort -V`)\n3. Executes one script per boot cycle, tracking progress via a `run_id` counter in S3\n4. After each script: uploads the output log (`run-N-output.log`) and client log (`client-N.log`) to S3\n5. If more scripts remain, exits with code **194** to signal SSM to reboot the VM\n6. After reboot, SSM re-runs the client script, which increments `run_id` and executes the next script\n7. 
After the final script: uploads `result.txt`, `stats.json`, and any `benchmark-*.csv` files, then shuts down\n\nExit code conventions for `run*.sh` scripts:\n- **0** — success, continue to next script (or finish if last)\n- **194** — explicitly request reboot (same effect as success for non-final scripts)\n- **any other non-zero** — failure, stops the chain and reports the error\n\n#### Writing New Tests\n\nA test is a directory under `vm-tests/` containing at minimum:\n\n- **`run.sh`** or **`run-*.sh`** — one or more executable scripts, run in version-sorted order (e.g. `run-01-setup.sh`, `run-02-verify.sh`). Each script runs after a fresh boot.\n- **`external_requirements.json`** — declares which shared artifacts (e.g. kernel RPMs) the test needs from the external storage bucket. Set all values to `false` if none are needed.\n\nOptional files:\n- **`dependencies.txt`** — one package name per line (comments with `#`, blank lines ignored). Tests that need system packages should call an `install_test_dependencies` function from their `common_lib.sh` that reads this file and installs via `yum`/`dnf`.\n- **`common_lib.sh`** — shared shell functions sourced by the `run-*.sh` scripts\n- **`README.md`** — test documentation\n\nThe scripts run as `ec2-user` (or equivalent) with `sudo` available. The working directory is `$HOME/test-\u003cRUN_PREFIX\u003e-work/test/`, which persists across reboots. Environment variables like `RESULTS_BUCKET`, `RUN_PREFIX`, and `S3_PREFIX` are **not** passed to the scripts — the client script handles all S3 uploads. 
Scripts should write output files to the current directory.\n\n#### Writing New Performance Benchmark Tests\n\nWhen detecting performance regressions in a virtualized environment, comparing measured values with old known-good values is error-prone, as the environment can change.\nToday's VM might be located in a different data center or network, or run on updated firmware or a different hypervisor version.\nTherefore, to measure the impact of a changed kernel, we recommend setting up tests that run measurements with the current kernel (tip) as well as a reference kernel (base).\nThe testing system can automatically check for regressions if the test writes two CSV data files that follow the naming patterns `benchmark-base*.csv` and `benchmark-tip*.csv`.\n\nThe CSV files must have the following columns:\n\n```\nmetric,unit,value,more_is_better,kernel_version,instance_id,instance_type,arch\n```\n\n- `metric` — name of the benchmark metric (e.g. `Dhrystone_2_using_register_variables`)\n- `unit` — measurement unit (e.g. `lps` for loops per second)\n- `value` — numeric result\n- `more_is_better` — `true` if higher values are better, `false` if lower is better\n- `kernel_version` — kernel version string (e.g. `6.1.141-3.x86_64`)\n- `instance_id`, `instance_type`, `arch` — EC2 instance metadata for traceability\n\nEach row is one metric measurement from one VM. With multiple VMs per test, the analyzer aggregates values across VMs and uses statistical tests (Welch's t-test, Mann-Whitney U) together with an effect size (Cohen's d) to detect regressions.\n\n\n#### External Requirements\n\nTests declare which artifacts they need from the external storage bucket. Example:\n```json\n{\n  \"kernel-rpms/src\": false,\n  \"kernel-rpms/binary\": true\n}\n```\n\nWhen set to `true`, the pipeline copies that folder from the external storage bucket to `shared/` in the results bucket before the test runs (once per run, shared across tests). VMs download from the `shared/` path. 
Tests that don't need external artifacts set all values to `false`.\n\n## Architecture\n\nThis section briefly discusses how the project is setup.\n\n### Code Structure\n\n```\nkernel-ci-cloud-labs/\n├── src/\n│   └── kernel_ci_cloud_labs/\n│       ├── auth/           # Authentication modules (AWS)\n│       │   ├── aws_auth.py\n│       │   ├── aws_cluster_manager.py\n│       │   ├── aws_role_manager.py\n│       │   ├── aws_network_manager.py\n│       │   ├── aws_task_definition_manager.py\n│       │   └── aws_cloudwatch_manager.py\n│       ├── core/           # Base classes and utilities\n│       │   ├── base_auth.py\n│       │   ├── base_provider.py\n│       │   ├── base_storage.py\n│       │   ├── base_resource_manager.py\n│       │   ├── client_manager.py\n│       │   ├── registry.py\n│       │   └── pipeline.py\n│       ├── providers/      # Cloud provider implementations\n│       │   └── aws_provider.py\n│       ├── storage/        # Storage backends\n│       │   └── s3_storage.py\n│       ├── cli.py                  # CLI entry point (kernel-ci-cloud-runner)\n│       ├── eventbridge_handler.py  # EventBridge/Lambda entry point\n│       ├── setup_configure.py      # Project configuration\n│       ├── setup_upload_rpms.py    # RPM upload to S3\n│       ├── setup_cleanup.py        # AWS resource cleanup\n│       └── main.py\n├── tests/                  # Unit tests and integration tests\n├── vm-tests/               # VM test scripts\n├── examples/               # Example configurations\n├── .pre-commit-config.yaml # Pre-commit hooks config\n├── .pylintrc               # Pylint configuration\n├── pyproject.toml          # Black and pytest settings\n└── setup.py\n```\n\n### S3 Storage Structure\n\nThe overview below shows how the storage in S3 for a testrun is organized.\n\n```\ns3://kernel-ci-results-{ACCOUNT_ID}/\n└── run_{test_id}_{datetime}/\n    ├── shared/                          # Shared resources (uploaded once per run)\n    │   └── kernel-rpms/binary/\n  
  │       ├── x86_64/*.rpm\n    │       └── aarch64/*.rpm\n    │\n    └── test_{test_name}/\n        ├── input/\n        │   └── {test_name}_test_payload.zip  # Test scripts and dependencies\n        ├── output/\n        │   └── {instance_id}/\n        │       ├── client-{run_id}.log       # Client execution logs per run\n        │       ├── run-{run_id}-output.log   # Script output per run\n        │       ├── result.txt                # Final test result (SUCCESS/FAIL)\n        │       ├── stats.json                # Test statistics and timing\n        │       └── benchmark-*.csv           # Benchmark results (if applicable)\n        └── state/\n            └── {instance_id}/\n                └── run_id.txt                # Tracks current run stage\n```\n\n### Test Execution Flow\n\n1. Pipeline uploads test payload zip and `test-vm-client.sh` to S3\n2. Fargate container spawns EC2 VMs and sends SSM commands\n3. Each VM downloads the bootstrap script to `/tmp/test-vm-client.sh`\n    1. Test payload is extracted to `/home/ec2-user/test-run_\u003cRUN_PREFIX\u003e-work/test/`\n    2. Test scripts (`run-01-*.sh`, `run-02-*.sh`, ...) execute in version-sorted order\n    3. Run state is tracked via `run_id.txt` in S3 — after a reboot, the VM re-downloads the script, checks `run_id`, and continues from the next stage\n    4. Results are uploaded to S3 per instance\n\n## Benchmark Regression Detection\n\nWhen a pipeline run completes, the system automatically analyzes benchmark results from tests that produce `benchmark-base-*.csv` and `benchmark-tip-*.csv` files (e.g. `unixbench-kernel-regression`). Tests without benchmark CSVs are silently skipped.\n\nFor each metric, the analyzer compares the value distributions across all VMs and computes:\n- **Welch's t-test** and **Mann-Whitney U test** for statistical significance\n- **Cohen's d** for practical effect size\n\nA regression is flagged only when **both** conditions are met:\n1. 
At least one statistical test is significant (p \u003c 0.05)\n2. The effect size is meaningful (|Cohen's d| ≥ 0.5)\n\n### Example Output\n\n```\n============================================================\nBENCHMARK REGRESSION ANALYSIS\n============================================================\n\nTest: unixbench-kernel-regression\n  Base kernel: 6.1.141-165.249.amzn2023.x86_64\n  Tip kernel:  6.1.150-174.273.amzn2023.x86_64\n  Metrics compared: 24\n  ⚠ REGRESSIONS DETECTED: 16\n    Process_Creation: base=57012.12±658.82 (cv: 0.01) → tip=41959.92±306.71 (cv: 0.01) lps (-26.4%)\n      [t-test p=0.0000, U-test p=0.0001, Cohen's d=29.29]\n    Execl_Throughput: base=22944.75±155.73 (cv: 0.01) → tip=21684.21±154.45 (cv: 0.01) lps (-5.5%)\n      [t-test p=0.0000, U-test p=0.0001, Cohen's d=8.13]\n\n------------------------------------------------------------\nTests with benchmarks: 1 | Regressions found: 1\nTests with regressions: unixbench-kernel-regression\n============================================================\n```\n\n### Notification Hooks\n\nFor future extensions, or integration into Kernel CI workflows, we might need to report test results.\n\nThe `BenchmarkAnalyzer` returns a `PipelineBenchmarkSummary` dataclass with structured regression data. To add downstream notifications (e.g. 
SNS alerts, KCIDB reporting, Slack), see the `NOTIFICATION HOOK` comments in:\n- `src/kernel_ci_cloud_labs/core/benchmark_analyzer.py` — after summary logging\n- `src/kernel_ci_cloud_labs/core/pipeline.py` — after benchmark analysis completes\n\nThe `PipelineBenchmarkSummary` contains:\n- `test_results` — list of `TestBenchmarkResult`, one per test\n- `tests_with_regression` / `regression_test_names` — quick summary of which tests regressed\n\nEach `TestBenchmarkResult` contains:\n- `base_kernel` / `tip_kernel` — kernel version strings\n- `comparisons` — list of `MetricComparison` (one per benchmark metric)\n- `regressions` — property that filters to only regressed metrics\n\nEach `MetricComparison` contains:\n- `metric`, `unit`, `more_is_better` — metric identity\n- `base` / `tip` — `MetricStats` with `mean`, `median`, `stddev`, `cv`, `values`\n- `pct_change`, `t_pvalue`, `u_pvalue`, `cohens_d` — statistical results\n- `is_regression` — boolean flag\n\nExample integration at the `NOTIFICATION HOOK` in `pipeline.py`:\n\n```python\n# After benchmark_summary is computed:\nif benchmark_summary.tests_with_regression \u003e 0:\n    # SNS notification\n    sns = provider.auth.get_client(\"sns\")\n    message = f\"Regressions in: {', '.join(benchmark_summary.regression_test_names)}\"\n    sns.publish(TopicArn=\"arn:aws:sns:...:kernel-ci-alerts\", Message=message)\n\n    # KCIDB submission\n    for result in benchmark_summary.test_results:\n        for reg in result.regressions:\n            submit_to_kcidb(result.test_name, reg.metric, reg.pct_change,\n                            result.base_kernel, result.tip_kernel)\n\n    # Write machine-readable JSON for downstream tools\n    import json\n    regression_data = [{\n        \"test\": r.test_name,\n        \"base_kernel\": r.base_kernel,\n        \"tip_kernel\": r.tip_kernel,\n        \"regressions\": [{\n            \"metric\": c.metric, \"pct_change\": c.pct_change,\n            \"cohens_d\": c.cohens_d, \"t_pvalue\": 
c.t_pvalue,\n        } for c in r.regressions],\n    } for r in benchmark_summary.test_results if r.has_regression]\n    storage.upload_string(json.dumps(regression_data, indent=2),\n                          f\"{run_prefix}/regression_report.json\")\n```\n\n### Re-Analyzing Previous Runs\n\nTo re-run the analysis on results from a previous pipeline run without re-running the pipeline:\n\n```bash\nkernel-ci-cloud-runner aws analyze \\\n  --bucket kernel-ci-$USER-results \\\n  --run-prefix run_test-001_20260325_120000 \\\n  --region us-west-2\n```\n\nThis downloads all `benchmark-*.csv` files from S3, combines them, compares the two kernel versions, and generates regression plots (overall, x86_64, ARM64) in `analysis/data/{run_prefix}/`. Add `--upload-analysis` to upload the results back to S3.\n\nRequires the analysis dependencies: `python3.11 -m pip install -e \".[analysis]\"`\n\n## Automated Triggering via EventBridge\n\nThe pipeline can be triggered automatically using Amazon EventBridge — either on a schedule (e.g. daily regression runs) or from custom events (e.g. a new kernel build notification).\nThis functionality is not fully implemented yet and needs to be adapted to the specific use case.\n\n### How It Works\n\nThe EventBridge handler (`src/kernel_ci_cloud_labs/eventbridge_handler.py`) runs as an AWS Lambda function and performs:\n\n1. **Downloads the pipeline config** from an S3 URI provided in the event payload.\n2. **Prepares kernel RPMs** — currently expects RPMs to be pre-uploaded to the external storage bucket. A future implementation will automatically retrieve the latest tip kernel and a base kernel for comparison (see `_prepare_kernel_rpms()` in the handler).\n3. **Makes the config run-local** by appending a unique suffix to `test_id`, so parallel EventBridge invocations don't collide.\n4. 
**Runs the normal pipeline** — no additional permissions beyond what the pipeline already requires.\n\n### Why RPM Retrieval Runs in the Lambda, Not in Fargate\n\nKernel RPM retrieval must happen in the Lambda handler *before* the pipeline starts — it cannot run inside the Fargate container. The Fargate container (`launch_vm.py`) is a lightweight VM orchestrator with only `boto3` available. It spawns EC2 VMs, sends SSM commands, and waits for results. It has no access to the pipeline config or the `kernel_ci_cloud_labs` package.\n\nThe pipeline copies kernel RPMs from the external storage bucket to `shared/` in the results bucket *before* spawning the Fargate container. VMs then download RPMs from `shared/` when they boot. If RPMs aren't in the external storage bucket before `run_pipeline()` is called, VMs won't find them.\n\n```\nEventBridge → Lambda handler\n  1. Download config from S3\n  2. Retrieve kernel RPMs → upload to external storage bucket\n  3. Make config run-local (unique test_id)\n  4. run_pipeline() → copies RPMs to shared/ → spawns Fargate → VMs download from shared/\n```\n\n**Note:** When triggered via EventBridge, the pipeline pulls test scripts from the external storage bucket (uploaded via `setup upload-tests`), not from the Lambda deployment zip. If you update test scripts in `vm-tests/`, re-run `setup upload-tests` for changes to take effect in EventBridge-triggered runs. CLI runs (`aws run`) always use the local `vm-tests/` directory.\n\n### Prerequisites\n\n- All AWS resources must already exist (`kernel-ci-cloud-runner aws setup configure` + first manual run).\n- A valid `config.json` must be uploaded to S3 (e.g. 
`s3://kernel-ci-$USER-storage/configs/config.json`).\n- Kernel RPMs must be pre-uploaded to the external storage bucket (until automatic retrieval is implemented).\n- Test scripts must be uploaded to the external storage bucket:\n  ```bash\n  kernel-ci-cloud-runner aws setup upload-tests \\\n    --bucket kernel-ci-$USER-storage --region us-west-2\n  ```\n  Re-run this command after modifying any test scripts in `vm-tests/`.\n- Set `\"force_recreate_roles\": false` in the config to avoid disrupting parallel runs.\n\n### EventBridge Setup\n\n**1. Deploy the Lambda function:**\n\nPackage `kernel_ci_cloud_labs` and its dependencies (`boto3` is provided by the Lambda runtime) into a deployment zip, then create the function:\n\n```bash\naws lambda create-function \\\n  --function-name kernel-ci-daily-regression \\\n  --runtime python3.12 \\\n  --handler kernel_ci_cloud_labs.eventbridge_handler.lambda_handler \\\n  --role arn:aws:iam::\u003cACCOUNT\u003e:role/\u003cLAMBDA_EXECUTION_ROLE\u003e \\\n  --timeout 900 \\\n  --memory-size 256 \\\n  --zip-file fileb://deployment.zip \\\n  --region eu-west-2\n```\n\nThe Lambda execution role needs the same permissions as the pipeline (S3, ECS, EC2, SSM, IAM, ECR, CloudWatch). You can reuse the existing pipeline role or create a dedicated one.\n\n**2. Create a scheduled EventBridge rule (e.g. daily at 02:00 UTC):**\n\n```bash\naws events put-rule \\\n  --name kernel-ci-daily-regression \\\n  --schedule-expression \"cron(0 2 * * ? *)\" \\\n  --state ENABLED \\\n  --region eu-west-2\n```\n\n**3. 
Add the Lambda as a target with the config payload:**\n\n```bash\n# The '\"$USER\"' splice steps out of the single-quoted JSON so the shell can expand $USER.\naws events put-targets \\\n  --rule kernel-ci-daily-regression \\\n  --targets '[{\n    \"Id\": \"kernel-ci-pipeline\",\n    \"Arn\": \"arn:aws:lambda:\u003cREGION\u003e:\u003cACCOUNT\u003e:function:kernel-ci-daily-regression\",\n    \"Input\": \"{\\\"config_s3_uri\\\": \\\"s3://kernel-ci-'\"$USER\"'-storage/configs/config.json\\\", \\\"region\\\": \\\"eu-west-2\\\"}\"\n  }]' \\\n  --region eu-west-2\n```\n\n**4. Grant EventBridge permission to invoke the Lambda:**\n\n```bash\naws lambda add-permission \\\n  --function-name kernel-ci-daily-regression \\\n  --statement-id eventbridge-invoke \\\n  --action lambda:InvokeFunction \\\n  --principal events.amazonaws.com \\\n  --source-arn arn:aws:events:\u003cREGION\u003e:\u003cACCOUNT\u003e:rule/kernel-ci-daily-regression\n```\n\n### Event Payload Format\n\nThe handler expects this JSON structure (passed as the EventBridge target input):\n\n```json\n{\n  \"config_s3_uri\": \"s3://kernel-ci-$USER-storage/configs/config.json\",\n  \"region\": \"eu-west-2\"\n}\n```\n\n- `config_s3_uri` (required): S3 URI to the pipeline config JSON. Must use resource names matching your prefix (as produced by `setup configure`). `$USER` in the example stands for that prefix; the payload is passed through verbatim and is not shell-expanded, so write the prefix out literally.\n- `region` (optional): AWS region. Defaults to the `AWS_DEFAULT_REGION` environment variable or `us-west-2`.\n\n### Custom Event Triggers\n\nInstead of a schedule, you can trigger the pipeline from custom events (e.g. 
when a new kernel build completes):\n\n```bash\naws events put-rule \\\n  --name kernel-ci-on-new-build \\\n  --event-pattern '{\"source\": [\"custom.kernel-build\"], \"detail-type\": [\"BuildComplete\"]}' \\\n  --state ENABLED \\\n  --region eu-west-2\n```\n\nThen publish events to trigger it:\n\n```bash\n# The '\"$USER\"' splice steps out of the single-quoted JSON so the shell can expand $USER.\naws events put-events --entries '[{\n  \"Source\": \"custom.kernel-build\",\n  \"DetailType\": \"BuildComplete\",\n  \"Detail\": \"{\\\"config_s3_uri\\\": \\\"s3://kernel-ci-'\"$USER\"'-storage/configs/config.json\\\", \\\"region\\\": \\\"eu-west-2\\\"}\"\n}]'\n```\n\nNote: The handler reads `config_s3_uri` and `region` from the top-level event dict, whereas EventBridge delivers the `Detail` JSON of a custom event under the `detail` key. For events like the one above, either configure an input transformer on the EventBridge target so that the contents of `detail` become the top-level input, or adjust the handler to read from `detail`.\n\n### Debugging EventBridge Runs\n\n- Lambda logs go to CloudWatch Logs under `/aws/lambda/\u003cfunction-name\u003e`.\n- Each invocation logs a unique invocation ID for tracing.\n- Set the `LOG_LEVEL` environment variable on the Lambda to `DEBUG` for verbose output.\n- The handler returns a JSON response with `status`, `invocation_id`, and `test_id` for programmatic monitoring.\n\n## Development\n\n### Running Tests\n\nThe project has a Makefile that simplifies running most steps. 
The calls should be run in the virtual environment, to make sure dependencies are present.\n\n```bash\n# Run unit tests (fast, no AWS resources needed)\nmake test\n\n# Run linting (flake8 + pylint)\nmake lint\n\n# Format code (black + isort)\nmake format\n\n# Run integration tests (requires AWS credentials, creates real resources and incurs AWS costs)\npytest tests/integration/ -v -m integration\n\n# Run with coverage (unit tests only)\npytest tests/ -v --cov=src --cov-report=term-missing -m \"not integration\"\n```\n\n### Code Quality\n\nThe project uses Black for formatting, isort for import sorting, flake8 and Pylint for linting.\nAll tools are configured to use 120-character line width.\n\nConfiguration files: `pyproject.toml` (Black, pytest), `.flake8`, `.pylintrc`\n\n## Debugging\n\n### Force Recreate Roles\n\nThe `force_recreate_roles` parameter in `config.json` controls whether IAM roles are deleted and recreated:\n\n```json\n\"force_recreate_roles\": true\n```\n\n- Set to `true` when you've modified role policies or trust relationships and need to apply changes\n- Set to `false` for normal operation to avoid disrupting running tasks\n\n**⚠️ Warning:** Setting `force_recreate_roles: true` will delete existing IAM roles and terminate any running ECS tasks that use those roles. Never use while other pipelines are active.\n\n## Troubleshooting\n\n### Investigating Failed Pipeline Runs\n\nWhen a pipeline run fails, the container logs show a summary like:\n\n```\n[...] [INFO] ✓ Stopped task: 6e3b9905e4214826962a2f3a43d548fe\n```\n\nUse the task ID to investigate:\n\n**1. Check why the ECS task stopped:**\n\n```bash\naws ecs describe-tasks \\\n  --cluster kernel-ci-$USER-cluster \\\n  --tasks \u003cTASK_ID\u003e \\\n  --region us-west-2\n```\n\nLook for `stoppedReason`, `stopCode`, and the container's `exitCode`. Exit code 1 means the test failed; exit code 137 means the container was killed (OOM or timeout).\n\n**2. 
Read the container logs from CloudWatch:**\n\n```bash\naws logs filter-log-events \\\n  --log-group-name /ecs/kernel-ci-$USER-task \\\n  --log-stream-names ecs/kernel-ci-$USER-app/\u003cTASK_ID\u003e \\\n  --region us-west-2\n```\n\nThis shows which VMs were spawned, whether tests passed or failed, and the final summary (e.g. `=== All VMs completed: 0/1 successful, 1 failed ===`).\n\n**3. Check the VM's test output in S3:**\n\n```bash\n# List output files for a specific VM\naws s3 ls s3://kernel-ci-$USER-results/\u003cRUN_PREFIX\u003e/test_\u003cTEST_NAME\u003e/output/\u003cINSTANCE_ID\u003e/ \\\n  --region us-west-2\n\n# Download the run output log that failed (e.g. run-2)\naws s3 cp s3://kernel-ci-$USER-results/\u003cRUN_PREFIX\u003e/test_\u003cTEST_NAME\u003e/output/\u003cINSTANCE_ID\u003e/run-2-output.log - \\\n  --region us-west-2\n```\n\nThe `run-N-output.log` files contain the full shell trace (`set -x`) of each test stage. The `result.txt` file contains the final status (e.g. `FAILED: Exit code 1 at run 2`).\n\n**4. Check the client log for the bootstrap and S3 upload sequence:**\n\n```bash\naws s3 cp s3://kernel-ci-$USER-results/\u003cRUN_PREFIX\u003e/test_\u003cTEST_NAME\u003e/output/\u003cINSTANCE_ID\u003e/client-2.log - \\\n  --region us-west-2\n```\n\nThe `RUN_PREFIX` and `INSTANCE_ID` values are visible in the container logs from step 2.\n\n## License\n\nSee LICENSE file.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkernelci%2Fpullab_cloud","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkernelci%2Fpullab_cloud","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkernelci%2Fpullab_cloud/lists"}