https://github.com/devops-ia/pr-generator
Image for pr-generator
bitbucket docker github helm kubernetes
- Host: GitHub
- URL: https://github.com/devops-ia/pr-generator
- Owner: devops-ia
- License: mit
- Created: 2026-03-25T14:47:17.000Z (2 months ago)
- Default Branch: main
- Last Pushed: 2026-05-01T09:45:52.000Z (about 1 month ago)
- Last Synced: 2026-05-13T11:42:56.243Z (19 days ago)
- Topics: bitbucket, docker, github, helm, kubernetes
- Language: Python
- Homepage:
- Size: 58.6 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
README
# PR generator
Automated Pull Request creation daemon for **GitHub** and **Bitbucket Cloud**.
`pr-generator` runs as a long-lived service that periodically scans your repository branches, matches them against configurable regex patterns, and automatically opens Pull Requests toward the configured destination branches — skipping any PR that already exists.
---
## Table of Contents
- [How it works](#how-it-works)
- [Quick start](#quick-start)
- [Configuration](#configuration)
  - [YAML file](#yaml-file)
- [Environment variables](#environment-variables)
- [Providers](#providers)
  - [GitHub App](#github-app)
  - [Bitbucket Cloud](#bitbucket-cloud)
- [Rules](#rules)
- [ArgoCD Image Updater integration](#argocd-image-updater-integration)
- [Annotation-based discovery](#annotation-based-discovery)
- [Health endpoints](#health-endpoints)
- [Prometheus metrics](#prometheus-metrics)
- [Docker](#docker)
- [Development](#development)
- [Troubleshooting](#troubleshooting)
---
## How it works
```
┌─────────────────────────────────────────────────────────────┐
│  Scan cycle                                                 │
│                                                             │
│  1. Fetch all branches ──▶ GitHub / Bitbucket               │
│  2. For every rule                                          │
│       match branches against regex pattern                  │
│       for each match                                        │
│         skip if open PR already exists                      │
│         create PR source ──▶ destination                    │
│  3. Sleep scan_frequency seconds                            │
│  4. Repeat                                                  │
└─────────────────────────────────────────────────────────────┘
```
Key design points:
- **Concurrent**: branches are fetched from all providers in parallel; rule×provider pairs are also processed concurrently (up to 10 workers).
- **Idempotent**: an existing open PR for the same source→destination pair is detected and skipped.
- **Dry-run mode**: log what would be created without actually calling the API.
- **Graceful shutdown**: handles `SIGTERM` / `SIGINT` and drains in-progress work.
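The cycle and the design points above can be sketched roughly as follows. This is a minimal illustration and not the project's `scanner.py`: `FakeProvider`, `process_pair`, `run_cycle`, and their method names are hypothetical. Only the 10-worker cap, the `re.match` semantics, the skip-if-open check, and the dry-run behaviour come from this README.

```python
import re
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


@dataclass
class FakeProvider:
    """Hypothetical in-memory stand-in for a GitHub/Bitbucket provider."""
    branches: list[str]
    open_prs: set[tuple[str, str]] = field(default_factory=set)

    def list_branches(self) -> list[str]:
        return list(self.branches)

    def pr_exists(self, source: str, dest: str) -> bool:
        return (source, dest) in self.open_prs

    def create_pr(self, source: str, dest: str) -> None:
        self.open_prs.add((source, dest))


def process_pair(provider, pattern, dest, branches, dry_run):
    """Handle one rule x provider pair; returns (created, skipped, simulated)."""
    created = skipped = simulated = 0
    regex = re.compile(pattern)
    for branch in branches:
        # Never open a PR from the destination branch onto itself.
        if branch == dest or not regex.match(branch):
            continue
        if provider.pr_exists(branch, dest):
            skipped += 1          # idempotent: an open PR already exists
        elif dry_run:
            simulated += 1        # dry run: count it, call nothing
        else:
            provider.create_pr(branch, dest)
            created += 1
    return created, skipped, simulated


def run_cycle(providers, rules, dry_run=False, max_workers=10):
    """Run one scan cycle over all rule x provider pairs."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Fetch branches from every provider in parallel.
        fetched = dict(zip(providers, pool.map(lambda p: p.list_branches(),
                                               providers.values())))
        futures = [
            pool.submit(process_pair, providers[name], rule["pattern"],
                        dest, fetched[name], dry_run)
            for rule in rules
            for name, dest in rule["destinations"].items()
            if name in providers  # ignore destinations for inactive providers
        ]
        results = [f.result() for f in futures]
    return tuple(sum(col) for col in zip(*results)) if results else (0, 0, 0)
```

Running the same cycle twice against an unchanged repository creates nothing the second time, which is the property the idempotency point describes.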
---
## Quick start
```bash
# Install
pip install -e .
# Point to your config file and run
CONFIG_PATH=./config.yaml pr-generator
```
Or with Docker:
```bash
docker run --rm \
  -v "$(pwd)/config.yaml:/etc/pr-generator/config.yaml:ro" \
  ghcr.io/devops-ia/pr-generator:latest
```
---
## Configuration
### YAML file
The default config path is `/etc/pr-generator/config.yaml`. Override with the `CONFIG_PATH` environment variable. The application exits with an error at startup if the file is not found.
```yaml
# config.yaml
# How often (seconds) to scan for new branches.
scan_frequency: 300 # default: 300
# Logging level: DEBUG | INFO | WARNING | ERROR
log_level: INFO # default: INFO
# Log format: "text" (human-readable) or "json" (structured, for log aggregators)
log_format: text # default: text
# When true, PRs are logged but never actually created.
dry_run: false # default: false
# Port for the built-in health server.
health_port: 8080 # default: 8080
providers:
  github:
    enabled: true
    owner: my-org
    repo: my-repo
    app_id: "123456"
    installation_id: "78901234" # optional — auto-resolved if omitted
    private_key_path: /secrets/github-app.pem # path to PEM file
    # Alternative: set GITHUB_APP_PRIVATE_KEY env var (plain PEM or base64-encoded)
    timeout: 30 # HTTP timeout in seconds
  bitbucket:
    enabled: true
    workspace: my-workspace
    repo_slug: my-repo
    token_env: BITBUCKET_TOKEN # name of the env var that holds the token
    close_source_branch: true # delete source branch after merge (default: true)
    timeout: 30
rules:
  - pattern: "feature/.*" # Python regex matched against branch names
    destinations:
      github: main
      bitbucket: develop
  - pattern: "release/.*"
    destinations:
      github: main
  - pattern: ".*-hotfix-.*"
    destinations:
      bitbucket: master
```
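As a rough sketch of the startup behaviour described above, the following shows how the path override and the documented defaults could be applied. The function names are hypothetical, the YAML parsing step is left out (the input is assumed to be an already-parsed dict), and the real loader in `config.py` may differ.

```python
import os
from pathlib import Path

DEFAULT_CONFIG_PATH = "/etc/pr-generator/config.yaml"

# Defaults as documented in the example config above.
DEFAULTS = {
    "scan_frequency": 300,
    "log_level": "INFO",
    "log_format": "text",
    "dry_run": False,
    "health_port": 8080,
}


def resolve_config_path(env=None) -> Path:
    """Return the config path, honouring CONFIG_PATH; exit if it is missing."""
    env = os.environ if env is None else env
    path = Path(env.get("CONFIG_PATH") or DEFAULT_CONFIG_PATH)
    if not path.is_file():
        raise SystemExit(f"config file not found: {path}")
    return path


def apply_defaults(raw: dict) -> dict:
    """Overlay parsed YAML (a plain dict) on top of the documented defaults."""
    merged = {**DEFAULTS, **{k: v for k, v in raw.items() if v is not None}}
    merged.setdefault("providers", {})
    merged.setdefault("rules", [])
    return merged
```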
#### Multiple GitHub organisations
Use any name as the provider key and set `type: github` (or `type: bitbucket`) to identify the implementation. Rules reference providers by their name.
```yaml
providers:
  github-acme:
    type: github # required for non-standard key names
    enabled: true
    owner: acme-org
    repo: backend
    app_id: "111"
    private_key_path: /secrets/acme-app.pem
  github-skunkworks:
    type: github
    enabled: true
    owner: skunkworks-org
    repo: platform
    auth_method: pat
    token_env: SKUNKWORKS_GITHUB_TOKEN
  bitbucket: # "github" / "bitbucket" keys default type automatically
    enabled: true
    workspace: my-workspace
    repo_slug: my-repo
    token_env: BITBUCKET_TOKEN
rules:
  - pattern: "feature/.*"
    destinations:
      github-acme: main
      github-skunkworks: develop
      bitbucket: develop
```
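The naming rule above (an explicit `type` wins, otherwise the key name must itself be `github` or `bitbucket`) could look like this in code. `resolve_provider_type` is a hypothetical helper, and the error raised for an unresolvable name is an assumption rather than documented behaviour.

```python
KNOWN_TYPES = ("github", "bitbucket")


def resolve_provider_type(name: str, settings: dict) -> str:
    """Pick the implementation for one entry under `providers:`."""
    explicit = settings.get("type")
    if explicit is not None:
        if explicit not in KNOWN_TYPES:
            raise ValueError(f"provider {name!r}: unknown type {explicit!r}")
        return explicit
    if name in KNOWN_TYPES:
        # "github" / "bitbucket" keys default their type automatically.
        return name
    # Assumption: a custom key without `type` is treated as a config error.
    raise ValueError(f"provider {name!r} needs an explicit 'type'")
```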
**Config fields reference**
| Field | Type | Default | Description |
|-------|------|---------|-------------|
| `scan_frequency` | int | `300` | Seconds between scan cycles |
| `log_level` | string | `"INFO"` | Python logging level |
| `log_format` | string | `"text"` | Log format: `text` (human-readable) or `json` (structured) |
| `dry_run` | bool | `false` | Simulate PR creation without API calls |
| `health_port` | int | `8080` | Port for health HTTP server |
| `providers.<name>.type` | string | *(key name)* | Provider implementation: `github` or `bitbucket`. Required when the key name is not `github` or `bitbucket` |
| `providers.<name>.enabled` | bool | `false` | Activate this provider instance. If no providers are enabled the application starts in **idle mode** — it logs a warning and keeps running without performing any scans |
| `providers.<name>.owner` | string | — | GitHub organisation or user *(GitHub only)* |
| `providers.<name>.repo` | string | — | Repository name *(GitHub only)* |
| `providers.<name>.app_id` | string | — | GitHub App ID *(GitHub App auth)* |
| `providers.<name>.installation_id` | string | *(auto)* | Installation ID; resolved automatically if omitted *(GitHub App auth)* |
| `providers.<name>.private_key_path` | string | — | Path to GitHub App private key PEM file *(GitHub App auth)* |
| `providers.<name>.auth_method` | string | `"app"` | `app` (GitHub App) or `pat` (Personal Access Token) *(GitHub only)* |
| `providers.<name>.token_env` | string | `"GITHUB_TOKEN"` / `"BITBUCKET_TOKEN"` | Env var name containing the token *(PAT / Bitbucket)*. Must be **unique** across all enabled providers of the same type — duplicate values raise a `ValueError` at startup |
| `providers.<name>.workspace` | string | — | Bitbucket workspace slug *(Bitbucket only)* |
| `providers.<name>.repo_slug` | string | — | Bitbucket repository slug *(Bitbucket only)* |
| `providers.<name>.close_source_branch` | bool | `true` | Delete source branch after PR merges *(Bitbucket only)* |
| `providers.<name>.timeout` | float | `30` | HTTP timeout (seconds) |
| `rules[].pattern` | string | — | Python regex applied to branch names |
| `rules[].destinations` | map | — | `provider_name: destination_branch` pairs |
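The uniqueness rule for `token_env` can be illustrated with a small validation pass. This is a sketch under assumptions: `check_unique_token_env` is a hypothetical name, the error wording is made up, and skipping GitHub providers that use App auth (which do not read a token) is a guess at how the real check scopes itself.

```python
DEFAULT_TOKEN_ENV = {"github": "GITHUB_TOKEN", "bitbucket": "BITBUCKET_TOKEN"}


def check_unique_token_env(providers: dict) -> None:
    """Raise ValueError if two enabled providers of one type share token_env."""
    seen = {}  # (type, token_env) -> provider name that claimed it first
    for name, settings in providers.items():
        if not settings.get("enabled", False):
            continue
        ptype = settings.get("type", name)
        # Assumption: GitHub App auth never reads token_env, so it is exempt.
        if ptype == "github" and settings.get("auth_method", "app") != "pat":
            continue
        token_env = settings.get("token_env") or DEFAULT_TOKEN_ENV.get(ptype)
        key = (ptype, token_env)
        if key in seen:
            raise ValueError(
                f"Providers {seen[key]!r} and {name!r} both use "
                f"token_env {token_env!r}"
            )
        seen[key] = name
```

Two Bitbucket providers that both rely on the default `BITBUCKET_TOKEN` would fail this check; giving one of them `token_env: BITBUCKET_TOKEN_US` resolves it.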
---
## Environment variables
| Variable | Description |
|----------|-------------|
| `CONFIG_PATH` | Path to the YAML config file. Default: `/etc/pr-generator/config.yaml` |
| `GITHUB_APP_PRIVATE_KEY` | GitHub App PEM key (plain text or base64-encoded). Used **only** when `private_key_path` is absent or empty in config — if `private_key_path` is set but the file does not exist, the application raises `FileNotFoundError` without falling back to this variable |
| `GITHUB_TOKEN` | Default token env var for GitHub PAT providers (`token_env: GITHUB_TOKEN`) |
| `BITBUCKET_TOKEN` | Default token env var for Bitbucket providers (`token_env: BITBUCKET_TOKEN`) |
| *any name* | Custom env var referenced by `token_env` in provider config |
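The precedence between `private_key_path` and `GITHUB_APP_PRIVATE_KEY` is easy to get wrong, so here is a minimal sketch of it. `load_private_key` is a hypothetical helper, and detecting plain PEM by its `-----BEGIN` prefix is an assumption about how the two encodings are told apart.

```python
import base64
import binascii
import os
from pathlib import Path


def load_private_key(private_key_path, env=None) -> str:
    """Return the PEM text, preferring the configured file over the env var."""
    env = os.environ if env is None else env
    if private_key_path:
        path = Path(private_key_path)
        if not path.is_file():
            # A configured path never falls back to the environment variable.
            raise FileNotFoundError(
                f"private_key_path {private_key_path!r} does not exist"
            )
        return path.read_text()

    raw = env.get("GITHUB_APP_PRIVATE_KEY", "").strip()
    if not raw:
        raise ValueError("no GitHub App private key configured")
    if raw.startswith("-----BEGIN"):
        return raw  # plain PEM
    try:
        # Drop line wrapping before strict base64 decoding.
        compact = "".join(raw.split())
        return base64.b64decode(compact, validate=True).decode("utf-8")
    except (binascii.Error, UnicodeDecodeError) as exc:
        raise ValueError(
            "GITHUB_APP_PRIVATE_KEY is neither PEM nor base64"
        ) from exc
```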
---
## Providers
### GitHub App
The GitHub provider authenticates either as a [GitHub App](https://docs.github.com/en/apps/creating-github-apps/about-creating-github-apps/about-creating-github-apps) or with a Personal Access Token. Two modes are available:
**GitHub App (recommended)** — the provider:
1. Signs a short-lived JWT with the App's RSA private key.
2. Exchanges it for an installation access token (cached up to ~55 minutes).
3. Uses the installation token for all API calls.
4. Caches per-cycle PR-existence and branch-existence lookups to reduce API usage.
**Personal Access Token (PAT)** — set `auth_method: pat` and point `token_env` at an env var holding the PAT.
Required GitHub App permissions: **Contents** (read), **Pull requests** (read & write).
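The token caching in step 2 can be sketched with the standard library alone. GitHub installation tokens expire after one hour, so refreshing at 55 minutes leaves a safety margin. `InstallationTokenCache` and its `fetch_token` callback are hypothetical; the JWT signing and the HTTP exchange are deliberately left to the callback because they need third-party libraries.

```python
import time
from typing import Callable, Optional


class InstallationTokenCache:
    """Cache one installation token and refresh it after `ttl_seconds`."""

    def __init__(
        self,
        fetch_token: Callable[[], str],
        ttl_seconds: float = 55 * 60,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch_token      # performs the JWT -> token exchange
        self._ttl = ttl_seconds
        self._clock = clock            # injectable so tests can fake time
        self._token: Optional[str] = None
        self._expires_at = 0.0

    def get(self) -> str:
        now = self._clock()
        if self._token is None or now >= self._expires_at:
            self._token = self._fetch()
            self._expires_at = now + self._ttl
        return self._token
```

Every API call asks the cache for a token via `get()`, so the exchange happens at most once per TTL window regardless of how many requests a scan cycle makes.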
### Bitbucket Cloud
Authentication uses a project/repository **Bearer token** (HTTP access token).
The provider fetches default reviewers at PR creation time and automatically includes them in the payload.
Required Bitbucket permissions: **Repositories** (read), **Pull requests** (read & write).
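A request body for creating a pull request with default reviewers might be assembled as below. The field names follow the Bitbucket Cloud REST API 2.0 pull request schema as commonly documented; `build_pr_payload` and the title format are hypothetical and not taken from the project.

```python
def build_pr_payload(source, destination, default_reviewers,
                     close_source_branch=True, title=None):
    """Build the JSON body for a Bitbucket Cloud 'create pull request' call.

    `default_reviewers` is assumed to be the list of user objects returned
    by the repository's default-reviewers endpoint, each carrying a `uuid`.
    """
    return {
        # Assumption: the real title format is not documented in this README.
        "title": title or f"{source} -> {destination}",
        "source": {"branch": {"name": source}},
        "destination": {"branch": {"name": destination}},
        "close_source_branch": close_source_branch,
        "reviewers": [
            {"uuid": r["uuid"]} for r in default_reviewers if "uuid" in r
        ],
    }
```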
---
## Rules
Each rule has:
- **`pattern`** — a Python regex (`re.compile`) matched against branch names using `re.match` (anchored at the start). The destination branch is excluded from matching.
- **`destinations`** — a map of `provider_name → destination_branch`. Only providers that are both listed here **and** active in `providers` are processed.
```yaml
rules:
  - pattern: "feature/.*"
    destinations:
      github: main # create PRs toward "main" on GitHub
      bitbucket: develop # create PRs toward "develop" on Bitbucket
```
Multiple rules are supported.
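Because `re.match` anchors only at the start of the string, a pattern behaves like a prefix test unless it ends with `$` or `\Z`. A small sketch makes this concrete; `matching_branches` is a hypothetical helper, not the project's function.

```python
import re


def matching_branches(pattern, branches, destination):
    """Return branches that a rule would act on for one destination."""
    regex = re.compile(pattern)
    return [
        branch for branch in branches
        # The destination branch is excluded to avoid self-targeting PRs.
        if branch != destination and regex.match(branch)
    ]


branches = ["feature/login", "hotfix/feature/x", "main", "release/1.0"]
# Anchored at the start: "hotfix/feature/x" is not selected.
feature_only = matching_branches("feature/.*", branches, "main")
# Not anchored at the end: "release" also selects "release-candidate".
loose = matching_branches("release", ["release-candidate"], "main")
```

To match a branch name in full, end the pattern with `$`, for example `release/\d+\.\d+$`.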
---
## ArgoCD Image Updater integration
`pr-generator` pairs naturally with [Argo CD Image Updater](https://argocd-image-updater.readthedocs.io/).
Image Updater creates branches named `argocd-image-updater-set---`.
Configure rules to catch those branches and open PRs toward the appropriate target branch per environment.
```yaml
scan_frequency: 120
providers:
  github:
    enabled: true
    owner: my-org
    repo: gitops-repo
    auth_method: app
    app_id: "123456"
    private_key_path: /secrets/github-app.pem
rules:
  - pattern: "argocd-image-updater-.*-dev-.*"
    destinations:
      github: develop
  - pattern: "argocd-image-updater-.*-staging-.*"
    destinations:
      github: staging
  - pattern: "argocd-image-updater-.*-pro-.*"
    destinations:
      github: main
```
---
## Annotation-based discovery
Instead of a central `rules` list, each ArgoCD Application CR can carry annotations
that define its own PR rules. `pr-generator` reads these annotations on every scan cycle
— no restart or config change required.
### Modes
| Mode | Behaviour |
|------|-----------|
| `config_only` | Static rules from `config.yaml` only. No Kubernetes API access. **Default.** |
| `annotations_only` | Rules come exclusively from annotated ArgoCD Applications. `rules:` is ignored at runtime. |
| `hybrid` | Both sources active. Annotation destinations win on same pattern+provider collision. |
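The collision rule for `hybrid` mode can be sketched as a merge keyed on pattern, where annotation-derived destinations are applied last and therefore win. `merge_rules` is a hypothetical function, and combining same-pattern rules within a single source is an assumption made to keep the sketch short.

```python
def merge_rules(static_rules, annotation_rules, mode="config_only"):
    """Combine rule sources according to the discovery mode."""
    if mode == "config_only":
        sources = [static_rules]
    elif mode == "annotations_only":
        sources = [annotation_rules]
    elif mode == "hybrid":
        # Later sources overwrite earlier ones, so annotations win.
        sources = [static_rules, annotation_rules]
    else:
        raise ValueError(f"unknown mode: {mode!r}")

    merged = {}  # pattern -> {provider: destination}
    for rules in sources:
        for rule in rules:
            merged.setdefault(rule["pattern"], {}).update(rule["destinations"])
    return [{"pattern": p, "destinations": d} for p, d in merged.items()]
```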
### Annotation schema
```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: my-app
  annotations:
    pr-generator.io/enabled: "true"
    pr-generator.io/pattern: "^image-updater/.*"
    pr-generator.io/destination.github: "main" # provider key → base branch
    pr-generator.io/destination.bitbucket: "develop"
```
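Turning the annotations above into a rule could look like the following sketch. `rule_from_annotations` is a hypothetical helper; treating a missing pattern or an empty destination set as "no rule" is an assumption.

```python
def rule_from_annotations(annotations, prefix="pr-generator.io"):
    """Build one rule dict from an Application's annotations, or None."""
    if annotations.get(f"{prefix}/enabled", "").lower() != "true":
        return None
    pattern = annotations.get(f"{prefix}/pattern")
    dest_prefix = f"{prefix}/destination."
    destinations = {
        key[len(dest_prefix):]: value      # provider key -> base branch
        for key, value in annotations.items()
        if key.startswith(dest_prefix) and value
    }
    if not pattern or not destinations:
        return None
    return {"pattern": pattern, "destinations": destinations}
```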
### config.yaml
```yaml
annotation_discovery:
  mode: hybrid # config_only | annotations_only | hybrid
  annotation_prefix: pr-generator.io # default
# rules: required when mode is config_only or hybrid; optional for annotations_only
rules:
  - pattern: "^hotfix/.*"
    destinations:
      github: main
```
### RBAC requirement
Annotation discovery reads `applications.argoproj.io` cluster-wide. The Helm chart
creates a `ClusterRole` and `ClusterRoleBinding` automatically when
`annotationDiscovery.enabled: true`. For deployments that do not use the Helm chart,
the Kubernetes identity that `pr-generator` runs as (typically the pod's ServiceAccount) needs:
```yaml
rules:
  - apiGroups: ["argoproj.io"]
    resources: ["applications"]
    verbs: ["get", "list"]
```
---
## Health endpoints
A lightweight HTTP server starts on `health_port` (default `8080`):
| Endpoint | Behaviour |
|----------|-----------|
| `GET /livez` | `200 live` while running; `503 shutting down` during shutdown |
| `GET /healthz` | Same as `/livez` (alias) |
| `GET /readyz` | `200 ready` after the **first** scan cycle completes; `503 not ready` before that |
| `GET /metrics` | Prometheus text exposition (see [Prometheus metrics](#prometheus-metrics)) |
Suitable for Kubernetes liveness, readiness, and startup probes:
```yaml
livenessProbe:
  httpGet:
    path: /livez
    port: 8080
readinessProbe:
  httpGet:
    path: /readyz
    port: 8080
```
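The endpoint table above maps onto a small decision function. The sketch below uses only the standard library; `HealthState`, `health_response`, and `make_handler` are hypothetical names, `/metrics` is omitted, and the 404 for unknown paths is an assumption.

```python
import threading
from http.server import BaseHTTPRequestHandler


class HealthState:
    """Flags shared between the scan loop and the health server."""

    def __init__(self) -> None:
        self.ready = threading.Event()          # set after the first scan
        self.shutting_down = threading.Event()  # set on SIGTERM / SIGINT


def health_response(path: str, state: HealthState) -> tuple[int, str]:
    """Return (status code, body) for a GET request."""
    if path in ("/livez", "/healthz"):
        if state.shutting_down.is_set():
            return 503, "shutting down"
        return 200, "live"
    if path == "/readyz":
        return (200, "ready") if state.ready.is_set() else (503, "not ready")
    return 404, "not found"


def make_handler(state: HealthState):
    """Create a request handler class bound to one HealthState."""

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            status, body = health_response(self.path, state)
            payload = body.encode()
            self.send_response(status)
            self.send_header("Content-Type", "text/plain; charset=utf-8")
            self.send_header("Content-Length", str(len(payload)))
            self.end_headers()
            self.wfile.write(payload)

        def log_message(self, *args):
            pass  # keep probe traffic out of the logs

    return Handler
```

Such a handler could be served with `http.server.ThreadingHTTPServer(("", 8080), make_handler(state))` in a background thread, with the scan loop calling `state.ready.set()` once its first cycle finishes.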
---
## Prometheus metrics
`pr-generator` exposes Prometheus metrics at `GET /metrics` on the health port (default `8080`).
### Metrics reference
| Metric | Type | Labels | Description |
|--------|------|--------|-------------|
| `pr_generator_scan_cycles_total` | Counter | — | Scan cycles completed |
| `pr_generator_scan_duration_seconds` | Histogram | — | Duration per cycle (buckets: .1, .5, 1, 5, 10, 30, 60 s) |
| `pr_generator_last_scan_timestamp_seconds` | Gauge | — | Unix timestamp of last completed cycle |
| `pr_generator_prs_created_total` | Counter | `provider` | PRs opened |
| `pr_generator_prs_skipped_total` | Counter | `provider` | PRs skipped (already open) |
| `pr_generator_prs_simulated_total` | Counter | `provider` | PRs simulated (`dry_run: true`) |
| `pr_generator_scan_errors_total` | Counter | `provider` | Errors during branch fetch or PR creation |
| `pr_generator_rules_active` | Gauge | — | Rules active in the current cycle |
| `pr_generator_annotation_rules_discovered` | Gauge | — | Rules discovered from ArgoCD annotations in last cycle |
The `provider` label value is the key name from `config.providers` (e.g. `github`, `my-bitbucket`).
### Scraping
```bash
curl http://localhost:8080/metrics
```
### Helm chart — Prometheus Operator
```yaml
metrics:
  enabled: true
  serviceMonitor:
    enabled: true # creates a ServiceMonitor resource
    interval: 30s
    labels:
      release: kube-prometheus-stack # match your Operator's serviceMonitorSelector
```
### Programmatic API
```python
from prometheus_client import CollectorRegistry
from pr_generator.metrics import PrGeneratorMetrics
# Isolated registry (useful in tests)
m = PrGeneratorMetrics(registry=CollectorRegistry())
m.record_annotation_rules(3)
print(m.generate_latest().decode())
```
---
## Docker
The image is built from a two-stage Dockerfile:
- **Stage 1** – installs Python dependencies into `/install`.
- **Stage 2** – minimal `python:3.14-slim` runtime; runs as a non-root user (`prgen`).
```bash
# Build
docker build -t pr-generator .
# Run with YAML config
docker run --rm \
  -v "$(pwd)/config.yaml:/etc/pr-generator/config.yaml:ro" \
  -v "$(pwd)/github-app.pem:/secrets/github-app.pem:ro" \
  -e BITBUCKET_TOKEN="$BITBUCKET_TOKEN" \
  -p 8080:8080 \
  pr-generator
```
---
## Development
**Prerequisites**: Python ≥ 3.11
```bash
# Create and activate a virtual environment
python -m venv .venv
source .venv/bin/activate
# Install the package in editable mode, then pytest for the test suite
pip install -e .
pip install pytest
# Run tests
pytest
# Run with a local config
CONFIG_PATH=./config.yaml python -m pr_generator
```
**Project layout**
```
src/pr_generator/
├── __main__.py # Entry point: startup, provider init, scan loop
├── config.py # Config loader (YAML → AppConfig)
├── models.py # Dataclasses: AppConfig, ProviderConfig, ScanRule, …
├── scanner.py # Concurrent scan cycle orchestrator
├── health.py # HTTP health + metrics server (/livez, /healthz, /readyz, /metrics)
├── http_client.py # Shared HTTP client with retry/backoff
├── annotation_discovery.py # Kubernetes annotation-based rule discovery
├── logging_config.py # Logging setup (plain text or structured JSON)
├── metrics.py # Prometheus metrics (PrGeneratorMetrics)
└── providers/
    ├── base.py # ProviderInterface Protocol
    ├── github.py # GitHub App provider
    └── bitbucket.py # Bitbucket Cloud provider
tests/
├── conftest.py # Shared pytest fixtures
├── test_annotation_discovery.py # Annotation discovery tests
├── test_config.py # Config loading tests
├── test_health.py # Health server tests
├── test_metrics.py # Prometheus metrics tests
├── test_models.py # Model tests
└── test_scanner.py # Scan cycle tests
```
---
## Troubleshooting
### Application exits with `FileNotFoundError`
```
FileNotFoundError: [Core] private_key_path '/secrets/github-app.pem' does not exist.
```
`private_key_path` is set in `config.yaml` but the file is not present at that path.
Either mount the PEM file at the configured path, or remove `private_key_path` from
the config and set the `GITHUB_APP_PRIVATE_KEY` environment variable instead.
### `ValueError: duplicate tokenEnv`
```
ValueError: [Core] Providers 'bb-eu' and 'bb-us' both use tokenEnv 'BITBUCKET_TOKEN'.
```
Two enabled providers of the same type share the same `token_env` value. Assign a
unique env var name to each provider and export the corresponding variable in your
runtime environment.
### `/readyz` returns `503`
This is expected during startup. The endpoint returns `503 not ready` until the first
full scan cycle completes. If it never flips to `200`, check the application logs for
errors in the scan cycle (API auth failures, missing config fields, network issues).
### No PRs are created (dry_run is false, branches exist)
1. **Regex anchoring** — rules use `re.match`, which is anchored at the start of the
string. A pattern `feature/.*` will **not** match `hotfix/feature/x`. Enable
`log_level: DEBUG` to see per-branch matching decisions.
2. **Provider name mismatch** — the name in `rules[].destinations` must exactly match
the provider key under `providers:`.
3. **Destination branch excluded** — pr-generator skips branches whose name equals the
destination branch to avoid self-targeting PRs.
### GitHub App: `RuntimeError: Could not resolve installation id`
Set `installation_id` explicitly in the provider config (find it in your GitHub App
settings under _Installations_), or ensure the GitHub App is installed on the target
repository.