{"id":50677617,"url":"https://github.com/sctg-development/backport-agent","last_synced_at":"2026-06-08T16:34:23.163Z","repository":{"id":361207071,"uuid":"1250134067","full_name":"sctg-development/backport-agent","owner":"sctg-development","description":null,"archived":false,"fork":false,"pushed_at":"2026-05-29T15:50:16.000Z","size":166,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-29T17:11:13.344Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sctg-development.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-05-26T10:38:23.000Z","updated_at":"2026-05-29T15:50:20.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/sctg-development/backport-agent","commit_stats":null,"previous_names":["sctg-development/backport-agent"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/sctg-development/backport-agent","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sctg-development%2Fbackport-agent","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sctg-development%2Fbackport-agent/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sctg-development%2Fbackport-agent/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/h
osts/GitHub/repositories/sctg-development%2Fbackport-agent/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sctg-development","download_url":"https://codeload.github.com/sctg-development/backport-agent/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sctg-development%2Fbackport-agent/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34071656,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-08T02:00:07.615Z","response_time":111,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-06-08T16:34:22.272Z","updated_at":"2026-06-08T16:34:23.148Z","avatar_url":"https://github.com/sctg-development.png","language":"TypeScript","funding_links":[],"categories":[],"sub_categories":[],"readme":"[![Npm package version](https://badgen.net/npm/v/@sctg/backport-agent)](https://npmjs.com/package/@sctg/backport-agent)[![TypeScript](https://badgen.net/badge/icon/typescript?icon=typescript\u0026label)](https://typescriptlang.org)\n# Backport Agent\n\nA deterministic, AI-powered agent for keeping a heavily customized Git fork in sync with an active upstream repository.\n\nThis project implements the architecture described in [analysis.md](analysis.md). 
It is designed for the real-world case where a fork is not just a few patches on top of upstream, but a living codebase with custom providers, build-time rewrites, documentation changes, and operational workflows that must survive every sync.\n\n## Why this exists\n\nKeeping a fork aligned with upstream is hard when the fork carries important product decisions. A naive merge strategy can silently break custom behavior even when Git reports no conflicts.\n\nBackport Agent focuses on the parts that matter most:\n\n- identify the upstream commits that still need to be integrated;\n- classify change risk before touching the fork;\n- preserve fork-specific customizations;\n- run validation after each meaningful integration step;\n- produce a clear report instead of pushing blind changes.\n\n## What it does\n\nThe agent works as a sync pipeline rather than a one-shot merge bot. It reads the upstream history, selects candidate commits, evaluates their risk, applies them in controlled batches, validates the result, and generates a report for review.\n\nIt is built to support forks that include features such as:\n\n- custom LLM providers (e.g. 
`keypoollive` with encrypted vault-backed key rotation);\n- build-time package renaming and CI workflow customizations;\n- documentation generation pipelines;\n- a local reporting and validation workflow.\n\n## How it is structured\n\nThe codebase is intentionally split into small, testable pieces:\n\n- `src/git` handles Git operations and cherry-pick workflows;\n- `src/risk` classifies commits and customization sensitivity;\n- `src/validation` runs allowlisted validation commands;\n- `src/github` manages pull request creation and metadata;\n- `src/reports` assembles the final sync report;\n- `src/ai` exposes analysis helpers used when deterministic logic is not enough;\n- `src/config` and `src/customizations` load the sync configuration and fork-specific invariants.\n\nThe agent entry point is [src/main.ts](src/main.ts), which wires the sync flow together and also enables the built-in SDK tools used by the runtime.\n\n## Getting started\n\n1. Install dependencies.\n\n```bash\nnpm install\n```\n\n2. Copy the example configuration files and adjust them for your environment.\n\n- [config.example.json](config.example.json)\n- [customizations.example.yaml](customizations.example.yaml)\n\n3. Set the required provider credentials in your shell or `.env` file.\n\nFor the **keypoollive** provider (vault-based key rotation):\n\n```bash\nKEYPOOL_VAULT_URL=https://...\nKEYPOOL_LIVE_SECRET=...\n```\n\nFor any other provider supported by `@sctg/cline-sdk`, set the corresponding API key:\n\n```bash\nANTHROPIC_API_KEY=sk-ant-...\n# or OPENAI_API_KEY=sk-..., MISTRAL_API_KEY=..., etc.\n```\n\nYou can also override the provider or API key at runtime without editing `config.json`:\n\n```bash\nnpm start -- --provider anthropic --api-key sk-ant-...\n```\n\n4. 
Start the agent.\n\n```bash\nnpm start\n```\n\nIf you want a no-op run that still exercises the workflow, use:\n\n```bash\nnpm run dry-run\n```\n\nSet `VERBOSE=true` to see detailed iteration and tool-call progress in stderr:\n\n```bash\nVERBOSE=true npm start\n```\n\n## Retry behavior\n\nThe agent includes automatic retry logic for transient provider errors (rate limits, overloaded endpoints, high-demand responses, HTTP 503, etc.). When a retriable error is detected, the agent waits with an increasing backoff (15 s, 30 s, 45 s…) and restarts up to 5 times.\n\nBecause agent state is anchored to Git, restarting is safe: already-applied commits are detected from the git log and skipped automatically.\n\nThe iteration counter in verbose output is continuous across retries. A retry is indicated by a suffix in the progress lines:\n\n```\n--- iteration 16 ---\n[Retry] Silent provider error on attempt 1/5: This model is currently experiencing high demand…\n[Retry] Waiting 15s before retrying...\n--- iteration 17 - Retry 1 ---\n--- iteration 18 - Retry 1 ---\n```\n\n## Validation and tests\n\nThe repository includes both unit and integration coverage.\n\n- `npm run typecheck` checks the TypeScript build.\n- `npm test` runs the full test suite.\n- `npm run test:unit` runs fast deterministic tests.\n- `npm run test:integration` runs integration tests, including real KeypoolLive calls when your vault is configured.\n\nThe integration suite is intentionally practical. It verifies Git behavior in temporary repositories and, when `.env` is available, exercises real SDK tools against a configured provider (by default `keypoollive` with the `mistral/devstral-latest` model).\n\n## Configuration\n\nThe main runtime configuration lives in a JSON file modeled after [config.example.json](config.example.json). 
It defines:\n\n- the upstream repository and branch;\n- the fork repository and branch;\n- the working directory;\n- the LLM provider and model selection (`provider`, `fast`, `specialist`, `powerful`);\n- sync limits and batching;\n- validation tiers.\n\nThe `provider` field in the `models` section is required. It accepts any provider ID supported by `@sctg/cline-sdk` (e.g. `\"keypoollive\"`, `\"anthropic\"`, `\"openai\"`, `\"mistral\"`, `\"gemini\"`). The API key is resolved from the `apiKey` field, a `$ENV_VAR` reference, or the implicit `{PROVIDER_UPPER}_API_KEY` environment variable.\n\n### `sync.prNumberMatching` — Manual backport detection (optional)\n\nBy default, the agent detects already-applied commits using three signals: `git cherry` patch comparison, exact subject-line match, and the `cherry picked from commit \u003csha\u003e` annotation added by `git cherry-pick -x`.\n\nWhen a commit is cherry-picked manually (conflict resolution, subject rewrite, no `-x` flag), all three signals can miss it. Enabling `prNumberMatching` adds a fourth signal: if a fork commit references the same upstream PR number **and** the two subjects are similar enough (Jaccard word-token score), the commit is considered already applied.\n\n```json\n\"sync\": {\n  \"prNumberMatching\": {\n    \"enabled\": true,\n    \"minSubjectSimilarity\": 0.4\n  }\n}\n```\n\n| Field | Default | Description |\n|---|---|---|\n| `enabled` | `false` | Activate PR-number-based duplicate detection. |\n| `minSubjectSimilarity` | `0.4` | Minimum Jaccard word-token similarity (0–1) between the upstream subject and the matching fork subject. Lower → more permissive (risk of false positives). Higher → stricter (may miss heavily reworded backports). 
|\n\n**Example:** upstream commit `Move \\`sdk/apps/\\` to \\`apps/\\` (#11200)` is detected as already applied when the fork contains `feat(backport): Move sdk/apps/ to apps/ (cline#11200)` — the PR number matches and the similarity score (~0.67) exceeds the default threshold.\n\nEnable this only when your team consistently includes the upstream PR number in manual backport commit messages.\n\n### `ai` section — Quality guardrails (optional)\n\nThe optional `ai` section configures the AI quality guardrails introduced to improve backport reliability. All fields have safe defaults and the section can be omitted entirely.\n\n```json\n\"ai\": {\n  \"minAutoApplyConfidence\": \"medium\",\n  \"requireReviewOnSemanticRisk\": false,\n  \"enableConflictConsensus\": false,\n  \"conflictConsensusThreshold\": 0.7,\n  \"enrichCustomizationContext\": true\n}\n```\n\n| Field | Default | Description |\n|---|---|---|\n| `minAutoApplyConfidence` | `\"medium\"` | Minimum AI confidence level (`\"high\"` or `\"medium\"`) to auto-apply a conflict resolution. Use `\"high\"` for stricter auto-apply. |\n| `requireReviewOnSemanticRisk` | `false` | When `true`, any commit carrying semantic risk factors is escalated to `\"review-required\"` by `reconcile_ai_assessments`, regardless of the individual AI recommendations. |\n| `enableConflictConsensus` | `false` | **Opt-in.** Runs a second, independent conflict resolution using `config.models.powerful` and compares both outputs with a Dice-coefficient similarity score. If the two resolutions diverge below `conflictConsensusThreshold`, confidence is downgraded to `\"low\"`. Enabling this roughly doubles LLM cost per conflict. |\n| `conflictConsensusThreshold` | `0.7` | Minimum line-level similarity (0–1) required for consensus. Only used when `enableConflictConsensus: true`. 
|\n| `enrichCustomizationContext` | `true` | When `true`, `check_customization_compatibility` reads up to 2 source files matching each customization glob (2 000 chars each) and injects their content into the AI prompt for richer analysis. |\n\n### AI sub-agent tools\n\nThe `src/ai` module exposes four tools that the main agent invokes when deterministic logic is not enough.\n\n| Tool | Type | Purpose |\n|---|---|---|\n| `resolve_conflict_with_ai` | LLM call | Resolves merge conflicts in a single file using the configured `specialist` model. Returns `resolvedContent`, `confidence` (`\"high\"` / `\"medium\"` / `\"low\"`), and `reasoning`. Guards: conflict-marker detection, syntax balance check (JS/TS), optional dual-model consensus. |\n| `analyze_commit_for_backport` | LLM call | Analyzes a commit diff to produce a summary, key changes, complexity estimate, semantic risk factors, and a backport `recommendation`. Also runs hallucination detection on referenced file paths. |\n| `check_customization_compatibility` | LLM call | Checks whether a set of changes is compatible with the fork's declared customizations. Optionally enriches the prompt with actual file content when `ai.enrichCustomizationContext` is enabled. |\n| `reconcile_ai_assessments` | Deterministic | **No LLM call.** Combines the outputs of the two analysis tools into a single `finalRecommendation`. Detects contradictions (e.g. analyze said \"apply\" but compatibility check failed), applies `requireReviewOnSemanticRisk` escalation, and always resolves ambiguity conservatively. Call this after both analysis tools have run for the same commit. |\n\nEvery LLM call is logged to the run's `.prompts.jsonl` file alongside structured quality signals (guards triggered, confidence, hallucination suspects). 
The detailed report includes a **Decision Quality Metrics** section summarising these signals across the full run.\n\n#### Benchmark replay\n\nThe `src/tools/benchmark-replay.ts` script lets you compare two models side-by-side without running a full sync against a real repository. It reads an existing `.prompts.jsonl` log, replays every LLM call with the alternative model, and prints a Markdown comparison report.\n\n```bash\nnpx tsx src/tools/benchmark-replay.ts \\\n  --log run-1780060224987.prompts.jsonl \\\n  --model anthropic/claude-sonnet-4-5 \\\n  --provider anthropic \\\n  --api-key \"$ANTHROPIC_API_KEY\" \u003e comparison.md\n```\n\nCustom fork invariants live in a YAML file modeled after [customizations.example.yaml](customizations.example.yaml). This is where you describe the areas that must not be broken by a backport run.\n\n\n\n## For contributors\n\nContributions are especially welcome in the following areas:\n\n- additional integration tests for more SDK tools and runtime behaviors;\n- stronger customization detection and risk classification;\n- better report formatting and human-review summaries;\n- more realistic validation strategies for large forks;\n- documentation improvements and onboarding examples;\n- support for additional providers or model-routing strategies.\n\nIf you are looking for a good first contribution, start with tests or documentation. The project already has a deterministic core, so incremental improvements are easy to verify.\n\n## Design principles\n\nThis project intentionally avoids the “merge everything and hope” approach. The main design goals are:\n\n- preserve the fork’s intent;\n- keep changes small and reviewable;\n- use deterministic logic first;\n- use AI only where it adds clear value;\n- fail safely when confidence is low.\n\nThat makes the agent more useful for real maintenance work and easier for contributors to reason about.\n\n## License\n\nMIT License. 
See [LICENSE.md](LICENSE.md) for details.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsctg-development%2Fbackport-agent","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsctg-development%2Fbackport-agent","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsctg-development%2Fbackport-agent/lists"}