https://github.com/ianlintner/caretaker
Agentic Repo Maintenance
agentic caretaker copilot devops llm python
- Host: GitHub
- URL: https://github.com/ianlintner/caretaker
- Owner: ianlintner
- Created: 2026-04-13T20:35:13.000Z (2 months ago)
- Default Branch: main
- Last Pushed: 2026-05-06T10:36:15.000Z (about 2 months ago)
- Last Synced: 2026-05-06T12:27:48.584Z (about 2 months ago)
- Topics: agentic, caretaker, copilot, devops, llm, python
- Language: Python
- Homepage: https://ianlintner.github.io/caretaker/
- Size: 7.31 MB
- Stars: 1
- Watchers: 0
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Codeowners: .github/CODEOWNERS
- Agents: docs/agents.md
README
# Caretaker
Autonomous GitHub repository management powered by Copilot and a GitHub App.

Documentation: https://ianlintner.github.io/caretaker/
**One issue. No CLI. No tooling.** Paste a setup issue into your repo, assign it to `@copilot`, walk away. Your repo is now autonomously maintained.
---
## How It Works
1. **You** paste a setup issue into your repo and assign it to `@copilot`
2. **Copilot** reads our [SETUP_AGENT.md](setup-templates/SETUP_AGENT.md), analyzes your repo, and opens a PR with everything configured
3. **You** merge the PR
4. **The orchestrator** runs daily via GitHub Actions, managing PRs, issues, and upgrades
The orchestrator uses Copilot as its execution engine — it observes your repo state, decides what needs to happen, and delegates code changes to Copilot via structured comments.
---
## Setup
### 1. Create a new issue in your repo:
> **Tip:** Visit the [Getting Started docs](https://ianlintner.github.io/caretaker/getting-started/) and use the **copy** button on the code block below to copy the issue template in one click.
```markdown
## Setup Caretaker
@copilot Please set up the caretaker system for this repository.
### Instructions
1. Read the setup guide at:
https://github.com/ianlintner/caretaker/blob/main/setup-templates/SETUP_AGENT.md
2. Follow the instructions in that guide exactly.
3. After creating all files, open a single PR with the changes.
Title: "chore: setup caretaker"
### Context
This repo uses the caretaker system for automated repo management.
See: https://github.com/ianlintner/caretaker
```
### 2. Assign the issue to `@copilot`
### 3. Review and merge the PR that Copilot opens
### 4. Add `COPILOT_PAT` from a write-capable user for Copilot hand-offs, and `ANTHROPIC_API_KEY` for enhanced AI features
`COPILOT_PAT` should be a fine-grained PAT that belongs to a real user or machine user with write access to the repository.
Caretaker uses that token for:
- API-based assignment of issues to GitHub Copilot
- PR comments that `@copilot` must see as coming from a write-capable identity rather than `github-actions[bot]`
---
## What Gets Installed
After setup, your repo has:
```
.github/
  copilot-instructions.md     ← Copilot project memory (appended)
  agents/
    maintainer-pr.md          ← PR agent persona
    maintainer-issue.md       ← Issue agent persona
    maintainer-upgrade.md     ← Upgrade agent persona
  maintainer/
    config.yml                ← Repo-specific settings
    .version                  ← Pinned version
```
No Python. No Node. No vendored code. **No GitHub Actions workflow either** — all
execution happens server-side, driven by App webhooks. Just config and Copilot
instructions.
---
## Features
### Coding backends
When an agent needs to make a code change, it routes through the
`ExecutorDispatcher`, which picks one of four backends per dispatch:
- **Copilot** — `@copilot` hand-off comment (the legacy default).
- **Foundry** — in-process LLM tool loop, drives Azure AI Foundry or
any LiteLLM-compatible provider directly from `mcp_backend`.
- **HandoffAgent** — tags the PR/issue and lets `claude-code-action`
or `opencode_local` GitHub Actions run asynchronously.
- **K8s Job** — durable per-task pod for long-running work; brokered
through Azure Service Bus and the `caretaker-job-dispatcher`
deployment.
Three labels override the dispatcher's choice for an individual issue
or PR: `agent:custom`, `agent:copilot`, and `agent:quarantine`.
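
As a rough illustration of how a per-item label override could sit in front of the default backend choice, here is a minimal sketch. The `Backend` enum, `CUSTOM_BACKEND`, and `choose_backend` are hypothetical names, not caretaker's API. The precedence order (quarantine first) and the mapping of `agent:custom` to Foundry are assumptions.

```python
from enum import Enum
from typing import Optional, Set


class Backend(Enum):
    COPILOT = "copilot"
    FOUNDRY = "foundry"
    HANDOFF_AGENT = "handoff_agent"
    K8S_JOB = "k8s_job"


# Assumption: which backend "agent:custom" resolves to is configurable;
# Foundry is used here purely for illustration.
CUSTOM_BACKEND = Backend.FOUNDRY


def choose_backend(labels: Set[str], default: Backend) -> Optional[Backend]:
    """Return the backend for one item, or None when dispatch is refused."""
    if "agent:quarantine" in labels:
        return None  # refuse dispatch entirely
    if "agent:copilot" in labels:
        return Backend.COPILOT  # force the legacy @copilot path
    if "agent:custom" in labels:
        return CUSTOM_BACKEND  # force the custom executor
    return default  # no override label: keep the dispatcher's pick
```

Checking `agent:quarantine` first means a conflicting label can never re-enable dispatch for an item an operator has blocked.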
### Core Agents
These eleven agents handle the day-to-day repository workload. Seven
additional specialist agents (review, principal, refactor, perf,
migration, test, bootstrap) are documented in
[docs/agents.md](https://ianlintner.github.io/caretaker/agents/).
#### PR Agent
- Monitors all open PRs in real time
- Detects and triages CI failures (test, lint, build, type errors)
- Requests fixes from Copilot via structured comments
- Retry loop with escalation after max attempts
- Auto-merge for Copilot, Dependabot, and human PRs (configurable)
- Handles flaky test detection and CI re-runs
- Review state analysis and auto-approval (configurable)
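
The retry-then-escalate behavior can be pictured with a small sketch. `fix_with_retries` and its `request_fix` callback are hypothetical stand-ins for posting a fix request and waiting for CI; the default of 2 mirrors the `max_retries` value in the sample configuration, and the return strings are illustrative only.

```python
from typing import Callable


def fix_with_retries(request_fix: Callable[[int], bool], max_retries: int = 2) -> str:
    """Request a fix up to max_retries times, then escalate.

    request_fix(attempt) is a hypothetical callback that asks the coding
    backend for a fix and returns True if CI passes afterwards.
    """
    for attempt in range(1, max_retries + 1):
        if request_fix(attempt):
            return f"fixed on attempt {attempt}"
    # Every attempt failed: hand the PR to a human instead of looping forever.
    return "escalated"
```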
#### Issue Agent
- Triages incoming issues (bug, feature, question, duplicate, stale)
- Dispatches implementable issues through the configured coding backend
- Tracks issue → PR → merge lifecycle
- Auto-closes answered questions and stale issues (configurable)
- Escalates complex issues to repo owners
#### DevOps Agent
- Monitors default-branch CI failures
- Automatically creates fix issues for build/test failures
- Deduplicates similar issues with cooldown periods
- Routes fixes through the configured coding backend
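
One way to think about deduplication with a cooldown is a map from a failure fingerprint to the time an issue was last filed. The sketch below assumes that model; `IssueDeduper` and the fingerprint format are hypothetical and do not describe how caretaker actually stores this state.

```python
from datetime import datetime, timedelta
from typing import Dict


class IssueDeduper:
    """Suppress repeat issues for the same failure inside a cooldown window."""

    def __init__(self, cooldown: timedelta) -> None:
        self.cooldown = cooldown
        self._last_filed: Dict[str, datetime] = {}

    def should_file(self, fingerprint: str, now: datetime) -> bool:
        last = self._last_filed.get(fingerprint)
        if last is not None and now - last < self.cooldown:
            return False  # same failure filed recently: skip it
        self._last_filed[fingerprint] = now
        return True


deduper = IssueDeduper(cooldown=timedelta(hours=24))
start = datetime(2026, 1, 1)
first = deduper.should_file("build:main:test_login", start)
repeat = deduper.should_file("build:main:test_login", start + timedelta(hours=1))
```

Here `first` is `True` and `repeat` is `False`, because the second report arrives inside the 24-hour window.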
#### Self-Heal Agent
- Detects `mcp_backend` and dispatcher failures
- Creates self-diagnosis issues
- Reports bugs to upstream caretaker repository (configurable)
- Ensures the system can maintain itself
#### Security Agent
- Triages Dependabot alerts
- Monitors code scanning findings
- Tracks secret scanning alerts
- Filters by severity thresholds
- Creates remediation issues with context
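
Severity filtering amounts to comparing each alert against an ordered scale. The sketch below uses the four Dependabot severity levels (low, medium, high, critical) and assumes alerts are plain dictionaries with a `severity` key; `meets_threshold` and `filter_alerts` are hypothetical helpers, and code scanning findings may need a different mapping.

```python
from typing import Dict, List

# Ordered from least to most severe.
SEVERITY_ORDER = ["low", "medium", "high", "critical"]


def meets_threshold(severity: str, min_severity: str = "medium") -> bool:
    """True when severity is at or above min_severity."""
    return SEVERITY_ORDER.index(severity.lower()) >= SEVERITY_ORDER.index(min_severity.lower())


def filter_alerts(alerts: List[Dict], min_severity: str = "medium") -> List[Dict]:
    """Keep only the alerts worth a remediation issue."""
    return [a for a in alerts if meets_threshold(a["severity"], min_severity)]
```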
#### Dependency Agent
- Reviews Dependabot PRs
- Auto-merges patch and minor updates (configurable)
- Posts dependency update digests
- Smart merge strategies by update type
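
Merging by update type depends on classifying a version bump first. This is a minimal sketch that assumes plain `MAJOR.MINOR.PATCH` versions; `parse`, `update_type`, and `should_auto_merge` are hypothetical helpers, the keyword arguments mirror the `auto_merge_patch` and `auto_merge_minor` settings, and holding back major updates is an assumption.

```python
from typing import Tuple


def parse(version: str) -> Tuple[int, int, int]:
    """Parse 'v1.2.3' or '1.2.3' into integers (no pre-release handling)."""
    major, minor, patch = (int(part) for part in version.lstrip("v").split("."))
    return major, minor, patch


def update_type(old: str, new: str) -> str:
    """Classify a bump as 'major', 'minor', or 'patch'."""
    o, n = parse(old), parse(new)
    if n[0] != o[0]:
        return "major"
    if n[1] != o[1]:
        return "minor"
    return "patch"


def should_auto_merge(old: str, new: str,
                      auto_merge_patch: bool = True,
                      auto_merge_minor: bool = True) -> bool:
    kind = update_type(old, new)
    if kind == "patch":
        return auto_merge_patch
    if kind == "minor":
        return auto_merge_minor
    return False  # assumption: major updates always wait for a human
```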
#### Docs Agent
- Reconciles merged PRs into changelog updates
- Maintains documentation freshness
- Configurable lookback period
- Optional README updates
#### Charlie Agent
- Cleans up duplicate caretaker-managed issues and PRs
- Closes abandoned work after a 14-day default window
- Prevents operational clutter accumulation
- Exempt label support for critical work
#### Stale Agent
- Warns and closes stale issues and PRs (60+ days default)
- Deletes merged branches automatically
- Configurable stale thresholds
- Exempt labels for pinned or security work
#### Escalation Agent
- Creates human escalation digest issues
- Aggregates work requiring maintainer attention
- Configurable targets and notifications
- Tracks escalation age and priority
#### Upgrade Agent
- Detects new caretaker releases
- Creates upgrade issues for the configured coding backend to execute
- Supports multiple strategies: auto-minor, auto-patch, latest, pinned
- Handles breaking vs. non-breaking upgrades
- Version pinning via `.version` file
- Preview channel support
### Advanced Features
#### Goal Engine (Experimental)
- Quantitative goal-based agent dispatch
- Measures repository health across dimensions:
- CI health (green builds on main and PRs)
- PR lifecycle velocity
- Security posture
- Self-health monitoring
- Scores each goal from 0.0 (unmet) to 1.0 (satisfied)
- Prioritizes agents based on goal impact
- Detects divergence and critical states
- Tracks goal history for trend analysis
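
To make goal-driven prioritization concrete, the sketch below orders agents so that those serving the least-satisfied goals run first. The `Goal` dataclass, its `agents` field, and the ordering rule are assumptions for illustration; the actual scoring and dispatch logic may differ.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Goal:
    name: str
    score: float  # 0.0 = unmet, 1.0 = satisfied
    agents: Tuple[str, ...]  # hypothetical: agents able to improve this goal


def prioritize(goals: List[Goal]) -> List[str]:
    """Return agent names, lowest-scoring goal's agents first, no repeats."""
    ordered: List[str] = []
    for goal in sorted(goals, key=lambda g: g.score):
        for agent in goal.agents:
            if agent not in ordered:
                ordered.append(agent)
    return ordered


goals = [
    Goal("ci_health", 0.9, ("pr", "devops")),
    Goal("security_posture", 0.2, ("security",)),
    Goal("pr_velocity", 0.5, ("pr",)),
]
order = prioritize(goals)
```

With these scores, `order` is `["security", "pr", "devops"]`: the security goal is furthest from satisfied, so its agent goes first.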
#### Memory Store
- Disk-backed SQLite storage for agent memory
- Persistent deduplication across runs
- Namespaced memory for different agent concerns
- Automatic snapshot generation for auditing
- Bounded storage with configurable limits
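
A namespaced, bounded store with persistent deduplication maps naturally onto a single SQLite table. The following is a sketch using only the standard `sqlite3` module; the `MemoryStore` class, table layout, and oldest-first eviction policy are assumptions, not caretaker's schema.

```python
import sqlite3


class MemoryStore:
    """Namespaced key/value memory with a per-namespace size cap."""

    def __init__(self, db_path: str = ":memory:", max_entries_per_namespace: int = 1000) -> None:
        self.max_entries = max_entries_per_namespace
        self.conn = sqlite3.connect(db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS memory ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, "
            "namespace TEXT NOT NULL, key TEXT NOT NULL, value TEXT NOT NULL, "
            "UNIQUE(namespace, key))"
        )

    def remember(self, namespace: str, key: str, value: str) -> bool:
        """Store a key once; return False if it was already known."""
        cur = self.conn.execute(
            "INSERT OR IGNORE INTO memory (namespace, key, value) VALUES (?, ?, ?)",
            (namespace, key, value),
        )
        inserted = cur.rowcount == 1
        # Keep only the newest max_entries rows in this namespace.
        self.conn.execute(
            "DELETE FROM memory WHERE namespace = ? AND id NOT IN ("
            "SELECT id FROM memory WHERE namespace = ? ORDER BY id DESC LIMIT ?)",
            (namespace, namespace, self.max_entries),
        )
        self.conn.commit()
        return inserted

    def count(self, namespace: str) -> int:
        row = self.conn.execute(
            "SELECT COUNT(*) FROM memory WHERE namespace = ?", (namespace,)
        ).fetchone()
        return row[0]
```

Passing a file path such as `.caretaker-memory.db` instead of `:memory:` makes the deduplication state survive across runs.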
### Optional: Claude Integration
Add `ANTHROPIC_API_KEY` to unlock enhanced AI features:
- **CI log analysis** — better at parsing long, noisy logs
- **Architectural review** — understands complex code review comments
- **Issue decomposition** — breaks down multi-faceted bugs
- **Upgrade impact analysis** — assesses breaking change risk
### Optional: OpenRouter Integration
Set `OPENROUTER_API_KEY` (or its accepted alias `OPEN_ROUTER_API_KEY`)
and `provider: openrouter` in
`.github/maintainer/config.yml` to route LLM calls through
[OpenRouter](https://openrouter.ai), which gives you:
- **300+ models behind one key** — DeepSeek R1, Gemini, Llama, Qwen,
GLM, plus all the proprietary frontier models.
- **Per-feature model routing** — pin different caretaker features to
different best-fit models via `feature_models`.
- **Web-grounded analysis** — append `:online` to a model string to add
a web search step before the completion. Caretaker ships this as the
default for `upgrade_impact_analysis`, `migration_analysis`, and
`migration_plan` so release-note and breaking-change context comes
from current sources rather than stale model knowledge.
Sample config:
```yaml
llm:
  provider: openrouter
  default_model: openrouter/anthropic/claude-sonnet-4.6
  feature_models:
    ci_log_analysis:
      model: openrouter/deepseek/deepseek-r1
    principal_architecture_review:
      model: openrouter/anthropic/claude-opus-4.6
```
**Cost note:** `:online` adds OpenRouter's web-search step
(~$4 per 1k searches) on top of the model call. The
`caretaker.llm.online=true` OTel span attribute lets you break out
web-grounded spend in cost dashboards.
When `provider: openrouter` is set, every model string must begin
with `openrouter/`. Caretaker rejects bare model names at config-load
time to prevent LiteLLM from silently bypassing OpenRouter and calling
Anthropic directly.
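
The prefix rule can be illustrated with a small config-load check. This sketch assumes the `llm` section has already been parsed into a dictionary shaped like the sample config; `validate_models` is a hypothetical helper, not caretaker's validator.

```python
def validate_models(llm_config: dict) -> None:
    """Raise ValueError if provider is openrouter and a model lacks the prefix."""
    if llm_config.get("provider") != "openrouter":
        return  # the rule only applies to the OpenRouter provider

    # Collect every model string together with its config path.
    models = {"default_model": llm_config.get("default_model")}
    for feature, settings in llm_config.get("feature_models", {}).items():
        models[f"feature_models.{feature}"] = settings.get("model")

    bad = [path for path, model in models.items()
           if model and not model.startswith("openrouter/")]
    if bad:
        raise ValueError("model strings must begin with 'openrouter/': " + ", ".join(bad))
```

Failing at load time surfaces the mistake before any LLM call is made, rather than after requests have quietly gone to a different provider.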
---
## What's new
### Fleet registry (opt-in)
Each consumer repo's successful `caretaker run` can POST a small
heartbeat to a central caretaker backend so an operator sees every
managed repository in one dashboard — without running an org-wide
GitHub crawl.
Enable in `.github/maintainer/config.yml`:
```yaml
fleet_registry:
  enabled: true
  endpoint: https:///api/fleet/heartbeat
```
See [docs/fleet-registry.md](docs/fleet-registry.md) for architecture,
payload shape, and HMAC-signed delivery.
### Custom coding agent
Small tasks (lint fixes, trivial test failures, review comments) no
longer have to go to `copilot-swe-agent[bot]`. A configurable
executor routes them to caretaker's own Foundry tool-loop or to an
`anthropics/claude-code-action` hand-off, with a size-budget guard
and an explicit escalation path back to Copilot.
Three routing labels let operators steer individual items:
- `agent:custom` — force the custom executor.
- `agent:copilot` — force the legacy path.
- `agent:quarantine` — refuse dispatch (for hostile or confusing issues).
On AKS deployments, the MCP backend exposes
`POST /api/admin/agent-tasks` which spawns a short-lived
`batch/v1 Job` per dispatch. See
[docs/custom-coding-agent-plan.md](docs/custom-coding-agent-plan.md)
for the full design, phased rollout, size budget, and security model;
[docs/custom-coding-agent-e2e.md](docs/custom-coding-agent-e2e.md)
for the operator runbook.
---
## Configuration
See [setup-templates/templates/config-default.yml](setup-templates/templates/config-default.yml) for the full config schema.
Key settings:
```yaml
pr_agent:
  auto_merge:
    copilot_prs: true               # Auto-merge Copilot PRs
    dependabot_prs: true            # Auto-merge dependency updates
  copilot:
    max_retries: 2                  # Fix attempts before escalation
issue_agent:
  auto_assign_bugs: true            # Auto-assign simple bugs to Copilot
  auto_assign_features: false
devops_agent:
  target_branch: main               # Monitor default branch CI
  max_issues_per_run: 3             # Prevent issue spam
  dedup_open_issues: true
security_agent:
  min_severity: medium              # Filter by severity
  include_dependabot: true
  include_code_scanning: true
  include_secret_scanning: true
dependency_agent:
  auto_merge_patch: true
  auto_merge_minor: true
  post_digest: true
charlie_agent:
  stale_days: 14                    # Short janitorial window for caretaker-managed work
  close_duplicate_issues: true
  close_duplicate_prs: true
stale_agent:
  stale_days: 60                    # General stale threshold
  close_after: 14
  delete_merged_branches: true
upgrade_agent:
  strategy: auto-minor              # auto-minor | auto-patch | latest | pinned
  channel: stable                   # stable | preview
goal_engine:
  enabled: false                    # Experimental: goal-driven dispatch
  goal_driven_dispatch: false       # Reorder agents by goal impact
  divergence_threshold: 3           # Runs before triggering alerts
memory_store:
  enabled: true                     # Persistent agent memory
  db_path: .caretaker-memory.db
  max_entries_per_namespace: 1000
```
---
## Architecture
The orchestrator runs server-side on AKS, not in your repo. Three
deployable processes split the work:
```
GitHub App webhooks
  │
  ▼
mcp_backend (FastAPI x2, AKS)
  │   ├── HMAC + allow-list
  │   ├── dedup + rate-limit
  │   └── Redis Streams ──► agent router ──► ExecutorDispatcher
  │                                                  │
  │                                                  ├──► Copilot @-mention (legacy)
  │                                                  ├──► Foundry (in-process LLM tool loop)
  │                                                  ├──► HandoffAgent (opencode_local / claude-code-action)
  │                                                  └──► Azure Service Bus ──► caretaker-job-dispatcher
  │                                                                                        │
  │                                                                                        ▼
  │                                                                                per-task K8s Job
  │                                                                                        │
  ▼                                                                                        ▼
MongoDB / Cosmos · Neo4j · SQLite                                                git push + PR comment
```
Eighteen agents live behind the dispatcher, grouped by trigger
(event-driven, scheduled, dispatch-time / advisory). The orchestrator
**never writes code itself** — it routes to one of four coding backends.
For diagrams of the runtime topology, webhook event pipeline, durable
coding-job lifecycle, and full agent inventory, see
[docs/architecture.md](https://ianlintner.github.io/caretaker/architecture/).
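
The first stages of the webhook pipeline (HMAC check, allow-list, dedup) can be sketched with the standard library alone. GitHub signs each delivery body with HMAC-SHA256 and sends the result in the `X-Hub-Signature-256` header as `sha256=<hex digest>`, and each delivery carries a unique `X-GitHub-Delivery` ID. The `signature_is_valid` function and `DeliveryFilter` class are hypothetical names; the in-memory set stands in for whatever store the backend really uses.

```python
import hashlib
import hmac
from typing import Set


def signature_is_valid(secret: bytes, body: bytes, signature_header: str) -> bool:
    """Check a GitHub X-Hub-Signature-256 header against the raw request body."""
    expected = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking the match position through timing.
    return hmac.compare_digest(expected, signature_header)


class DeliveryFilter:
    """Allow-list repositories and drop deliveries that were already seen."""

    def __init__(self, allowed_repos: Set[str]) -> None:
        self.allowed_repos = allowed_repos
        self._seen: Set[str] = set()  # assumption: in-memory for illustration

    def accept(self, delivery_id: str, repo_full_name: str) -> bool:
        if repo_full_name not in self.allowed_repos:
            return False  # not an onboarded repository
        if delivery_id in self._seen:
            return False  # redelivery of an event already queued
        self._seen.add(delivery_id)
        return True
```

Only events that pass both checks would be placed on the stream for the agent router.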
---
## Development
```bash
# Clone and install
git clone https://github.com/ianlintner/caretaker.git
cd caretaker
pip install -e ".[dev]"
# Run tests
pytest tests/ -v
# Lint
ruff check src/ tests/
ruff format --check src/ tests/
# Type check
mypy src/
```
---
## License
MIT