{"id":48671558,"url":"https://github.com/pcc-labs/sweeper","last_synced_at":"2026-05-16T11:06:28.339Z","repository":{"id":344633838,"uuid":"1180862796","full_name":"papercomputeco/sweeper","owner":"papercomputeco","description":"Multi threaded code maintenance with resource isolated subagents.","archived":false,"fork":false,"pushed_at":"2026-03-24T02:35:07.000Z","size":1541,"stargazers_count":7,"open_issues_count":1,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-03-25T02:23:37.344Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/papercomputeco.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":"AGENTS.md","dco":null,"cla":null}},"created_at":"2026-03-13T13:47:34.000Z","updated_at":"2026-03-24T02:35:11.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/papercomputeco/sweeper","commit_stats":null,"previous_names":["papercomputeco/sweeper"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/papercomputeco/sweeper","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/papercomputeco%2Fsweeper","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/papercomputeco%2Fsweeper/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/papercomputeco%2Fsweeper/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/papercomputeco%2Fsweeper/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/papercomputeco","download_url":"https://codeload.github.com/papercomputeco/sweeper/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/papercomputeco%2Fsweeper/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31642659,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-10T07:40:12.752Z","status":"ssl_error","status_checked_at":"2026-04-10T07:40:11.664Z","response_time":98,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-04-10T12:31:09.402Z","updated_at":"2026-05-16T11:06:28.290Z","avatar_url":"https://github.com/papercomputeco.png","language":"Go","funding_links":[],"categories":[],"sub_categories":[],"readme":"# 🧹 Sweeper Agent\n\nMulti-threaded code maintenance with resource isolated subagents and swappable AI providers.\n\nSweeper dispatches parallel AI agents to fix lint issues across your codebase, each running in its own isolated environment. Providers are swappable: use Claude Code (default), OpenAI Codex, or local models via Ollama. It groups issues by file, fans out concurrent fixes, escalates strategy when fixes stall, and records outcomes so it learns what works. With VM isolation enabled, each sub-agent runs inside a dedicated stereOS virtual machine with its own CPU, memory, and secrets boundary, safe to scale to 10+ concurrent agents.\n\n```\n                        sweeper run --vm -c 5\n                              │\n                    ┌─────────┼─────────┐\n                    ▼         ▼         ▼\n              ┌──────────────────────────────┐\n              │        Worker Pool           │\n              │ (rate-limited, max N=5)      │\n              └──┬───┬───┬───┬───┬──────────┘\n                 │   │   │   │   │\n                 ▼   ▼   ▼   ▼   ▼\n               ┌───┐┌───┐┌───┐┌───┐┌───┐\n               │VM ││VM ││VM ││VM ││VM │  ◄── stereOS isolation\n               │ 1 ││ 2 ││ 3 ││ 4 ││ 5 │      (secrets, CPU, memory)\n               └─┬─┘└─┬─┘└─┬─┘└─┬─┘└─┬─┘\n                 │     │     │     │     │\n                 ▼     ▼     ▼     ▼     ▼\n              claude  claude claude claude claude\n              --print --print --print --print --print\n                 │     │     │     │     │\n                 └─────┴──┬──┴─────┴─────┘\n                          │\n                    ┌─────┼───────────┐\n                    ▼     ▼           ▼\n               streaming telemetry  tapes\n               progress  (.jsonl)  (SQLite)\n```\n\nEach sub-agent works on a single file. Results stream back as they complete, giving real-time progress instead of blocking until the entire round finishes.\n\n## Why Sub-Agents\n\nThe main thread never reads or edits source files. It runs the linter, builds prompts, dispatches work, and collects results. All file-level reasoning happens inside sub-agents via `claude --print`, which are stateless, single-shot processes.\n\nThis matters because the orchestrator's context window stays small and predictable. It holds lint output, task metadata, and result summaries, not the contents of every file being fixed. A run that touches 50 files uses roughly the same orchestrator context as one that touches 5. The complexity scales in parallelism, not in context size.\n\n```\n  Orchestrator (main thread)              Sub-agents (disposable)\n  ┌────────────────────────┐\n  │ lint output            │              ┌──────────────────────┐\n  │ file groupings         │  ──dispatch──▶ claude --print       │\n  │ strategy decisions     │              │  reads auth.go       │\n  │ result summaries       │  ◀──result── │  writes fix          │\n  │                        │              └──────────────────────┘\n  │ (never sees file       │              ┌──────────────────────┐\n  │  contents directly)    │  ──dispatch──▶ claude --print       │\n  │                        │              │  reads router.go     │\n  │                        │  ◀──result── │  writes fix          │\n  └────────────────────────┘              └──────────────────────┘\n```\n\nSub-agents are fire-and-forget. Each one gets a prompt with the lint issues for its file, does the work, and exits. If it fails, the orchestrator knows from the exit code and can retry with an escalated strategy on the next round. No conversation state carries over between rounds, which keeps each attempt clean.\n\n## Setup\n\n### Go CLI (standalone)\n\nThe core binary. All integrations below (except Pi) require this.\n\n```bash\ngo install github.com/papercomputeco/sweeper@latest\nsweeper run                              # default: golangci-lint with claude\nsweeper run --provider codex             # use OpenAI Codex CLI instead\nsweeper run --provider ollama --model qwen2.5-coder:7b  # local model via Ollama\nsweeper run --vm -c 3 --max-rounds 3    # VM isolation, 3 agents, 3 rounds\nsweeper run -- npm run lint              # any linter\nsweeper observe                          # review success rates + token spend\n```\n\n### Claude Code\n\nTo use sweeper as a skill in [Claude Code](https://docs.anthropic.com/en/docs/claude-code):\n\n1. Build the binary:\n```bash\ngo build -o sweeper .\nexport PATH=\"$PWD:$PATH\"\n```\n\n2. Copy the skill into your project:\n```bash\ncp -r skills/sweeper/ /path/to/your-project/.claude/skills/sweeper/\n```\n\n3. Tell Claude: \"Run sweeper on this project\"\n\nClaude will orchestrate `sweeper run` with the right flags based on your project.\n\n\u003e **Plugin support:** This repo includes a `.claude-plugin/` manifest for distribution as a Claude Code plugin, but it is not yet published to the plugin marketplace. If there is community interest, I am happy to submit it.\n\n### opencode\n\n\u003e **Note:** opencode and Pi integrations have only been lightly tested. Claude Code is the primary development and testing target.\n\nTo use sweeper as a skill in [opencode](https://opencode.ai) (a terminal-based AI coding agent):\n\n1. Build the binary:\n```bash\ngo build -o sweeper .\nexport PATH=\"$PWD:$PATH\"\n```\n\n2. Copy the skill into your project's agents directory:\n```bash\nmkdir -p /path/to/your-project/.opencode/agents/\ncp skills/sweeper/SKILL.md /path/to/your-project/.opencode/agents/sweeper.md\n```\n\n3. Tell opencode: \"Run sweeper on this project\"\n\n### Pi\n\n[Pi](https://github.com/anthropics/pi) is a Claude-native IDE extension. Its sweeper integration reimplements the linting and telemetry loop in TypeScript using Pi's own tool system, so it does **not** need the Go binary.\n\n```bash\npi install sweeper\n```\n\nThis gives you `init_sweep`, `run_linter`, and `log_result` tools plus a dashboard widget. To start a sweep, tell Pi: \"Sweep this project for lint issues\"\n\n## Providers\n\nSweeper supports swappable AI providers. Well-scoped tasks like lint fixes can run on smaller, cheaper models.\n\n| Provider | Kind | Requires | Example |\n|----------|------|----------|---------|\n| `claude` (default) | CLI | `claude` CLI installed | `sweeper run` |\n| `codex` | CLI | `codex` CLI installed | `sweeper run --provider codex` |\n| `ollama` | API | Ollama running locally | `sweeper run --provider ollama --model qwen2.5-coder:7b` |\n\n**CLI providers** (claude, codex) have built-in file tools. Sweeper sends a prompt and the harness reads/writes files directly.\n\n**API providers** (ollama) are text-in, text-out. Sweeper includes file content in the prompt and applies the returned unified diff via `patch`.\n\n### Provider flags\n\n- `--provider \u003cname\u003e` — AI provider to use (default: `claude`)\n- `--model \u003cname\u003e` — Model override for the provider (e.g. `qwen2.5-coder:7b` for ollama)\n- `--api-base \u003curl\u003e` — API base URL for API providers (default: `http://localhost:11434` for ollama)\n\nVM isolation (`--vm`) is only compatible with CLI providers.\n\n## Examples\n\nSweeper works with any command that produces output, not just linters.\n\n```bash\n# Fix all golangci-lint issues (default)\nsweeper run\n\n# Fix ESLint issues across a JS/TS project\nsweeper run -- npx eslint --quiet .\n\n# Fix Clippy warnings in a Rust project\nsweeper run -- cargo clippy 2\u003e\u00261\n\n# Run a custom script that checks for AI slop patterns\nsweeper run -- ./scripts/check-slop.sh\n\n# Higher concurrency with VM isolation\nsweeper run --vm -c 5 --max-rounds 3 -- npm run lint\n\n# Use Codex CLI\nsweeper run --provider codex -- npm run lint\n\n# Use a local Ollama model\nsweeper run --provider ollama --model qwen2.5-coder:7b\n\n# Ollama with a custom API base\nsweeper run --provider ollama --model codellama --api-base http://gpu-server:11434\n```\n\n### Refactors\n\nSweeper fixes anything you can express as `file:line: message` output. Pipe the output of any detection command and agents will work on each file in parallel.\n\n```bash\n# Refactor files over 1000 lines\nfind . -name '*.go' -exec wc -l {} + \\\n  | awk '$1 \u003e 1000 {print $2\":1: file exceeds 1000 lines\"}' \\\n  | sweeper run\n\n# Resolve TODO comments across the codebase\ngrep -rn 'TODO\\|FIXME\\|HACK' --include='*.go' . | sweeper run\n\n# Find and fix functions exceeding cyclomatic complexity\ngocyclo -over 15 ./... | sweeper run\n\n# Split oversized React components\nfind src -name '*.tsx' -exec wc -l {} + \\\n  | awk '$1 \u003e 500 {print $2\":1: component exceeds 500 lines\"}' \\\n  | sweeper run\n\n# Fix failing tests by feeding test output to agents\ngo test ./... 2\u003e\u00261 | sweeper run\n```\n\nWhen using sweeper as a skill, you can pass arbitrary goals to the agent:\n\n- \"Run sweeper on this project\" — default lint-fix loop\n- \"Run sweeper to clean up AI slop — remove verbose comments, unnecessary null checks, filler docstrings, and over-abstractions\"\n- \"Run sweeper to fix all failing tests\"\n- \"Run sweeper to migrate deprecated API calls\"\n- \"Run sweeper to refactor files over 1000 lines\"\n- \"Run sweeper to resolve all TODOs\"\n\nThe agent will pick the right command and flags based on your goal.\n\n## How It Works\n\nThis describes the Go CLI and skill-based integrations (Claude Code, opencode). Pi manages its own lint-fix loop through built-in tools and does not use the CLI.\n\n1. **Lint**: run any linter, parse structured output (or fall back to raw mode)\n2. **Plan**: group issues by file, pick strategy per file based on history\n3. **Dispatch**: fan out to N concurrent sub-agents (default 2, max 5, rate-limited)\n4. **Stream**: results arrive in real time as each file completes\n5. **Escalate**: stalled files get retry prompts, then exploration prompts that consider surrounding code\n6. **Record**: outcomes logged to `.sweeper/telemetry/` and tapes captures token usage\n7. **Learn**: `sweeper observe` shows success rates by strategy, round, and linter\n\n## Tapes: The Learning Center\n\nEvery sub-agent session is recorded in [tapes](https://github.com/papercomputeco/tapes). This gives you:\n\n- **Token spend per linter**: know what each fix costs\n- **Strategy effectiveness**: standard vs retry vs exploration success rates\n- **Round effectiveness**: which retry rounds contribute most fixes\n- **Trend tracking**: are you fixing more issues with fewer tokens over time?\n\nRun `sweeper observe` after each sweep to see insights and tune your next run.\n\n## Confluent Cloud Telemetry\n\nSweeper can stream telemetry events to Confluent Cloud alongside local JSONL files. Enable it in `.sweeper/config.toml`:\n\n```toml\n[telemetry]\nbackend = \"confluent\"\ndir = \".sweeper/telemetry\"\n\n[telemetry.confluent]\nbrokers = [\"pkc-xxxxx.region.provider.confluent.cloud:9092\"]\ntopic = \"sweeper.telemetry\"\nclient_id = \"sweeper\"\napi_key_env = \"SWEEPER_CONFLUENT_API_KEY\"\napi_secret_env = \"SWEEPER_CONFLUENT_API_SECRET\"\n```\n\nSet `SWEEPER_CONFLUENT_API_KEY` and `SWEEPER_CONFLUENT_API_SECRET` in your environment. The config references env var names, not raw credentials.\n\nFor cluster and topic setup, install the [confluent-cloud-setup](https://github.com/papercomputeco/skills/tree/main/skills/confluent-cloud-setup) skill:\n\n```bash\nnpx skills add papercomputeco/skills\n```\n\n## VM Isolation\n\nSub-agents can run inside ephemeral [stereOS](https://stereos.ai) virtual machines, managed by the `mb` (Masterblaster) CLI. This is what makes high concurrency safe.\n\nWithout VMs, sub-agents share the host process, filesystem, and API keys. At low concurrency (2-3) this works fine. At higher concurrency, you want each agent isolated so a runaway process or leaked credential stays contained.\n\nWith `--vm`, each sub-agent gets:\n\n- **Own CPU and memory**: 4 cores, 8GB RAM per VM (configurable). No resource contention between agents.\n- **Secret boundary**: `ANTHROPIC_API_KEY` is injected into the VM and never touches the host filesystem.\n- **Nesting safety**: `claude --print` fails inside active Claude Code sessions due to nesting detection. VMs sidestep this entirely.\n- **Clean teardown**: VMs are ephemeral. On exit (success, failure, or SIGINT), the VM is destroyed automatically.\n\n```bash\nsweeper run --vm -c 5 --max-rounds 3    # 5 isolated agents, 3 retry rounds\n```\n\n## Responsible Use\n\nSweeper dispatches automated Claude sub-agents. To stay within [Anthropic's usage policy](https://www.anthropic.com/legal/aup):\n\n- **Rate limiting**: agents are dispatched with a configurable delay between each (default 2s, `--rate-limit`)\n- **Concurrency cap**: hard maximum of 5 parallel agents regardless of flags\n- **Skip permissions**: sub-agents use `--dangerously-skip-permissions` for non-interactive automated operation\n- **Backoff**: exponential delay between retry rounds (5s, 10s, 20s, ... capped at 60s)\n- **Agent identification**: all prompts identify the sub-agent as an automated tool with human oversight\n\nA human must initiate every sweep and should review all changes before committing.\n\n## Session State\n\nSession state lives in `sweeper.md` for resume across restarts. The CLI generates this automatically, and the skill uses it to track progress and token spend.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpcc-labs%2Fsweeper","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpcc-labs%2Fsweeper","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpcc-labs%2Fsweeper/lists"}