{"id":50719232,"url":"https://github.com/pingchesu/hermes-curator-evolver","last_synced_at":"2026-06-09T22:01:51.334Z","repository":{"id":355369213,"uuid":"1227831105","full_name":"pingchesu/hermes-curator-evolver","owner":"pingchesu","description":"Evidence-driven skill evolution for Hermes Agent — reports, dry-run proposals, candidate search, and guarded apply","archived":false,"fork":false,"pushed_at":"2026-05-29T01:12:00.000Z","size":242,"stargazers_count":12,"open_issues_count":1,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-05-29T02:25:00.522Z","etag":null,"topics":["agent-tools","agents","ai-agents","ai-governance","ai-skills","curator","dry-run-proposals","evidence-driven","guarded-apply","hermes-agent","hermes-plugin","llmops","python","read-only","semantic-search","skill-evolution","skill-governance","sqlite"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/pingchesu.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-05-03T08:16:00.000Z","updated_at":"2026-05-29T01:12:05.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/pingchesu/hermes-curator-evolver","commit_stats":null,"previous_names":["pingchesu/hermes-curator-evolver"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/pingchesu/hermes-curator-evolver","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pingchesu%2Fhermes-curator-evolver","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pingchesu%2Fhermes-curator-evolver/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pingchesu%2Fhermes-curator-evolver/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pingchesu%2Fhermes-curator-evolver/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/pingchesu","download_url":"https://codeload.github.com/pingchesu/hermes-curator-evolver/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pingchesu%2Fhermes-curator-evolver/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34127345,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-09T02:00:06.510Z","response_time":63,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["agent-tools","agents","ai-agents","ai-governance","ai-skills","curator","dry-run-proposals","evidence-driven","guarded-apply","hermes-agent","hermes-plugin","llmops","python","read-only","semantic-search","skill-evolution","skill-governance","sqlite"],"created_at":"2026-06-09T22:01:48.333Z","updated_at":"2026-06-09T22:01:51.324Z","avatar_url":"https://github.com/pingchesu.png","language":"Python","funding_links":[],"categories":["Skills \u0026 Plugins"],"sub_categories":["Plugins"],"readme":"\u003cdiv align=\"center\"\u003e\n\n# 🧬 Hermes Curator Evolver\n\n\u003ch3\u003eMake Hermes skills improve from real usage — with evidence, review, and rollback.\u003c/h3\u003e\n\n\u003cp\u003e\n  \u003cb\u003eInspired by \u003ca href=\"https://github.com/AMAP-ML/SkillClaw\"\u003eSkillClaw\u003c/a\u003e\u003c/b\u003e, adapted for Hermes Agent as a local-first plugin:\u003cbr/\u003e\n  session evidence in, safer skill updates out.\n\u003c/p\u003e\n\n\u003cp\u003e\n  If you use Hermes skills heavily and do not want them to silently rot, this plugin turns real usage into reviewable, reversible skill maintenance.\n\u003c/p\u003e\n\n[![Hermes Agent](https://img.shields.io/badge/Hermes-Agent-ff6b6b?style=flat-square)](https://github.com/NousResearch/hermes-agent)\n[![Inspired by SkillClaw](https://img.shields.io/badge/Inspired%20by-SkillClaw-f97316?style=flat-square)](https://github.com/AMAP-ML/SkillClaw)\n[![AI Skills](https://img.shields.io/badge/AI-Skills-8A2BE2?style=flat-square)](https://github.com/pingchesu/hermes-curator-evolver)\n[![Agents](https://img.shields.io/badge/Agents-skill%20governance-2563eb?style=flat-square)](https://github.com/pingchesu/hermes-curator-evolver)\n[![Python](https://img.shields.io/badge/Python-3.11%2B-3776AB?style=flat-square\u0026logo=python\u0026logoColor=white)](https://www.python.org/)\n[![SQLite](https://img.shields.io/badge/SQLite-local%20evidence-003B57?style=flat-square\u0026logo=sqlite\u0026logoColor=white)](https://www.sqlite.org/)\n[![Safety](https://img.shields.io/badge/v0.10-bootstrap%20%2B%20safe%20autorun-22c55e?style=flat-square)](#safety-model)\n[![License](https://img.shields.io/badge/License-MIT-green?style=flat-square)](./LICENSE)\n\n| 📚 Session evidence | 🧠 Memory/skill candidates | 🧪 `--variants` dry-run | 🛡️ Guarded automation |\n|:-:|:-:|:-:|:-:|\n| Learn from real Hermes work | Mine sessions into a review queue | Compare bounded deterministic variants | Bounded notes, reference spillover, rollback |\n\n\u003c/div\u003e\n\n---\n\n## Latest update: variants + session mining\n\nTwo new reviewer-first paths are now visible up front:\n\n- **`auto-run --variants N`** generates up to four deterministic, model-free bounded update variants, scores them with local safety/quality signals, and selects one winner. The default remains `--variants 1`, so existing dry runs stay stable.\n- **`candidates-mine` + `candidates-list`** turns already-redacted session evidence into a local SQLite review queue. It classifies findings as `memory`, `skill_update`, `skill_new`, `replay_benchmark`, or `ignore` so a human can decide what should become durable memory, a skill patch, a new skill, or an evaluation case. It does not write Hermes memory, edit skills, or enable auto-apply.\n\n```bash\n# Compare bounded variants without writing files\nhermes-curator-evolver auto-run --skills-dir ~/.hermes/skills --variants 3 --format json\n\n# Mine session-derived evidence into a human-review queue\nhermes-curator-evolver candidates-mine --input-jsonl redacted-evidence.jsonl --queue-db curator-review.sqlite\nhermes-curator-evolver candidates-list --queue-db curator-review.sqlite --status pending --format json\n```\n\n## Contents\n\n- [Who this is for](#who-this-is-for)\n- [Quick start: install, backfill, autorun](#quick-start-install-backfill-autorun)\n- [At a glance](#at-a-glance)\n- [Trust boundary](#trust-boundary)\n- [Why this exists](#why-this-exists)\n- [Inspired by SkillClaw, made Hermes-native](#inspired-by-skillclaw-made-hermes-native)\n- [Launch / discussion kit](#launch--discussion-kit)\n- [Architecture](#architecture)\n- [Model usage plan](#model-usage-plan)\n- [Safety model](#safety-model)\n- [Examples and demo](#examples-and-demo)\n- [Feedback wanted](#feedback-wanted)\n- [CLI reference](#cli-reference)\n- [Contributing](#contributing)\n- [Uninstall](#uninstall)\n\n## Who this is for\n\nHermes Curator Evolver is for people who treat agent skills as operational memory: debugging playbooks, deployment habits, project conventions, and lessons learned from real work. It helps answer a practical question: **how can those skills improve from evidence without letting automation silently rewrite the library?**\n\nUse it when you want:\n\n- local evidence reports before any skill update,\n- dry-run proposals that can be reviewed like maintenance notes,\n- explicit write approval, exact target-hash checks, backups, and rollback,\n- safe unattended maintenance limited to bounded managed blocks,\n- optional semantic search/rerank only when you choose to enable it.\n\nIt is **not** a general AutoML system, a skill marketplace, or an agent that freely rewrites every prompt it can see. The default path is local, model-free, reversible, and intentionally boring.\n\n## Quick start: install, backfill, autorun\n\nCopy, paste, done. `bootstrap` handles the noisy parts: backfill old sessions + enable daily safe autorun.\n\n```bash\nhermes plugins install pingchesu/hermes-curator-evolver --enable\nuv pip install --python ~/.hermes/hermes-agent/venv/bin/python -e ~/.hermes/plugins/curator-evolver\nhermes-curator-evolver bootstrap\n```\n\nThat is the default, model-free path. It writes only low-risk bounded notes to **local agent-created** skills, spills bulky evidence into `references/` when needed, then validates the changed `SKILL.md` before the apply is considered successful. Official/bundled, hub-installed, plugin-provided, `skills.external_dirs`, pinned, unknown-source, and already-over-hard-cap skills are skipped.\n\nWant multilingual semantic/rerank ordering? Make the opt-in explicit:\n\n```bash\nuv pip install --python ~/.hermes/hermes-agent/venv/bin/python -e \"$HOME/.hermes/plugins/curator-evolver[semantic]\"\nhermes-curator-evolver bootstrap --semantic\n```\n\nQuick checks:\n\n```bash\nhermes-curator-evolver status\n# Linux / systemd:\nsystemctl --user list-timers 'hermes-curator-evolver*' --all --no-pager\n# macOS / launchd:\nlaunchctl list | grep hermes-curator-evolver || true\n```\n\n`bootstrap` now installs the daily autorun with the native user scheduler: systemd user timers on Linux and a LaunchAgent plist at `~/Library/LaunchAgents/com.pingchesu.hermes-curator-evolver.auto.plist` on macOS. Use `--schedule hourly|daily|weekly` for portable cadences; custom systemd `OnCalendar` values remain Linux-only.\n\nIf Hermes gateway was already running, restart it once so plugin hooks are loaded. For health checks, timer logs, model details, and uninstall steps, see [docs/after-install.md](docs/after-install.md).\n\n## At a glance\n\n| 1. Collect | 2. Rank | 3. Improve | 4. Protect |\n|:-:|:-:|:-:|:-:|\n| Tool calls + skill loads + old sessions | Evidence counts; optional Qwen + bge rerank | Daily bounded notes + reference spillover + post-apply validation | Only local agent-created skills are writable |\n\n```mermaid\nflowchart LR\n    S[Hermes sessions + tool calls] --\u003e DB[(SQLite evidence)]\n    DB --\u003e T[daily bootstrap scheduler]\n    T --\u003e A[bounded notes to local agent-created skills]\n    A -. bulky evidence .-\u003e REF[references/ spillover]\n    T -. skip .-\u003e P[official / hub / external / pinned skills]\n    A --\u003e B[backup + rollback manifest]\n```\n\n| User concern | Short answer |\n| --- | --- |\n| **Will it run by itself?** | Yes. `bootstrap` enables a daily user-level scheduler: systemd on Linux, launchd on macOS. |\n| **Will it rewrite my skills?** | No. Autorun only updates a managed bounded block and spills bulky evidence to `references/`. |\n| **Will it touch official/team skills?** | No. Provenance gate skips bundled, hub, plugin, and `external_dirs` skills. |\n| **Can I inspect first?** | Yes. `auto-run --format json` is dry-run by default. |\n\n## Trust boundary\n\nThe default experience is designed to be inspectable before it is writable:\n\n- **Read-only first:** `status`, `report`, `analyze`, `candidates`, `candidates-mine`, `candidates-list`, `propose`, `verify`, and default `auto-run` do not mutate skills.\n- **No blind model dependency:** the default bootstrap path is model-free; model-assisted proposal drafting and semantic/rerank ordering require explicit opt-in flags.\n- **Narrow unattended writes:** low-risk autorun writes only a managed bounded notes block, and only after both `--apply-low-risk` and `--approve-auto-apply`.\n- **Size guardrails:** `SKILL.md` updates target a 90k soft cap, spill bulky evidence into `references/`, and skip unattended writes when the target skill is already over the 100k hard cap.\n- **Source provenance gate:** official/bundled, hub-installed, plugin-provided, `skills.external_dirs`, pinned, and unknown-source skills are skipped from unattended writes.\n- **Rollback is concrete:** guarded apply records backups and manifests so you can restore exact prior content.\n\nFor a quick visual walkthrough, see [docs/demo-script.md](docs/demo-script.md). For synthetic output examples, see [examples/](examples/).\n\n### Read-only session + skill candidate review queue\n\n`candidates-mine` turns already-redacted evidence packets into a local SQLite review queue. It classifies each record as `memory`, `skill_update`, `skill_new`, `replay_benchmark`, or `ignore`, but it never writes to Hermes memory, never edits skills, and never enables auto-apply. Every row is pending human review by default.\n\n```bash\ncat \u003e /tmp/redacted-evidence.jsonl \u003c\u003c'EOF'\n{\"text\":\"durable memory 只存精簡宣告事實；流程/步驟/SOP 進 skill；不存 task progress / PR / SHA / 短期狀態\",\"evidence_ref\":\"session:policy\"}\n{\"text\":\"Workflow: 1. First run `ingest`. 2. Then run `mine`. 3. Finally review.\",\"evidence_ref\":\"session:workflow\"}\n{\"text\":\"{\\\"exit_code\\\":1,\\\"output\\\":\\\"remote: Repository not found\\\"}\",\"evidence_ref\":\"session:failure\",\"tool_name\":\"terminal\"}\nEOF\n\nhermes-curator-evolver candidates-mine \\\n  --input-jsonl /tmp/redacted-evidence.jsonl \\\n  --queue-db /tmp/curator-review.sqlite \\\n  --format markdown\n\nhermes-curator-evolver candidates-list \\\n  --queue-db /tmp/curator-review.sqlite \\\n  --status pending \\\n  --format json\n```\n\nThe miner is intentionally conservative: unknown cases become `ignore`, raw `read_file`/source dumps are suppressed as `source_dump`, near-cap `SKILL.md` evidence is marked `direct_append_allowed=false`, and the queue refuses any candidate with `auto_apply_allowed=true` even if a caller bypasses the normal `Candidate` constructor.\n\n## Why this exists\n\nHermes skills are operational memory. They capture how an agent should debug, deploy, research, and communicate in a real environment. But memory decays: stale commands, duplicated workflows, missing caveats, weak trigger descriptions, and hard-won lessons trapped in old session logs.\n\n**Hermes Curator Evolver** closes that loop: session evidence in, safer skill updates out — without patching Hermes core or silently rewriting your skill library.\n\n## Inspired by SkillClaw, made Hermes-native\n\n[SkillClaw](https://github.com/AMAP-ML/SkillClaw) showed the right idea: agents should evolve skills from session trajectories. Hermes Curator Evolver adapts that idea to a local-first Hermes plugin.\n\n| SkillClaw lesson | Hermes-native adaptation |\n| --- | --- |\n| Learn from sessions. | Runtime hooks + historical backfill feed local SQLite evidence. |\n| Retrieve similar skills before editing. | Lexical search by default; optional Qwen embeddings + bge reranking. |\n| Verify skill changes. | Dry-run proposals, verifier gates, exact SHA match, backups, rollback. |\n| Avoid uncontrolled mutation. | No Hermes core patches, pinned skills are skipped, official/hub/external/plugin skills are protected from unattended writes, autorun is bounded and can spill bulky evidence into `references/`. |\n\n## Launch / discussion kit\n\nIf you are evaluating or sharing the project, start with the smallest concrete claim:\n\n\u003e A local-first Hermes Agent plugin that turns session history into evidence-backed skill maintenance, with dry-run proposals and provenance-safe bounded autorun.\n\nUseful links for reviewers and community posts:\n\n- [docs/core-algorithm.md](docs/core-algorithm.md) — exact evidence, candidate-selection, semantic/rerank, and autorun algorithm.\n- [docs/architecture.md](docs/architecture.md) — one-page architecture and safety boundary.\n- [docs/after-install.md](docs/after-install.md) — what to expect after install, health checks, scheduler logs, and uninstall.\n- [docs/hyperagents-design-notes.md](docs/hyperagents-design-notes.md) — clean-room design notes explaining why HyperAgents is *not* a dependency and which concepts (multi-variant candidates, staged verifier) are adapted.\n- [docs/reddit-launch.md](docs/reddit-launch.md) — recommended cadence and concise community-post drafts.\n- [docs/reddit-launch-kit.md](docs/reddit-launch-kit.md) — expanded subreddit-specific titles, replies, and disclosure notes.\n\n## Architecture\n\nSee [docs/architecture.md](docs/architecture.md) for the one-page architecture diagram, model usage plan, and safety boundary. See [docs/after-install.md](docs/after-install.md) for the post-install autorun guide, health checks, uninstall path, and supported models.\n\n```mermaid\nflowchart LR\n    H[Hermes runtime] --\u003e P[curator-evolver plugin]\n    P --\u003e DB[(local SQLite evidence)]\n    DB --\u003e R[reports]\n    R --\u003e Proposal[dry-run proposals]\n    Proposal --\u003e Verify[verifier gate]\n    Verify --\u003e Human[human approval]\n    Human --\u003e Apply[guarded apply + rollback]\n    DB --\u003e Auto[auto-run low-risk bounded update]\n    Auto --\u003e Apply\n```\n\n## Model usage plan\n\n| Phase | Model | Purpose | Default |\n| --- | --- | --- | --- |\n| v0.1 | None | Evidence collection and report aggregation. | Local/read-only. |\n| v0.2 | Hermes configured chat model | Draft improvement proposals from evidence + skill text. | Optional `--draft-with-model`; dry-run artifact; no skill writes. |\n| v0.2 | Deterministic verifier + future verifier prompt | Check grounding, safety, and non-destructive behavior. | Blocks mutation by default. |\n| v0.3/v0.5 | `Qwen/Qwen3-Embedding-0.6B` | Candidate skill/evidence/user-correction search. | Optional `--execute-semantic`; no default download. |\n| v0.3/v0.5 | `BAAI/bge-reranker-v2-m3` | Re-rank candidates, especially for mixed Chinese/English agent workflows. | Optional `--rerank`; no default download. |\n| v0.4 | Verifier + local validation command | Guard final reviewed content before apply. | Requires approval, backup, verification, rollback. |\n| v0.6 | None by default | Automatic low-risk managed skill updates from observed evidence. | Optional `install-auto`; no Hermes core modification. |\n| v0.7 | `Qwen/Qwen3-Embedding-0.6B` + `BAAI/bge-reranker-v2-m3` | Optional model-assisted autorun candidate ordering. | Explicit `--semantic-candidates --rerank-candidates`; models only reorder evidence-eligible candidates. |\n| v0.9 | None | Provenance-safe unattended auto-apply. | Writes only local agent-created skills; skips bundled, hub, plugin, external, pinned, and unknown sources. |\n| v0.10 | None by default | One-command setup and clearer public README. | `bootstrap` backfills sessions and installs/enables autorun; `bootstrap --semantic` is explicit model opt-in. |\n| v0.11 | None | Size-bounded unattended auto-apply. | Keeps `SKILL.md` under the 100k tool cap by targeting a 90k soft cap, spilling bulky evidence into `references/`, and skipping already-over-hard-cap skills. |\n| v0.12 | None by default | Clean-room multi-variant candidate selection and staged verification inspired by HyperAgents concepts. | `--variants N` is deterministic/model-free; `--staged-verify` runs local structural checks before optional user-supplied verify commands. No HyperAgents dependency or model-generated code execution. |\n| v0.13 | None by default | Session-content mining for memory/skill/replay candidates. | `candidates-mine` classifies redacted evidence into a local SQLite review queue, then `candidates-list` lets reviewers decide whether an item should become memory, a skill update, a new skill, a replay benchmark, or be ignored. It is read-only and never writes memory or skills. |\n\n## Safety model\n\nThe guarded path requires:\n\n1. evidence report,\n2. dry-run proposal,\n3. verifier pass,\n4. human-reviewed content,\n5. exact target SHA256 match,\n6. explicit `--approve`,\n7. backup manifest,\n8. optional validation command,\n9. rollback path.\n\nHard defaults:\n\n- ✅ Evidence/report/proposal/candidate commands do not mutate skills.\n- ✅ Semantic mode does not download models by default; `--execute-semantic` / `--rerank` are explicit opt-ins.\n- ✅ Apply refuses to run without `--approve`.\n- ✅ Apply refuses if the target SHA256 changed.\n- ✅ Apply creates a backup before writing.\n- ✅ Failed validation auto-restores the backup.\n- ✅ `auto-run` writes only managed bounded blocks and still requires both `--apply-low-risk` and `--approve-auto-apply` before mutation.\n- ✅ Bulky autorun evidence spills into `references/` instead of growing `SKILL.md` past the tool cap; already-over-hard-cap skills are skipped.\n- ✅ Even with both write flags, unattended auto-apply writes only local agent-created skills. Official/bundled skills (`.bundled_manifest`), hub-installed skills (`.hub/lock.json`), plugin-provided skills, `skills.external_dirs`, pinned skills, and unknown sources are skipped.\n- ✅ `--semantic-candidates` / `--rerank-candidates` are explicit opt-ins and only reorder skills that already passed the evidence threshold.\n- ✅ Optional `--variants N` (default `1`) deterministically generates up to four bounded variants and picks one winner; only the winner is applied, and variant generation never executes model-generated code. See [docs/hyperagents-design-notes.md](docs/hyperagents-design-notes.md).\n- ✅ Optional staged verifier gate: cheap built-in structural check (managed-block + size invariants) runs before any expensive `--verify-command`, so a failing cheap stage skips the expensive stage entirely and still rolls back.\n- ✅ Optional restore-drill gate: `hermes-curator-evolver restore-drill --manifest \u003cmanifest\u003e` replays a rollback manifest into a clean temp directory and emits a pass/fail report. Pair with `auto-run --require-restore-drill` to refuse further mutating apply when the last apply has not been drill-verified yet (default: warn only, never silent).\n\n### Rollback manifest vs restore drill\n\n| Concept | What it proves |\n| --- | --- |\n| **Rollback manifest** | Records original SHA, backup path, support-file snapshots, provenance, evidence DB reference, and any scheduler hooks at the moment of apply. Lets `rollback` restore the prior file in place. |\n| **Restore drill** | Actually replays that manifest into a clean directory (default: temp dir; explicit `--target-dir` must be empty) and verifies: skill content sha256, support files (references/templates/scripts/assets), evidence DB reference is a real SQLite file, provenance metadata is recorded, scheduler/cron references exist on disk. Emits machine-readable JSON. Drill state is checked by `auto-run --require-restore-drill` so unattended apply can refuse to widen risk after a failed, missing, stale, or unreadable drill state. |\n\n## Examples and demo\n\nIf you want to inspect the behavior before installing, start here:\n\n- [60-second demo script](docs/demo-script.md) — terminal walkthrough for a GIF/asciinema recording.\n- [Example artifacts](examples/) — synthetic report, proposal, bounded managed-block diff, and rollback manifest.\n- [Promotion readiness plan](docs/promotion-readiness-plan.md) — what changed to make the repo easier to evaluate publicly.\n- [Architecture notes](docs/architecture.md) — one-page data flow and safety boundary.\n- [Post-install guide](docs/after-install.md) — health checks, scheduler logs, model details, and uninstall steps.\n\n## Feedback wanted\n\nThis project is intentionally conservative, and feedback is most useful around the trust model:\n\n1. Is the provenance gate strict enough for unattended skill maintenance?\n2. Should proposals become PR-like diffs instead of bounded managed notes?\n3. Which evidence signals should count: tool sequences, repeated fixes, user corrections, failed commands, or something else?\n4. What rollback UX would make automated skill maintenance trustworthy?\n5. What evaluation would show that a skill update actually improves future agent behavior?\n\nIf you are sharing or reviewing this project publicly, the community launch notes and draft posts live in [docs/reddit-launch.md](docs/reddit-launch.md).\n\n## CLI reference\n\n```bash\n# One-command bootstrap\nhermes-curator-evolver bootstrap\nhermes-curator-evolver bootstrap --semantic\nhermes-curator-evolver bootstrap --format json\n\n# Evidence\nhermes-curator-evolver status\nhermes-curator-evolver report --days 7 --format json\nhermes-curator-evolver backfill-sessions --sessions-dir ~/.hermes/sessions --days 30 --format json\nhermes-curator-evolver analyze --skill hermes-agent --days 30\n\n# Proposal + verifier\nhermes-curator-evolver propose --skill hermes-agent --skill-file ./SKILL.md --format json --output proposal.json\nhermes-curator-evolver propose --skill hermes-agent --skill-file ./SKILL.md --draft-with-model --model-timeout 180\nhermes-curator-evolver verify --proposal-file proposal.json --skill hermes-agent --format json\n\n# Candidate generation\nhermes-curator-evolver candidates --query \"gateway restart plugin cli\" --skills-dir ~/.hermes/skills\nhermes-curator-evolver candidates --query \"中文 mixed agent skill\" --skills-dir ~/.hermes/skills --semantic --format json       # plan only\nhermes-curator-evolver candidates --query \"中文 mixed agent skill\" --skills-dir ~/.hermes/skills --execute-semantic --format json\nhermes-curator-evolver candidates --query \"中文 mixed agent skill\" --skills-dir ~/.hermes/skills --execute-semantic --rerank --format json\n\n# Guarded apply\nsha256sum ./SKILL.md\nhermes-curator-evolver apply \\\n  --target ./SKILL.md \\\n  --content-file ./reviewed-SKILL.md \\\n  --expected-sha256 \u003ccurrent-sha256\u003e \\\n  --backup-dir .curator-evolver-backups \\\n  --verify-command \"python -m pytest -q\" \\\n  --approve\n\n# Rollback\nhermes-curator-evolver rollback --manifest .curator-evolver-backups/\u003ctimestamp\u003e/manifest.json\n\n# Restore drill (non-destructive: replay manifest into a clean dir and report pass/fail)\nhermes-curator-evolver restore-drill --manifest .curator-evolver-backups/\u003ctimestamp\u003e/manifest.json --format json\nhermes-curator-evolver restore-drill --manifest .curator-evolver-backups/\u003ctimestamp\u003e/manifest.json --target-dir /tmp/drill-XYZ --format markdown\n\n# Automatic evolution\nhermes-curator-evolver auto-run --skills-dir ~/.hermes/skills --format json                  # dry-run\nhermes-curator-evolver auto-run --skills-dir ~/.hermes/skills --semantic-candidates --rerank-candidates --format json\nhermes-curator-evolver auto-run --skills-dir ~/.hermes/skills --apply-low-risk --approve-auto-apply\nhermes-curator-evolver auto-run --skills-dir ~/.hermes/skills --semantic-candidates --rerank-candidates --apply-low-risk --approve-auto-apply\nhermes-curator-evolver auto-run --skills-dir ~/.hermes/skills --apply-low-risk --approve-auto-apply --block-auto-apply-skill 'github-*'\nhermes-curator-evolver auto-run --skills-dir ~/.hermes/skills --apply-low-risk --approve-auto-apply --allow-auto-apply-skill store-playbook  # only within local agent-created source boundary\nhermes-curator-evolver auto-run --skills-dir ~/.hermes/skills --variants 3 --format json                                                   # generate 3 deterministic variants, pick winner (dry-run)\nhermes-curator-evolver auto-run --skills-dir ~/.hermes/skills --apply-low-risk --approve-auto-apply --staged-verify                        # cheap built-in check before expensive verify\nhermes-curator-evolver auto-run --skills-dir ~/.hermes/skills --apply-low-risk --approve-auto-apply --require-restore-drill              # block apply unless last apply was drill-verified\nhermes-curator-evolver install-auto --schedule daily --enable\nhermes-curator-evolver install-auto --schedule daily --enable --semantic-candidates --rerank-candidates\nhermes-curator-evolver uninstall-auto\n```\n\n## Contributing\n\nContributions are welcome. See [CONTRIBUTING.md](CONTRIBUTING.md) for local setup, TDD expectations, PR checklist, smoke tests, and CI behavior.\n\n## Credits and inspiration\n\n**Inspired by [SkillClaw](https://github.com/AMAP-ML/SkillClaw)** — especially the idea that agent skills should evolve from real session evidence, not only from hand-written maintenance. Hermes Curator Evolver keeps that inspiration, but applies it through Hermes-native plugin hooks, local SQLite evidence, explicit model opt-ins, and conservative guarded writes.\n\n## Uninstall\n\nHermes already provides plugin removal:\n\n```bash\nhermes plugins disable curator-evolver\nhermes plugins uninstall curator-evolver   # alias: remove/rm\n```\n\nIf you enabled the optional auto-evolve scheduler, remove it first:\n\n```bash\nhermes-curator-evolver uninstall-auto\n```\n\nPlugin removal does not delete historical evidence by default. Remove it manually only if you want a clean slate:\n\n```bash\nrm -rf ~/.hermes/plugins/curator-evolver/data ~/.hermes/plugins/curator-evolver/backups\n```\n\n## Agent tool\n\nWhen enabled, Hermes can call:\n\n```text\ncurator_evidence_report\n```\n\nto retrieve a JSON evidence report.\n\n## Install from source\n\n```bash\ngit clone https://github.com/pingchesu/hermes-curator-evolver.git\ncd hermes-curator-evolver\npython -m pip install -e .\nhermes plugins enable curator-evolver\n```\n\nIf your Hermes environment does not provide `pip`, use:\n\n```bash\nuv pip install -e .\n```\n\n## Directory-plugin install\n\nYou can also symlink this repository into the Hermes plugin directory:\n\n```bash\nmkdir -p ~/.hermes/plugins\nln -s /path/to/hermes-curator-evolver ~/.hermes/plugins/curator-evolver\nhermes plugins enable curator-evolver\n```\n\n## Data location\n\nDefault:\n\n```text\n~/.hermes/plugins/curator-evolver/data/evidence.sqlite\n```\n\nOverride:\n\n```bash\nexport HERMES_CURATOR_EVOLVER_DB=/custom/path.sqlite\n```\n\n## Roadmap status\n\n- ✅ **v0.1** — evidence/report plugin.\n- ✅ **v0.2** — proposal generation + verifier gate, dry-run by default.\n- ✅ **v0.3** — candidate generation interface with optional embedding/reranker model plan.\n- ✅ **v0.4** — guarded apply with explicit approval, backup, verification, and rollback.\n- ✅ **v0.5** — explicit model execution paths: Hermes chat-model drafts, Qwen embedding candidate ranking, and bge reranking.\n- ✅ **v0.6** — plug-and-play `auto-run` + optional native user scheduler for low-risk managed skill improvements without Hermes core changes.\n- ✅ **v0.7** — explicit `--semantic-candidates` / `--rerank-candidates` for model-assisted autorun candidate ordering.\n- ✅ **v0.8** — `backfill-sessions` for existing Hermes history, `CONTRIBUTING.md`, and GitHub Actions CI.\n- ✅ **v0.9** — provenance-safe autorun: only local agent-created skills can be auto-applied; bundled, hub, plugin, external, pinned, and unknown sources are skipped.\n- ✅ **v0.10** — `bootstrap` one-command setup plus a shorter, visual quick start.\n- ✅ **v0.11** — size-bounded autorun: target a 90k `SKILL.md` soft cap, spill bulky evidence into `references/`, and skip already-over-hard-cap skills.\n- ✅ **v0.12** — deterministic `--variants N`, staged verification, and restore-drill gating for safer autorun choices.\n- ✅ **v0.13** — read-only session mining into a human-review queue for `memory`, `skill_update`, `skill_new`, `replay_benchmark`, or `ignore` decisions.\n\n---\n\n\u003cdiv align=\"center\"\u003e\n\nBuilt for people who want agent skills to improve — without letting automation silently rewrite the library.\n\n\u003c/div\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpingchesu%2Fhermes-curator-evolver","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpingchesu%2Fhermes-curator-evolver","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpingchesu%2Fhermes-curator-evolver/lists"}