{"id":49869205,"url":"https://github.com/coredipper/operon-openhands-gates","last_synced_at":"2026-05-15T04:38:21.730Z","repository":{"id":352401220,"uuid":"1214967287","full_name":"coredipper/operon-openhands-gates","owner":"coredipper","description":"Structural reliability critics for the OpenHands Agent SDK — certified stagnation detection, backed by the Operon categorical framework.","archived":false,"fork":false,"pushed_at":"2026-04-19T12:45:23.000Z","size":74,"stargazers_count":0,"open_issues_count":1,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2026-04-19T13:07:33.777Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"https://banu.be/operon/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/coredipper.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-04-19T09:48:52.000Z","updated_at":"2026-04-19T09:49:18.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/coredipper/operon-openhands-gates","commit_stats":null,"previous_names":["coredipper/operon-openhands-gates"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/coredipper/operon-openhands-gates","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/coredipper%2Foperon-openhands-gates","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/coredipper%2Foperon-openhands-gates/tags","releases_url":"https
://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/coredipper%2Foperon-openhands-gates/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/coredipper%2Foperon-openhands-gates/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/coredipper","download_url":"https://codeload.github.com/coredipper/operon-openhands-gates/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/coredipper%2Foperon-openhands-gates/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33054121,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-13T13:14:54.681Z","status":"online","status_checked_at":"2026-05-15T02:00:06.351Z","response_time":103,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2026-05-15T04:38:21.061Z","updated_at":"2026-05-15T04:38:21.722Z","avatar_url":"https://github.com/coredipper.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# operon-openhands-gates\n\n\u003e **In-loop** structural reliability critics for the [OpenHands Agent SDK](https://github.com/OpenHands/software-agent-sdk) — drop-in, cert-emitting.\n\nOpenHands' own docs flag an architectural gap in iterative refinement:\n\n\u003e *\"the current implementation relies solely on threshold/iteration limits rather than monitoring improvement velocity or convergence rates — 
suggesting this is an architectural gap where monitoring logic could plug in.\"*\n\u003e — https://docs.openhands.dev/sdk/guides/iterative-refinement\n\nThis package ships the missing monitor as a `CriticBase` subclass. It replaces an LLM-judged success score with a **Bayesian stagnation signal** computed over the conversation's message history. When the agent goes in circles, the critic's score drops below the threshold, iterative refinement terminates, and a replayable `behavioral_stability_windowed` certificate is emitted.\n\n**At a glance:**\n\n- `OperonStagnationCritic` — `epiplexic_integral`-based detection (Paper 4 §4.3, 0.960 convergence accuracy with real embeddings) that plugs directly into `Agent(critic=...)`.\n- One certificate per detection transition, self-verifiable via `certificate.verify()`.\n- Zero-dep `NGramEmbedder` default — bring your own neural embedder for paraphrase-robust detection.\n\n## Install\n\n```bash\npip install operon-openhands-gates\n```\n\nRequires `operon-ai\u003e=0.34.4` and `openhands-sdk\u003e=1.15`.\n\n## Quickstart\n\n```python\nfrom openhands.sdk import Agent, Conversation, LLM\nfrom openhands.sdk.critic.base import IterativeRefinementConfig\nfrom operon_openhands_gates import OperonStagnationCritic\n\ncritic = OperonStagnationCritic(\n    threshold=0.2,\n    critical_duration=3,\n    iterative_refinement=IterativeRefinementConfig(\n        success_threshold=0.2,  # match the critic's threshold\n        max_iterations=5,\n    ),\n)\n\nagent = Agent(llm=LLM(model=\"anthropic/claude-sonnet-4-5\"), tools=[...], critic=critic)\nconversation = Conversation(agent=agent, workspace=workspace)\nconversation.send_message(\"Fix the failing test in ...\")\nconversation.run()  # iterative refinement terminates on sustained stagnation\n\nif critic.certificate is not None:\n    # Replayable evidence of what the gate saw.\n    verification = critic.certificate.verify()\n    assert verification.holds\n```\n\n### Why the non-default 
`success_threshold`\n\nOpenHands' default `success_threshold=0.6` is tuned for LLM probability-of-success scores. `OperonStagnationCritic` returns the `epiplexic_integral` directly — in [0, 1] where low = stagnant. Paper 4 §4.3 uses δ=0.2 as the stagnation threshold, so match it on the refinement config.\n\n## Sibling package\n\n- [`operon-langgraph-gates`](https://github.com/coredipper/operon-langgraph-gates) — same Paper 4 substrate, same `behavioral_stability_windowed` certificate, targeted at LangGraph's `StateGraph` with `.wrap()` / `.edge()` node APIs. Two packages, one core — this is the framework-portability claim from Paper 5 §3 in code.\n\n## Certificate theorem name and verification\n\nCertificates emitted by this package carry the theorem name `behavioral_stability_windowed` (not the core's shared `behavioral_stability`). The two differ in how they verify:\n\n- `behavioral_stability` (shared core): `mean(severities) \u003c threshold`. Loses the per-window structure that rolling-integral detection operates on.\n- `behavioral_stability_windowed` (shared core, since operon-ai 0.36.0): `max(per_window_severity_means) \u003c= stability_threshold`. Mirrors detection exactly.\n\nBoth verifiers are registered in `operon_ai.core.certificate._THEOREM_FN_PATHS`, so deserialized certificates resolve through `_resolve_verify_fn` without this package needing to be imported. Any consumer with `operon-ai\u003e=0.36.0` can round-trip a `behavioral_stability_windowed` certificate correctly.\n\n### Breaking change from pre-alpha prototypes\n\nEarlier pre-release builds emitted certificates with theorem name `behavioral_stability` (the shared core name), bound to a locally-attached `_verify_fn`. That shape was semantically wrong — the shared verifier is flat-mean-based, so any cert round-tripped through serialization would silently revert to the wrong replay logic. 
Consumers that key on `certificate.theorem == \"behavioral_stability\"` or `metadata[\"certificate_theorem\"] == \"behavioral_stability\"` must update to `\"behavioral_stability_windowed\"`. No migration path is provided; alpha.\n\n## Citations\n\nBacked by [Paper 4 §4.3](https://github.com/coredipper/operon/blob/main/article/paper4/main.pdf): convergence/false-stagnation accuracy **0.960** with real sentence embeddings (all-MiniLM-L6-v2, N = 300 trials). Full numbers and reproduction commands in the Operon repo at `eval/results/benchmarks_real_embeddings/multi_model_summary.json`. [Paper 5 §3](https://github.com/coredipper/operon/blob/main/article/paper5/main.pdf) establishes the preservation-under-compilation framework that the certificate follows.\n\n## Status\n\n**Alpha.** API may change before `0.1.0` stable. Feedback welcome via Issues.\n\n## License\n\nMIT — see [LICENSE](./LICENSE).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcoredipper%2Foperon-openhands-gates","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcoredipper%2Foperon-openhands-gates","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcoredipper%2Foperon-openhands-gates/lists"}