https://github.com/yangfei222666-9/self-improving-loop

Safety-first self-improvement loop for AI agents — auto-rollback on regression, pure stdlib, MIT
https://github.com/yangfei222666-9/self-improving-loop
agent-evolution ai-agents auto-rollback autonomous-agents feedback-loop llm-tooling python self-improving taijios
Last synced: 3 months ago
JSON representation
Safety-first self-improvement loop for AI agents — auto-rollback on regression, pure stdlib, MIT
Host: GitHub
URL: https://github.com/yangfei222666-9/self-improving-loop
Owner: yangfei222666-9
License: mit
Created: 2026-02-24T10:03:42.000Z (5 months ago)
Default Branch: main
Last Pushed: 2026-04-17T11:02:44.000Z (3 months ago)
Last Synced: 2026-04-17T13:09:30.652Z (3 months ago)
Topics: agent-evolution, ai-agents, auto-rollback, autonomous-agents, feedback-loop, llm-tooling, python, self-improving, taijios
Language: Python
Size: 25.4 KB
Stars: 0
Watchers: 0
Forks: 0
Open Issues: 8
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
Awesome Lists containing this project

README

          # self-improving-loop

> A hexagram-guided reliability loop for AI agents:

> **trace → six lines → hexagram policy → guarded patch → rollback on regression.**

[![GitHub Release](https://img.shields.io/github/v/release/yangfei222666-9/self-improving-loop?display_name=tag)](https://github.com/yangfei222666-9/self-improving-loop/releases/tag/v0.1.1)

[![CI](https://github.com/yangfei222666-9/self-improving-loop/actions/workflows/ci.yml/badge.svg)](https://github.com/yangfei222666-9/self-improving-loop/actions/workflows/ci.yml)

[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)

[![Python](https://img.shields.io/badge/python-3.9%2B-blue.svg)](https://www.python.org/)

[![LLM overhead](https://img.shields.io/badge/LLM%20overhead-%3C1%25-brightgreen)](#performance)

Latest verified release: [v0.1.1](https://github.com/yangfei222666-9/self-improving-loop/releases/tag/v0.1.1) · Launch copy: [English + 中文](LAUNCH_COPY_BILINGUAL.md)

Most "self-improving agent" projects stop at *"log the failures, let the next run read the log"*. That's a methodology, not a loop. **This package is the loop, implemented as a compact pure-stdlib Python runtime** — no framework lock-in, no LLM dependency, no cloud.

The differentiator is the optional Yijing strategy: runtime signals are mapped

into six engineering lines, recognized as a hexagram state, converted into a

bounded policy patch, then verified through the same rollback guard as any

other strategy. It is a state machine, not fortune telling.

Overhead is negligible for normal LLM/HTTP agent calls; for sub-10ms in-memory functions, measure before wrapping.

Wrap any function, get:

- 📊 **Automatic execution tracking** (success rate, latency, rolling window)

- 🗄 **Trace storage choice**: readable JSONL by default, SQLite/WAL for multi-worker deployments

- 🧠 **Adaptive thresholds** per agent profile (high-freq / mid-freq / low-freq / critical)

- ☯️ **Optional hexagram state strategy**: six runtime dimensions → policy patch

- 🛠 **Strategy hook** for proposing improvement configs when failure patterns are detected

- 🧩 **ConfigAdapter contract** for real config backup / patch / restore

- 🛡 **Rollback trigger** when the new config regresses (>10% success drop, >20% latency gain, or 5 consecutive failures)

- 📬 **Pluggable notifier** (stub by default — swap in Telegram / Slack / whatever)

Extracted from [**TaijiOS**](https://github.com/yangfei222666-9/taiji), where the same six-line state model is used for production-scale agent workloads.

![self-improving-loop terminal demo](assets/demo/self_improving_loop_demo.svg)

Demo artifacts: [terminal transcript](assets/demo/self_improving_loop_demo.txt) · [asciinema cast](assets/demo/self_improving_loop_demo.cast)

---

## Install

Latest verified GitHub release:

```bash

pip install https://github.com/yangfei222666-9/self-improving-loop/releases/download/v0.1.1/self_improving_loop-0.1.1-py3-none-any.whl

```

From source:

```bash

pip install git+https://github.com/yangfei222666-9/self-improving-loop.git@v0.1.1

```

Zero required dependencies. Everything is Python stdlib, including optional

SQLite trace storage via `sqlite3`.

---

## Not a...

- **...methodology doc.** Many "self-improving agent" repos are markdown templates that ask *you* to log learnings to `CLAUDE.md`. This is the runtime loop that does it for you.

- **...heavyweight framework.** Compact stdlib code. Drop it next to your existing code. No decorators forced on you. No background process.

- **...LLM-dependent.** The analysis is statistical, not LLM-based. If you want LLM-authored config tweaks, pass an `improvement_strategy` object and ask your favorite LLM inside its `analyze()` method.

---

## What is stable today

Stable:

- Execution tracking

- Adaptive failure thresholds

- JSONL trace storage with a cross-process lock

- Optional SQLite/WAL trace storage

- Strategy-triggered config patching

- ConfigAdapter-backed rollback when a patch regresses

- Optional Yijing hexagram strategy as a deterministic state router:

  runtime traces -> six engineering lines -> hexagram -> bounded policy patch

Experimental:

- Choosing the best config patch automatically. The loop calls your

  `improvement_strategy`; it does not pretend to know your agent better than

  your production tests.

- Full 64-hexagram policy coverage. The first Yijing strategy supports only

  eight core states and should be treated as a bounded policy router.

---

## 30-second example

```python

from self_improving_loop import SelfImprovingLoop

loop = SelfImprovingLoop()

def my_agent_work():

    # Your actual agent call / LLM chain / tool invocation

    return {"status": "ok", "data": ...}

result = loop.execute_with_improvement(

    agent_id="my-agent",

    task="handle user query",

    execute_fn=my_agent_work,

)

if result["improvement_triggered"]:

    print(f"Strategy applied {result['improvement_applied']} config tweaks")

if result["rollback_executed"]:

    print(f"Rolled back because: {result['rollback_executed']['reason']}")

```

That's it. The loop watches every execution and decides when to trigger tuning.

To mutate and restore real agent config, provide a strategy hook plus either a

`ConfigAdapter` or the legacy strategy `current_config/apply/rollback` methods.

---

## Run the four useful examples

From a repo checkout, start here:

```bash

python examples/01_basic_tracking.py

python examples/02_config_rollback.py

python examples/03_langgraph_adapter.py

python examples/04_yijing_strategy.py

```

They prove the four important contracts:

- `01_basic_tracking.py`: wrapper records traces and exposes stats.

- `02_config_rollback.py`: a bad patch is applied, regression is detected, and `ConfigAdapter.rollback_config()` restores the previous config.

- `03_langgraph_adapter.py`: a LangGraph-style node can be wrapped without adopting a new framework.

- `04_yijing_strategy.py`: traces become six runtime lines, a hexagram state, and a bounded policy patch.

For the verbose rollback event trail, run:

```bash

python examples/regression_rollback_demo.py

```

---

## Use it as a safety layer for your current agent

This package is not trying to replace LangGraph, CrewAI, AutoGen, OpenAI

Agents, or your own internal runner. It wraps the callable you already trust:

```python

result = loop.execute_with_improvement(

    agent_id="support-agent",

    task="answer ticket",

    execute_fn=lambda: existing_agent.run(ticket),

    context={"framework": "your-current-stack"},

)

```

`loop.track(...)` is also available as a shorter alias for the same API.

Dependency-free examples show the integration seam:

```bash

python examples/03_langgraph_adapter.py

python examples/wrap_existing_agent.py

```

The goal is narrow: traces, thresholds, guarded strategy application, and

rollback evidence around an agent you already have.

---

## Trace storage

By default, traces are written to a readable `traces.jsonl` file with a

cross-platform sidecar lock. For multi-worker deployments, switch to SQLite:

```python

from self_improving_loop import SelfImprovingLoop

loop = SelfImprovingLoop(storage="sqlite")

```

This writes `traces.sqlite3` with WAL mode enabled. The public API is unchanged:

`execute_with_improvement()` records traces, and the loop reads them back for

thresholds, metrics, and rollback checks.

---

## Optional Yijing policy strategy

The Yijing layer is implemented as a deterministic state machine, not as a

fortune-telling layer:

`runtime traces -> six engineering lines -> hexagram state -> policy patch`

The six lines are:

1. stability

2. efficiency

3. learning activity

4. routing accuracy

5. collaboration

6. governance

Use it as the strategy:

```python

from self_improving_loop import SelfImprovingLoop, YijingEvolutionStrategy

loop = SelfImprovingLoop(

    strategy=YijingEvolutionStrategy(),

    config_adapter=my_config_adapter,

)

```

`improvement_strategy=` remains supported for backward compatibility.

The engineering mapping is explicit:

| Line | Dimension | Yang means | Yin means |

|---|---|---|---|

| 1 | stability | dependencies are healthy | API/network/dependency failure |

| 2 | efficiency | high success, low latency | low success or high latency |

| 3 | learning activity | feedback / recovery signal exists | repeated failure without learning |

| 4 | routing accuracy | model/tool choice looks correct | wrong model/tool/schema drift |

| 5 | collaboration | tools / agents hand off cleanly | conflicts or context breaks |

| 6 | governance | cost and rollout are bounded | quota, cost, or policy drift |

The first version supports eight core policy states: Qian, Kun, Zhen, Kan, Bo,

Fu, Ji Ji, and Wei Ji. It returns a bounded config patch and relies on the same

canary/rollback path as any other strategy.

---

## Why this exists

Most agents have this failure mode:

1. You ship an agent.

2. It works for a week.

3. Something upstream changes (rate limits, schema drift, a new edge case).

4. Your agent starts failing.

5. You find out three days later from angry users.

6. You tweak a config, hope for the best, ship it.

7. The tweak makes another scenario worse.

8. You roll it back manually, losing the original learning.

`self-improving-loop` turns steps 3–8 into a tight feedback loop that runs inside your process, without needing observability infra, Kubernetes, or a dedicated ML team.

---

## Adaptive thresholds (no magic numbers)

Different agents have different "pulse rates". A critical alerting agent should reconsider after 1 failure; a batch classifier can tolerate 5 before triggering analysis. The library classifies agents by execution frequency and adjusts:

The automatic profile is based on `exec_count_24h`; override it with `set_manual_threshold()` when production semantics matter more than raw frequency.

| Agent profile | Failure trigger | Analysis window | Cooldown |

|---|---|---|---|

| High-frequency (>10/day) | 5 failures | 48h | 3h |

| Medium-frequency (3-10/day) | 3 failures | 24h | 6h |

| Low-frequency (<3/day) | 2 failures | 72h | 12h |

| Critical (user-marked) | 1 failure | 24h | 6h |

Or bypass the classifier and set manually:

```python

from self_improving_loop import AdaptiveThreshold

adaptive = AdaptiveThreshold()

adaptive.set_manual_threshold(

    "critical-agent",

    failure_threshold=1,

    analysis_window_hours=12,

    cooldown_hours=1,

    is_critical=True,

)

```

---

## Auto-rollback (the safety net)

When a config change ships, the loop keeps watching. It rolls back if **any** of these become true:

- Success rate drops >10%

- Average latency increases >20%

- ≥5 consecutive failures after the change

Real rollback requires a config hook. Prefer an explicit `ConfigAdapter`:

```python

from self_improving_loop import SelfImprovingLoop

class MyConfigAdapter:

    def get_config(self, agent_id):

        return load_agent_config(agent_id)

    def apply_config(self, agent_id, config_patch):

        save_agent_config(agent_id, {**load_agent_config(agent_id), **config_patch})

        return True

    def rollback_config(self, agent_id, backup_config):

        save_agent_config(agent_id, backup_config)

loop = SelfImprovingLoop(

    improvement_strategy=my_strategy,

    config_adapter=MyConfigAdapter(),

)

```

Without a config adapter or strategy rollback hook, the loop will record the

rollback decision but will not claim that your external agent config was

restored.

```python

# See recent rollbacks

rollback_history = loop.auto_rollback.get_rollback_history("my-agent")

for event in rollback_history:

    print(event["reason"], event["timestamp"])

```

---

## Pluggable notifier

The built-in `TelegramNotifier` is a stub — it logs to stdout. Override `_send_message()` to hook any channel:

```python

from self_improving_loop import TelegramNotifier

class MySlackNotifier(TelegramNotifier):

    def __init__(self, webhook_url, **kw):

        super().__init__(**kw)

        self.webhook_url = webhook_url

    def _send_message(self, message, priority="normal"):

        import requests

        requests.post(self.webhook_url, json={"text": f"[{priority}] {message}"})

loop = SelfImprovingLoop(notifier=MySlackNotifier(webhook_url="https://hooks..."))

```

---

## Performance

Measured locally with `benchmarks/overhead.py` (200 iterations per workload, Python 3.12, Windows):

| Workload profile | Absolute overhead | Relative overhead |

|---|---|---|

| ~100 ms agent call (typical LLM) | +0.27 ms | **+0.3%** |

| ~10 ms agent call (tool call) | +0.31 ms | **+3.0%** |

| sub-millisecond call | +0.08 ms | >>% (don't wrap these) |

The wrapper adds a stable **~300 μs of fixed cost per call** (trace append + threshold check). Whether that's negligible depends on your workload:

- LLM calls (>500 ms): overhead is ≤0.06% — invisible

- HTTP / DB calls (~30-100 ms): ≤1%

- Fast in-memory work (<10 ms): 3%+ — reconsider whether you need this for those

Rerun the benchmark on your own machine with `python benchmarks/overhead.py`.

Separate operation costs (triggered occasionally, not per-call):

| Operation | Cost |

|---|---|

| Failure analysis (only when threshold crossed) | ~100 ms |

| Applying improvement config | ~200 ms |

| Rollback execution | ~10 ms |

---

## Background

Extracted from [**TaijiOS**](https://github.com/yangfei222666-9/taiji) — a self-learning AI operating system with 5 *I Ching*–bound engines and a 346-heartbeat Ising physics experiment. The parent project has 14 modules; this one is the most generally reusable, so it lives as a standalone package.

TaijiOS started on **Chinese New Year 2026-02-17** and has been built through multi-AI collaboration since then.

---

## License

MIT. Ship it wherever.

## Contact / Feedback

This is a very early release. Every bug report, every "didn't work for me", every "I wish it did X" is read:

- GitHub Issue: [open one](https://github.com/yangfei222666-9/self-improving-loop/issues/new)

- Parent project: [TaijiOS](https://github.com/yangfei222666-9/taiji)

---

*"Safety first, then automation."*
ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/yangfei222666-9/self-improving-loop

Awesome Lists containing this project

README