{"id":47560077,"url":"https://github.com/aiming-lab/MetaClaw","last_synced_at":"2026-04-08T13:00:33.906Z","repository":{"id":343265851,"uuid":"1176898834","full_name":"aiming-lab/MetaClaw","owner":"aiming-lab","description":"🦞 Just talk to your agent — it learns and EVOLVES 🧬.","archived":false,"fork":false,"pushed_at":"2026-03-23T23:00:06.000Z","size":70374,"stargazers_count":2591,"open_issues_count":5,"forks_count":265,"subscribers_count":17,"default_branch":"main","last_synced_at":"2026-03-24T16:42:27.025Z","etag":null,"topics":["agent","ai-agent","continual-learning","fine-tuning","llm","lora","meta-learning","metaclaw","online-learning","openclaw","reinforcement-learning","skill-learning","tinker"],"latest_commit_sha":null,"homepage":"https://arxiv.org/abs/2603.17187","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/aiming-lab.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-03-09T13:47:13.000Z","updated_at":"2026-03-24T16:30:05.000Z","dependencies_parsed_at":"2026-03-14T09:01:29.055Z","dependency_job_id":"c17d3845-3a37-458d-8afb-b6e7db590c46","html_url":"https://github.com/aiming-lab/MetaClaw","commit_stats":null,"previous_names":["aiming-lab/metaclaw"],"tags_count":4,"template":false,"template_full_name":null,"purl":"pkg:github/aiming-lab/MetaClaw","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aiming-lab%2FMetaClaw","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aiming-lab%2FMetaClaw/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aiming-lab%2FMetaClaw/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aiming-lab%2FMetaClaw/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/aiming-lab","download_url":"https://codeload.github.com/aiming-lab/MetaClaw/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/aiming-lab%2FMetaClaw/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31556239,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-08T10:21:54.569Z","status":"ssl_error","status_checked_at":"2026-04-08T10:21:38.171Z","response_time":54,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while 
reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["agent","ai-agent","continual-learning","fine-tuning","llm","lora","meta-learning","metaclaw","online-learning","openclaw","reinforcement-learning","skill-learning","tinker"],"created_at":"2026-03-29T16:00:33.181Z","updated_at":"2026-04-08T13:00:33.892Z","avatar_url":"https://github.com/aiming-lab.png","language":"Python","readme":"\u003cdiv align=\"center\"\u003e\n\n\u003cimg src=\"assets/new_logo2.png\" alt=\"MetaClaw\" width=\"600\"\u003e\n\n\u003cbr/\u003e\n\n# Just talk to your agent — it learns and *EVOLVES*.\n\n\u003cp\u003eInspired by how brains learn. Meta-learn and evolve your 🦞 from every conversation in the wild. No GPU required.\n  \n\u003cbr/\u003e\n\n\n\u003cimg src=\"assets/metaclaw_mainfig_v2.png\" alt=\"MetaClaw Architecture\" width=\"800\"\u003e\n\n\u003cbr/\u003e\n\n\n\u003cp\u003e\n  \u003ca href=\"https://arxiv.org/abs/2603.17187\"\u003e\u003cimg src=\"https://img.shields.io/badge/📄_Technical_Report-purple?style=flat-square\" alt=\"Tech Report\" /\u003e\u003c/a\u003e\n  \u003ca href=\"https://github.com/aiming-lab/MetaClaw\"\u003e\u003cimg src=\"https://img.shields.io/badge/github-MetaClaw-181717?style=flat\u0026labelColor=555\u0026logo=github\u0026logoColor=white\" alt=\"GitHub\"\u003e\u003c/a\u003e\n  \u003ca href=\"LICENSE\"\u003e\u003cimg src=\"https://img.shields.io/badge/License-MIT-green?style=flat\u0026labelColor=555\" alt=\"License MIT\"\u003e\u003c/a\u003e\n  \u003cimg src=\"https://img.shields.io/badge/⚡_Fully_Async-yellow?style=flat\u0026labelColor=555\" alt=\"Fully Async\" /\u003e\n  \u003cimg src=\"https://img.shields.io/badge/☁️_No_GPU_Cluster-blue?style=flat\u0026labelColor=555\" alt=\"No GPU Cluster\" /\u003e\n  \u003cimg src=\"https://img.shields.io/badge/🛠️_Skill_Evolution-orange?style=flat\u0026labelColor=555\" alt=\"Skill Evolution\" /\u003e\n  \u003cimg src=\"https://img.shields.io/badge/🚀_One--Click_Deploy-green?style=flat\u0026labelColor=555\" alt=\"One-Click Deploy\" /\u003e\n\u003c/p\u003e\n\n[🇨🇳 中文](./assets/README_ZH.md) • [🇯🇵 日本語](./assets/README_JA.md) • [🇰🇷 한국어](./assets/README_KO.md) • [🇫🇷 Français](./assets/README_FR.md) • [🇩🇪 Deutsch](./assets/README_DE.md) • [🇪🇸 Español](./assets/README_ES.md) • [🇧🇷 Português](./assets/README_PT.md) • [🇷🇺 Русский](./assets/README_RU.md) • [🇮🇹 Italiano](./assets/README_IT.md) • [🇻🇳 Tiếng Việt](./assets/README_VI.md) • [🇦🇪 العربية](./assets/README_AR.md) • [🇮🇳 हिन्दी](./assets/README_HI.md)\n\n\u003cbr/\u003e\n\n[Overview](#-overview) • [Quick Start](#-quick-start) • [Multi-Claw Support](#-multi-claw-support) • [Configuration](#️-configuration) • [Skills Mode](#-skills-mode) • [RL Mode](#-rl-mode) • [Auto Mode](#-auto-mode-default) • [Memory](#-memory) • [Citation](#-citation)\n\n\u003c/div\u003e\n\n---\n\n\u003cdiv align=\"center\"\u003e\n\n### Two commands. 
<div align="center">

### Two commands. That's it.
</div>

```bash
metaclaw setup              # one-time config wizard
metaclaw start              # default: auto mode — skills + scheduled RL training
metaclaw start --mode rl    # RL without scheduler (trains immediately on full batch)
metaclaw start --mode skills_only  # skills only, no RL (no Tinker needed)
```

<div align="center">
<img src="assets/metaclaw.gif" alt="MetaClaw demo" width="700">
</div>

---

## 🔥 News

- **[03/25/2026]** **v0.4.0** — Contexture layer: MetaClaw now persists cross-session memory for users and projects. Relevant facts, preferences, and project history are automatically retrieved and injected into prompts. Includes an adaptive memory policy, background consolidation, and an optional memory sidecar service.
- **[03/24/2026]** **v0.3.3** — One-click OpenClaw plugin: MetaClaw now ships as a native OpenClaw extension — drop the folder into OpenClaw's extensions, run one command, and everything is set up automatically.
- **[03/18/2026]** Our technical report "[MetaClaw: Just Talk -- An Agent That Meta-Learns and Evolves in the Wild](https://arxiv.org/pdf/2603.17187)" is out! **🏆 Ranked No. 1** on [HuggingFace Daily Papers](https://huggingface.co/papers/2603.17187)! Check it out!
- **[03/16/2026]** **v0.3.2** — Multi-claw support: IronClaw, PicoClaw, ZeroClaw, CoPaw, NanoClaw, and NemoClaw are now supported alongside OpenClaw. NanoClaw connects via the new `/v1/messages` Anthropic-compatible endpoint; NemoClaw via OpenShell inference routing. Added OpenRouter as a supported LLM platform.
- **[03/13/2026]** **v0.3.1** — MinT backend support: RL training now works with both Tinker and MinT. Configurable via `rl.backend` (auto/tinker/mint).
- **[03/13/2026]** **v0.3** — Continual meta-learning support: slow RL updates now run only during sleep hours, idle time, or Google Calendar meetings. Added support/query set separation to prevent stale reward signals from polluting model updates.
- **[03/11/2026]** **v0.2** — One-click deployment via the `metaclaw` CLI. Skills enabled by default; RL is now opt-in.
- **[03/09/2026]** We release **MetaClaw** — just talk to your agent and let it evolve automatically. **NO** GPU deployment required; just plug into the **API**.

---

## 🎥 Demo

https://github.com/user-attachments/assets/d86a41a8-4181-4e3a-af0e-dc453a6b8594

---

## 📖 Overview

**MetaClaw is an agent that meta-learns and evolves in the wild.**
Just talk to your agent as you normally would — MetaClaw turns every live conversation into a learning signal, enabling the agent to continuously improve through real-world deployment rather than offline training alone.

Under the hood, it places your model behind a proxy that intercepts interactions from your personal agent (OpenClaw, CoPaw, IronClaw, PicoClaw, ZeroClaw, NanoClaw, NemoClaw, or any OpenAI-compatible client), injects relevant skills at each turn, and meta-learns from accumulated experience. For Anthropic-native agents like NanoClaw, MetaClaw also exposes a `/v1/messages` Anthropic-compatible endpoint so the full pipeline works without any agent-side changes. Skills are summarized automatically after each session; with RL enabled, a meta-learning scheduler defers weight updates to idle windows so the agent is never interrupted during active use.

No GPU cluster required. MetaClaw works with any OpenAI-compatible LLM API out of the box, and uses a Tinker-compatible backend for cloud-based LoRA training. [Tinker](https://www.thinkingmachines.ai/tinker/) is the default reference path; MinT and Weaver can be enabled through separate compatibility packages when needed.
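Because the proxy speaks the standard OpenAI chat API, any existing client can talk through it unchanged. A minimal sketch, assuming the default proxy port (`30000`) and placeholder key shown later in the Manual wiring section:

```python
# Minimal sketch: calling the local MetaClaw proxy with the stock OpenAI
# Python client. The base URL, API key, and model ID are the defaults and
# placeholders from the "Manual wiring" section below, not hard requirements.
from openai import OpenAI

client = OpenAI(
    base_url="http://127.0.0.1:30000/v1",  # the MetaClaw proxy, not the upstream LLM
    api_key="metaclaw",                    # or whatever proxy.api_key is set to
)

resp = client.chat.completions.create(
    model="moonshotai/Kimi-K2.5",          # your model ID from `metaclaw setup`
    messages=[{"role": "user", "content": "Summarize yesterday's session."}],
)
print(resp.choices[0].message.content)
```

Skill injection (and, since v0.4.0, memory injection) happens transparently inside the proxy before the request reaches the upstream model.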
## 🤖 Key Features

### **One-click deployment**
Configure once with `metaclaw setup`, then `metaclaw start` brings up the proxy, injects skills, and wires your chosen personal agent (OpenClaw, CoPaw, IronClaw, or any other supported claw) automatically. No manual shell scripts needed.

### **Three operating modes**

| Mode | Default | What it does |
|------|---------|--------------|
| `skills_only` | | Proxies your LLM API; skills are injected at each turn and auto-summarized after each session. No GPU/Tinker required. |
| `rl` | | Skills + RL training (GRPO). Trains immediately when a batch is full. Optional OPD for teacher distillation. |
| `auto` | ✅ | Skills + RL + smart scheduler. RL weight updates only run during sleep/idle/meeting windows. |

### **Long-term memory**
MetaClaw can persist facts, preferences, and project history across sessions and inject relevant context at each turn — so your agent remembers what you've told it, even weeks later.

### **Asynchronous by design**
Serving, reward modeling, and training are fully decoupled. The agent continues responding while scoring and optimization run in parallel.
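One way to picture this decoupling (an illustrative sketch only; the queue names and structure here are hypothetical, not MetaClaw's internals): serving returns to the user immediately while each sample flows through independent scoring and training stages.

```python
# Illustrative sketch of the serve/score/train decoupling described above.
# All names are hypothetical; MetaClaw's actual pipeline differs.
import asyncio

score_queue: asyncio.Queue = asyncio.Queue()
train_buffer: list = []

async def serve(turn: str) -> str:
    reply = f"(model reply to: {turn})"    # stand-in for the proxied LLM call
    await score_queue.put((turn, reply))   # hand off without blocking the user
    return reply                           # the user gets the reply immediately

async def scorer() -> None:
    while True:
        turn, reply = await score_queue.get()
        reward = float(len(reply) % 2)     # stand-in for the PRM judge score
        train_buffer.append((turn, reply, reward))  # picked up by the trainer later

async def main() -> None:
    asyncio.create_task(scorer())          # scoring runs in the background
    print(await serve("hello"))            # serving never waits on scoring
    await asyncio.sleep(0.1)               # give the scorer a moment to drain

asyncio.run(main())
```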
---

## 🚀 Quick Start

### 1. Install

**OpenClaw (one-click):** use the [v0.4.0](https://github.com/aiming-lab/MetaClaw/releases/tag/v0.4.0) release — run the snippet below, then `metaclaw setup` and `metaclaw start`. More detail (Windows, mirrors, config, troubleshooting): [`extensions/metaclaw-openclaw/README.md`](./extensions/metaclaw-openclaw/README.md).

```bash
curl -LO https://github.com/aiming-lab/MetaClaw/releases/download/v0.4.0/metaclaw-plugin.zip
unzip metaclaw-plugin.zip -d ~/.openclaw/extensions
openclaw plugins enable metaclaw-openclaw && openclaw gateway restart
```

**pip** (PyPI or this repo):

```bash
pip install -e .                        # skills_only mode (lightweight)
pip install -e ".[rl]"                  # + RL training support (torch, transformers, tinker)
pip install -e ".[evolve]"              # + skill evolution via OpenAI-compatible LLM
pip install -e ".[scheduler]"           # + Google Calendar integration for scheduler
pip install -e ".[rl,evolve,scheduler]" # recommended for full RL + scheduler setup
```

(Optional) WeChat integration uses the official [`@tencent-weixin/openclaw-weixin`](https://github.com/nicepkg/openclaw-weixin) plugin. MetaClaw auto-installs it when WeChat is enabled:

```bash
metaclaw config wechat.enabled true
metaclaw start
```

The plugin is installed automatically on `metaclaw start`. You can also install it manually:

```bash
npx -y @tencent-weixin/openclaw-weixin-cli@latest install
```

To switch WeChat accounts (re-login with a new QR code):

```bash
metaclaw start --wechat-relogin
```

If you want to run `rl.backend=mint`, install the MinT compatibility package separately in the same environment, for example [`mindlab-toolkit`](https://github.com/MindLab-Research/mindlab-toolkit). Similarly, for `rl.backend=weaver`, install [`nex-weaver`](https://github.com/nex-agi/weaver) separately. MetaClaw keeps these dependencies out of the default package so RL users can choose Tinker, MinT, or Weaver explicitly.

### 2. Configure

```bash
metaclaw setup
```

The interactive wizard will ask you to:
1. **Choose your personal agent** — `openclaw`, `copaw`, `ironclaw`, `picoclaw`, `zeroclaw`, `nanoclaw`, `nemoclaw`, or `none` (MetaClaw will auto-configure it on start)
2. **Choose your auth method** — `api_key` (direct API) or `oauth_token` (CLI subprocess)
3. **Choose your LLM provider**:
   - **api_key**: Kimi, Qwen, OpenAI, Volcano Engine, or custom → enter API base + API key
   - **oauth_token**: Anthropic (Claude Code), OpenAI Codex, or Gemini CLI → paste OAuth token
4. **Enter your model ID** and optionally enable RL training

MetaClaw's RL path can switch explicitly between `tinker`, `mint`, and `weaver`. `auto` is the recommended default and will infer the backend from credentials, base URLs, or environment variables when the corresponding package is installed; a sketch of how such inference might work follows the per-backend examples below.

**Tinker**:

```bash
metaclaw config rl.backend tinker
metaclaw config rl.api_key sk-...
metaclaw config rl.model moonshotai/Kimi-K2.5
```

**MinT**:

```bash
metaclaw config rl.backend mint
metaclaw config rl.api_key sk-mint-...
metaclaw config rl.base_url https://mint.macaron.xin/
metaclaw config rl.model Qwen/Qwen3-4B-Instruct-2507
```

**Weaver**:

```bash
metaclaw config rl.backend weaver
metaclaw config rl.api_key sk-...
metaclaw config rl.base_url https://weaver-console.nex-agi.cn
metaclaw config rl.model Qwen/Qwen3-8B
```

Legacy aliases `rl.tinker_api_key` and `rl.tinker_base_url` are still accepted for backward compatibility.
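A rough sketch of how `rl.backend: auto` resolution could work, based only on the behavior described above (credentials, base URLs, environment variables, installed packages). The logic, module names, and environment variable below are assumptions, not MetaClaw's actual code:

```python
# Hypothetical sketch of `rl.backend: auto` resolution. Module names
# ("mindlab_toolkit", "nex_weaver") and the "TINKER_API_KEY" env var are
# assumed for illustration only.
import importlib.util
import os

def resolve_backend(api_key: str = "", base_url: str = "") -> str:
    def installed(pkg: str) -> bool:
        return importlib.util.find_spec(pkg) is not None

    if ("mint" in base_url or api_key.startswith("sk-mint-")) and installed("mindlab_toolkit"):
        return "mint"
    if "weaver" in base_url and installed("nex_weaver"):
        return "weaver"
    if os.environ.get("TINKER_API_KEY") or api_key:
        return "tinker"  # Tinker is the default reference path
    raise RuntimeError("No RL backend credentials found; set rl.backend explicitly.")

print(resolve_backend(api_key="sk-mint-123", base_url="https://mint.macaron.xin/"))
```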
### 3. Start

```bash
metaclaw start
```

That's it. MetaClaw starts the proxy, automatically configures your chosen personal agent to use it, and restarts the gateway. Open your agent and start chatting — skills are injected at every turn, and the session is automatically summarized into new skills when you're done.

---

## 🦞 Multi-Claw Support

MetaClaw works as a transparent proxy in front of any personal agent that supports an OpenAI-compatible LLM backend. The `claw_type` setting tells MetaClaw which agent to auto-configure when it starts.

| `claw_type` | Agent | What MetaClaw does on `start` |
|---|---|---|
| `openclaw` | [OpenClaw](https://openclaw.ai) | Runs `openclaw config set models.providers.metaclaw …` + `gateway restart`. Uses the `anthropic-messages` API format so memory plugins (Hindsight, mem0, memory-lancedb) receive `event.rawMessage` correctly. |
| `copaw` | [CoPaw](https://github.com/agentscope-ai/CoPaw) | Patches `~/.copaw/config.json` → `models.default` → `openai_compatible` pointing at the proxy port. CoPaw's ConfigWatcher hot-reloads automatically. |
| `ironclaw` | [IronClaw](https://github.com/nearai/ironclaw) | Patches `~/.ironclaw/.env` → `LLM_BACKEND=openai_compatible` + `LLM_BASE_URL/MODEL/API_KEY`. Runs `ironclaw service restart`. |
| `picoclaw` | [PicoClaw](https://github.com/sipeed/picoclaw) | Injects a `metaclaw` entry into `~/.picoclaw/config.json` `model_list` and sets it as the default model. Runs `picoclaw gateway restart`. |
| `zeroclaw` | [ZeroClaw](https://github.com/zeroclaw-labs/zeroclaw) | Patches `~/.zeroclaw/config.toml` → `provider = "openai-compatible"` + `base_url/model/api_key`. Runs `zeroclaw service restart`. |
| `nanoclaw` | [NanoClaw](https://github.com/qwibitai/nanoclaw) | Patches nanoclaw's `.env` → `ANTHROPIC_BASE_URL` pointing at the proxy's `/v1/messages` Anthropic-compatible endpoint. Restarts via `launchctl` (macOS) or `systemctl --user` (Linux). |
| `nemoclaw` | [NemoClaw](https://github.com/NVIDIA/NemoClaw) | Registers a `metaclaw` provider in OpenShell via `openshell provider create` and sets it as the active inference route via `openshell inference set`. Persists config to `~/.nemoclaw/config.json`. |
| `hermes` | [Hermes Agent](https://github.com/NousResearch/hermes-agent) | Injects a `metaclaw` entry into `~/.hermes/config.yaml` `custom_providers` and sets `model.provider: custom:metaclaw`. Runs `hermes gateway restart`. |
| `none` | — | Skips auto-configuration. Point your agent at the proxy manually. |

### Setup

Pick your agent during `metaclaw setup` (the first question in the wizard):

```
Personal agent to configure (openclaw/copaw/ironclaw/picoclaw/zeroclaw/nanoclaw/nemoclaw/hermes/none) [openclaw]:
```

Or set it directly at any time:

```bash
metaclaw config claw_type copaw      # switch to CoPaw
metaclaw config claw_type ironclaw   # switch to IronClaw
metaclaw config claw_type picoclaw   # switch to PicoClaw
metaclaw config claw_type zeroclaw   # switch to ZeroClaw
metaclaw config claw_type nanoclaw   # switch to NanoClaw
metaclaw config claw_type nemoclaw   # switch to NemoClaw
metaclaw config claw_type hermes     # switch to Hermes Agent
metaclaw config claw_type none       # manual / custom agent
```

Then run `metaclaw start` as usual — the proxy comes up and the chosen agent is wired automatically.

### Manual wiring (claw_type=none)

Point any OpenAI-compatible client at the MetaClaw proxy:

```
base_url: http://127.0.0.1:30000/v1
api_key:  metaclaw          # or whatever proxy.api_key is set to
model:    <your model id>
```

For Anthropic-native clients (e.g. the Claude SDK or NanoClaw's credential proxy), use the Anthropic-compatible endpoint instead:

```
ANTHROPIC_BASE_URL: http://127.0.0.1:30000
ANTHROPIC_API_KEY:  metaclaw
```
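For the Anthropic path, the stock `anthropic` Python SDK can likewise be pointed at the proxy. A minimal sketch (the model ID is a placeholder; the base URL and key mirror the environment variables above):

```python
# Minimal sketch: the Anthropic Python SDK against MetaClaw's
# /v1/messages-compatible endpoint. The model ID is a placeholder.
from anthropic import Anthropic

client = Anthropic(
    base_url="http://127.0.0.1:30000",  # ANTHROPIC_BASE_URL above
    api_key="metaclaw",                 # ANTHROPIC_API_KEY above
)

msg = client.messages.create(
    model="claude-sonnet-4-6",          # placeholder model ID
    max_tokens=256,
    messages=[{"role": "user", "content": "What did we decide last week?"}],
)
print(msg.content[0].text)
```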
---

## ⚙️ Configuration

Configuration lives in `~/.metaclaw/config.yaml`, created by `metaclaw setup`.

**CLI commands:**

```
metaclaw setup                  # Interactive first-time configuration wizard
metaclaw start                  # Start MetaClaw (default: auto mode)
metaclaw start --mode rl        # Force RL mode (no scheduler) for this session
metaclaw start --mode skills_only  # Force skills-only mode for this session
metaclaw stop                   # Stop a running MetaClaw instance
metaclaw status                 # Check proxy health, running mode, and scheduler state
metaclaw config show            # View current configuration
metaclaw config KEY VALUE       # Set a config value
metaclaw config llm.oauth_token TOKEN        # Store OAuth token for current CLI provider
metaclaw auth paste-token --provider anthropic      # Store OAuth token (anthropic | openai-codex | gemini)
metaclaw auth status                                # Show all stored auth profiles
metaclaw uninstall              # Remove all MetaClaw data, OpenClaw extension, and pip package
```

When you start MetaClaw, the command waits until the local proxy becomes healthy before returning. Use `metaclaw status` to verify readiness and `metaclaw stop` to stop the background process.

<details>
<summary><b>Full config reference (click to expand)</b></summary>

```yaml
mode: auto                 # "auto" | "rl" | "skills_only"
claw_type: openclaw        # "openclaw" | "copaw" | "ironclaw" | "picoclaw" | "zeroclaw" | "nanoclaw" | "nemoclaw" | "hermes" | "none"

llm:
  auth_method: api_key      # "api_key" | "oauth_token"
  provider: kimi            # kimi | qwen | openai | minimax | novita | openrouter | volcengine | custom
  model_id: moonshotai/Kimi-K2.5
  api_base: https://api.moonshot.cn/v1
  api_key: sk-...
  # oauth_token example (token stored via `metaclaw auth paste-token`):
  # auth_method: oauth_token
  # provider: anthropic     # anthropic | openai-codex | gemini
  # model_id: claude-sonnet-4-6

proxy:
  port: 30000
  api_key: ""              # optional bearer token for the local MetaClaw proxy

skills:
  enabled: true
  dir: ~/.metaclaw/skills   # your skill library
  retrieval_mode: template  # template | embedding
  top_k: 6
  task_specific_top_k: 10   # cap task-specific skills (default 10)
  auto_evolve: true         # auto-summarize skills after each session

rl:
  enabled: false            # set to true to enable RL training
  backend: auto             # "auto" | "tinker" | "mint" | "weaver"
  model: moonshotai/Kimi-K2.5
  api_key: ""
  base_url: ""              # optional backend endpoint, e.g. https://mint.macaron.xin/ for MinT or https://weaver-console.nex-agi.cn for Weaver
  tinker_api_key: ""        # legacy alias for api_key
  tinker_base_url: ""       # legacy alias for base_url
  prm_url: https://api.openai.com/v1
  prm_model: gpt-5.2
  prm_api_key: ""
  lora_rank: 32
  batch_size: 4
  resume_from_ckpt: ""      # optional checkpoint path to resume training
  evolver_api_base: ""      # leave empty to reuse llm.api_base
  evolver_api_key: ""
  evolver_model: gpt-5.2

opd:
  enabled: false            # set to true to enable OPD (teacher distillation)
  teacher_url: ""           # teacher model base URL (OpenAI-compatible /v1/completions)
  teacher_model: ""         # teacher model name (e.g., Qwen/Qwen3-32B)
  teacher_api_key: ""       # teacher model API key
  kl_penalty_coef: 1.0      # KL penalty coefficient for OPD

max_context_tokens: 20000   # prompt token cap before truncation; 0 = no truncation (recommended
                            # for skills_only mode with large-context cloud models)
context_window: 0           # context window advertised to the agent (e.g. OpenClaw compaction
                            # threshold); 0 = auto (200,000 in skills_only, 32,768 in rl/auto)

scheduler:                  # v0.3: meta-learning scheduler (auto-enabled in auto mode)
  enabled: false            # auto mode enables this automatically; set manually for rl mode
  sleep_start: "23:00"
  sleep_end: "07:00"
  idle_threshold_minutes: 30
  min_window_minutes: 15
  calendar:
    enabled: false
    credentials_path: ""
    token_path: ""
```

</details>

---

## 💪 Skills Mode

**`metaclaw start --mode skills_only`**

The lightest mode. No GPU, no RL backend needed. MetaClaw places your LLM behind a proxy that injects relevant skills at every turn, then auto-summarizes new skills after each conversation.

For OpenAI-compatible custom providers, set `llm.api_base` to the full chat API base (usually ending in `/v1`, for example `https://your-gateway.example/v1`). In `skills_only` mode, MetaClaw reuses that same endpoint for prompt compression and related helper LLM calls unless you configure a separate evolver endpoint.

Skills are short Markdown instructions stored in `~/.metaclaw/skills/` as individual `SKILL.md` files. The library grows automatically with your usage.

To pre-load the built-in skill bank (40+ skills across coding, security, agentic tasks, etc.):

```bash
cp -r memory_data/skills/* ~/.metaclaw/skills/
```
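To make the injection step concrete, here is an illustrative sketch of template-style retrieval over a skill directory. All function names and the scoring rule are hypothetical simplifications; the real retrieval lives inside the proxy:

```python
# Illustrative sketch of template-mode skill retrieval and injection,
# using skills.dir and skills.top_k from the config reference above.
# The keyword-overlap scoring is a hypothetical simplification.
from pathlib import Path

SKILLS_DIR = Path.home() / ".metaclaw" / "skills"
TOP_K = 6  # skills.top_k

def retrieve_skills(query: str, top_k: int = TOP_K) -> list[str]:
    """Rank SKILL.md files by naive keyword overlap with the query."""
    scored = []
    for path in SKILLS_DIR.glob("**/SKILL.md"):
        text = path.read_text(encoding="utf-8")
        overlap = len(set(query.lower().split()) & set(text.lower().split()))
        scored.append((overlap, text))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:top_k]]

def inject(system_prompt: str, query: str) -> str:
    """Prepend the retrieved skills to this turn's system prompt."""
    skills = "\n\n".join(retrieve_skills(query))
    return f"{system_prompt}\n\n# Relevant skills\n{skills}" if skills else system_prompt
```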
---

## 🔬 RL Mode

**`metaclaw start --mode rl`**

Everything in Skills Mode, plus continuous RL fine-tuning from live conversations. Each conversation turn is tokenized and submitted as a training sample. A judge LLM (PRM) scores responses asynchronously, and a Tinker-compatible backend (Tinker cloud, MinT, or Weaver) runs LoRA fine-tuning with hot-swapped weights.

**Tinker**:

```bash
metaclaw config rl.backend tinker
metaclaw config rl.api_key sk-...
metaclaw config rl.model moonshotai/Kimi-K2.5
metaclaw config rl.prm_url https://api.openai.com/v1
metaclaw config rl.prm_api_key sk-...
metaclaw start --mode rl
```

**MinT**:

```bash
metaclaw config rl.backend mint
metaclaw config rl.api_key sk-mint-...
metaclaw config rl.base_url https://mint.macaron.xin/
metaclaw config rl.model Qwen/Qwen3-4B-Instruct-2507
metaclaw config rl.prm_url https://api.openai.com/v1
metaclaw config rl.prm_api_key sk-...
metaclaw start --mode rl
```

**Weaver**:

```bash
metaclaw config rl.backend weaver
metaclaw config rl.api_key sk-...
metaclaw config rl.base_url https://weaver-console.nex-agi.cn
metaclaw config rl.model Qwen/Qwen3-8B
metaclaw config rl.prm_url https://api.openai.com/v1
metaclaw config rl.prm_api_key sk-...
metaclaw start --mode rl
```

A dedicated evolver LLM also extracts new skills from failed episodes, feeding them back into the skill library.

**Programmatic rollout** (no OpenClaw TUI needed): set `openclaw_env_data_dir` to a directory of JSONL task files:

```json
{"task_id": "task_1", "instruction": "Register the webhook at https://example.com/hook"}
```

### On-Policy Distillation (OPD)

OPD is an optional add-on for RL Mode. It distills a larger teacher model into the student on-policy: the student generates responses as usual, and the teacher provides per-token log-probabilities on those same responses. A KL penalty steers the student toward the teacher's distribution.

```bash
metaclaw config opd.enabled true
metaclaw config opd.teacher_url http://localhost:8082/v1
metaclaw config opd.teacher_model Qwen/Qwen3-32B
metaclaw config opd.kl_penalty_coef 1.0
```

The teacher must be served behind an OpenAI-compatible `/v1/completions` endpoint (e.g., vLLM, SGLang). OPD can be combined with PRM scoring; both run asynchronously. See `examples/run_conversation_opd.py` and `scripts/run_openclaw_tinker_opd.sh`.
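As a worked illustration of the per-token KL penalty: on tokens the student actually sampled, the difference of log-probabilities is a standard single-sample estimate of KL(student || teacher). This is one common on-policy formulation, sketched with NumPy; it is not necessarily MetaClaw's exact training objective:

```python
# Simplified per-token KL penalty for on-policy distillation. On student-
# sampled tokens, (log p_student - log p_teacher) estimates KL(student||teacher),
# scaled by opd.kl_penalty_coef. A sketch, not MetaClaw's training code.
import numpy as np

kl_penalty_coef = 1.0  # opd.kl_penalty_coef

# Per-token log-probs of the *same* student-generated response:
student_logprobs = np.array([-0.7, -1.2, -0.3])  # from the student policy
teacher_logprobs = np.array([-0.5, -0.9, -0.8])  # from the teacher's /v1/completions

kl_per_token = kl_penalty_coef * (student_logprobs - teacher_logprobs)
print(kl_per_token)         # positive where the student is more confident than the teacher
print(kl_per_token.mean())  # aggregate penalty folded into the RL objective
```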
---

## 🧠 Auto Mode (Default)

**`metaclaw start`**

Everything in RL Mode, plus a meta-learning scheduler that defers weight updates to user-inactive windows so the agent is never interrupted during active use. This is the default mode.

The RL weight hot-swap step pauses the agent for several minutes. Instead of training immediately when a batch is full (as RL Mode does), auto mode waits for an appropriate window.

Three conditions trigger an update window (any one is sufficient; a sketch of the check follows the config below):

- **Sleep hours**: configurable start/end time (e.g., 23:00 to 07:00)
- **Keyboard inactivity**: triggers after N minutes of idle time
- **Google Calendar events**: detects meetings so updates can run while you're away

```bash
metaclaw config scheduler.sleep_start "23:00"
metaclaw config scheduler.sleep_end   "07:00"
metaclaw config scheduler.idle_threshold_minutes 30

# Optional: Google Calendar integration
pip install -e ".[scheduler]"
metaclaw config scheduler.calendar.enabled true
metaclaw config scheduler.calendar.credentials_path ~/.metaclaw/client_secrets.json
```
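An illustrative sketch of the three-way window check (helper names are hypothetical and the calendar signal is passed in as a flag; MetaClaw's scheduler differs):

```python
# Hypothetical sketch of the update-window check described above.
# Names and structure are illustrative, not MetaClaw's scheduler code.
from datetime import datetime, time

SLEEP_START, SLEEP_END = time(23, 0), time(7, 0)  # scheduler.sleep_start / sleep_end
IDLE_THRESHOLD_MIN = 30                           # scheduler.idle_threshold_minutes

def in_sleep_hours(now: time) -> bool:
    # The window wraps midnight when start > end (23:00 -> 07:00).
    if SLEEP_START <= SLEEP_END:
        return SLEEP_START <= now < SLEEP_END
    return now >= SLEEP_START or now < SLEEP_END

def update_window_open(idle_minutes: float, in_meeting: bool) -> bool:
    """Any one condition is sufficient to open an RL update window."""
    return (
        in_sleep_hours(datetime.now().time())
        or idle_minutes >= IDLE_THRESHOLD_MIN
        or in_meeting  # e.g., from the Google Calendar integration
    )

print(update_window_open(idle_minutes=45, in_meeting=False))  # True: idle long enough
```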
If the user returns mid-update, the partial batch is saved and resumed at the next window.

Each `ConversationSample` is tagged with a `skill_generation` version. When skill evolution bumps the generation, the RL buffer is flushed so that only post-evolution samples are used for gradient updates (MAML support/query set separation).

---

## 🧠 Memory

MetaClaw v0.4.0 adds a long-term memory layer that runs alongside skills. Where skills capture *how* to do things, memory captures *what* has happened — user preferences, project state, recurring context, and cross-session facts.

### How it works

At the end of each session, MetaClaw extracts structured memory units from the conversation and stores them locally. On the next turn, relevant memories are retrieved and injected into the prompt alongside skills — so the agent knows what you've worked on before, without you having to repeat yourself.

Memory runs entirely in the background. There is nothing new to configure for basic use; it activates automatically when `memory.enabled` is set to `true`.

```bash
metaclaw config memory.enabled true
```

### Memory types

| Type | What it captures |
|------|-----------------|
| `episodic` | Specific past events and actions |
| `semantic` | General facts about the user or project |
| `preference` | Stated or inferred user preferences |
| `project_state` | Current goals, open tasks, recent decisions |
| `working_summary` | Rolling summary of recent activity |

### Configuration

```yaml
memory:
  enabled: false
  top_k: 5                       # memories injected per turn
  max_tokens: 800                # token budget for memory block
  retrieval_mode: hybrid         # keyword | semantic | hybrid
  consolidation_interval: 10     # consolidate every N sessions
  store_path: ~/.metaclaw/memory # local storage path
```

### Memory sidecar (optional)

For deployments that require process isolation, MetaClaw ships with a standalone memory sidecar service (`openclaw-metaclaw-memory`). When configured, the main proxy delegates all memory reads and writes to the sidecar over a local HTTP API.

```bash
metaclaw config memory.sidecar_url http://127.0.0.1:30001
```

---

## 📊 Benchmark

MetaClaw-Bench evaluates how well AI agents learn and adapt from multi-day interaction histories. It ships two dataset variants (30-day full and 12-day small) with a CLI that covers the full pipeline — validation, inference, scoring, and reporting.

See [`benchmark/README.md`](benchmark/README.md) for setup and usage.

---

## 🗑️ Uninstall

```bash
metaclaw uninstall
```

This removes everything in one step: it stops the running instance, cleans MetaClaw references from `~/.openclaw/openclaw.json`, deletes `~/.openclaw/extensions/metaclaw-openclaw/`, deletes `~/.metaclaw/`, uninstalls the pip package, and restarts the OpenClaw gateway. You will be prompted to confirm before anything is deleted.

After uninstalling, remove the source repo manually if you cloned it:

```bash
rm -rf /path/to/MetaClaw
```

---

## 📚 Citation

```bibtex
@article{xia2026metaclaw,
  title={MetaClaw: Just Talk -- An Agent That Meta-Learns and Evolves in the Wild},
  author={Xia, Peng and Chen, Jianwen and Yang, Xinyu and Tu, Haoqin and Liu, Jiaqi and Xiong, Kaiwen and Han, Siwei and Qiu, Shi and Ji, Haonian and Zhou, Yuyin and Zheng, Zeyu and Xie, Cihang and Yao, Huaxiu},
  journal={arXiv preprint arXiv:2603.17187},
  year={2026}
}
```

---

## 🙏 Acknowledgements

MetaClaw builds on top of the following open-source projects:

- [OpenClaw](https://openclaw.ai) – the primary supported personal agent.
- [CoPaw](https://github.com/agentscope-ai/CoPaw) – multi-channel personal agent support.
- [IronClaw](https://github.com/nearai/ironclaw) – Rust-native personal agent support.
- [NanoClaw](https://github.com/qwibitai/nanoclaw) – container-isolated Anthropic-native personal agent.
- [NemoClaw](https://github.com/NVIDIA/NemoClaw) – NVIDIA OpenShell-sandboxed personal agent with NIM inference.
- [SkillRL](https://github.com/aiming-lab/SkillRL) – our skill-augmented RL framework.
- [Tinker](https://www.thinkingmachines.ai/tinker/) – used for online RL training.
- [MinT](https://github.com/MindLab-Research/mindlab-toolkit) – alternative backend for online RL training.
- [Weaver](https://github.com/nex-agi/weaver) – alternative backend for online RL training.
- [OpenClaw-RL](https://github.com/Gen-Verse/OpenClaw-RL) – inspiration for our RL design.
- [awesome-openclaw-skills](https://github.com/VoltAgent/awesome-openclaw-skills) – provides the foundation for our skill bank.

---

## 📄 License

This project is licensed under the [MIT License](LICENSE).