{"id":50809964,"url":"https://github.com/microsoft/SkillOpt","last_synced_at":"2026-06-16T19:00:40.983Z","repository":{"id":359401196,"uuid":"1232669890","full_name":"microsoft/SkillOpt","owner":"microsoft","description":"SkillOpt is a text-space optimizer that trains reusable natural-language skills for frozen LLM agents through trajectory-driven edits, validation-gated updates, and deployable best_skill.md artifacts.","archived":false,"fork":false,"pushed_at":"2026-06-15T17:12:43.000Z","size":22643,"stargazers_count":7160,"open_issues_count":16,"forks_count":683,"subscribers_count":36,"default_branch":"main","last_synced_at":"2026-06-15T19:15:12.198Z","etag":null,"topics":["agent-skills","self-evolving-agents"],"latest_commit_sha":null,"homepage":"https://aka.ms/skillopt","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/microsoft.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2026-05-08T06:41:01.000Z","updated_at":"2026-06-15T19:15:11.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/microsoft/SkillOpt","commit_stats":null,"previous_names":["microsoft/skillopt"],"tags_count":1,"template":false,"template_full_name":null,"purl":"pkg:github/microsoft/SkillOpt","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/microsoft%2FSkillOpt","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/microsoft%2FSkillOpt/tags
","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/microsoft%2FSkillOpt/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/microsoft%2FSkillOpt/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/microsoft","download_url":"https://codeload.github.com/microsoft/SkillOpt/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/microsoft%2FSkillOpt/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":34419350,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-26T15:22:16.424Z","status":"online","status_checked_at":"2026-06-16T02:00:06.860Z","response_time":126,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["agent-skills","self-evolving-agents"],"created_at":"2026-06-13T04:00:28.841Z","updated_at":"2026-06-16T19:00:40.974Z","avatar_url":"https://github.com/microsoft.png","language":"Python","funding_links":[],"categories":["Microsoft Research","📈 Papers - Memory for Agent Evolution","\u003ca name=\"Python\"\u003e\u003c/a\u003ePython"],"sub_categories":["🧭 Reinforcement Learning \u0026 Continual Learning"],"readme":"# SkillOpt: Executive Strategy for Self-Evolving Agent Skills\n\n*Train agent skills like you train neural networks — with epochs, (mini-)batchsize, learning rates, and validation gates — but without touching model weights.*\n\n[![Project 
Page](https://img.shields.io/badge/Project%20Page-SkillOpt-8dbb3c)](https://microsoft.github.io/SkillOpt/) [![Paper](https://img.shields.io/badge/Paper-arXiv-b31b1b)](https://arxiv.org/abs/2605.23904) [![Project Video](https://img.shields.io/badge/Project%20Video-Watch%20Demo-ff0000)](https://youtu.be/JUBMDTCiM0M) [![PyPI](https://img.shields.io/badge/PyPI-skillopt-green.svg)](https://pypi.org/project/skillopt/) [![Python 3.10+](https://img.shields.io/badge/Python-3.10%2B-blue.svg)](https://www.python.org/) [![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)\n\n\u003e 📖 **For installation, data preparation, training/eval commands, the full configuration reference, and framework internals, see the [Documentation \u0026 Reproduction Guide](https://microsoft.github.io/SkillOpt/docs/guideline.html)** (rendered on GitHub Pages).\n\n---\n\n## News 🔥🔥🔥\n- **[2026-06-15]** 😴 **SkillOpt-Sleep (preview)** — a nightly offline self-evolution companion for local coding agents (Claude Code / Codex / Copilot): review past sessions, replay recurring tasks, and consolidate validated skills behind a held-out gate. See **[`docs/sleep/README.md`](docs/sleep/README.md)** for what it is, how to use it, and results.\n- **[2026-06-03]** 🎉 **[gbrain](https://github.com/garrytan/gbrain), [gbrain-evals](https://github.com/garrytan/gbrain-evals/blob/main/docs/benchmarks/2026-06-03-skillopt.md), and [darwin-skill](https://github.com/alchaincyf/darwin-skill) have all integrated SkillOpt.**\n- **[2026-06-02]** 🎉 **SkillOpt [v0.1.0](https://github.com/microsoft/SkillOpt/releases/tag/v0.1.0) is now available on [PyPI](https://pypi.org/project/skillopt/)!** Install with `pip install skillopt`. 
This initial release includes the full training loop (rollout → reflect → aggregate → select → update → evaluate), multi-backend support (OpenAI / Azure / Claude / Qwen / MiniMax), six built-in benchmarks, and a WebUI dashboard.\n\n---\n\n## Overview\n\nModern agent skills are usually hand-crafted, generated one-shot by a strong\nLLM, or evolved through loosely controlled self-revision — none of which\nbehaves like a deep-learning optimizer for the skill itself, and none of\nwhich reliably improves over its starting point under feedback.\n\n**SkillOpt treats the skill document as the trainable state of a frozen\nagent**, and trains it with the discipline that makes weight-space\noptimization reproducible. A separate optimizer model turns scored rollouts\ninto bounded add / delete / replace edits on a single skill document; a\ncandidate edit is accepted only when it strictly improves a held-out\nvalidation score. A textual learning-rate budget, a rejected-edit buffer,\nand an epoch-wise slow / meta update make skill training stable while\nadding **zero inference-time model calls** at deployment.\n\nThe deployed artifact is a compact `best_skill.md` (typically 300–2,000\ntokens) that runs against the unchanged target model. Across **six\nbenchmarks, seven target models, and three execution harnesses** (direct\nchat, Codex CLI, Claude Code CLI), SkillOpt is best or tied-best on **all\n52 evaluated (model, benchmark, harness) cells** and on GPT-5.5 lifts the\naverage no-skill accuracy by **+23.5 points in direct chat, +24.8 inside\nthe Codex agentic loop, and +19.1 inside Claude Code**. 
Optimized skill\nartifacts transfer across model scales, between Codex and Claude Code\nharnesses, and to nearby benchmarks without further optimization.\n\nFor the full method, ablations, and per-cell results see the [paper](https://arxiv.org/abs/2605.23904); for a visual walkthrough of the loop see the [project page](https://microsoft.github.io/SkillOpt/); for deeper API / backend / benchmark docs see [`docs/`](docs/).\n\n## 🎬 Demo Video\n\nhttps://github.com/user-attachments/assets/eb12d3bc-371c-467f-904d-91b61f339ed7\n\n\u003cp align=\"center\"\u003e\n  \u003ca href=\"https://youtu.be/JUBMDTCiM0M\"\u003e\u003cb\u003e▶ Watch the full demo on YouTube\u003c/b\u003e\u003c/a\u003e\n\u003c/p\u003e\n\n---\n\n## Extensibility \u0026 WebUI\n\n### Adding a new backend\n\nA backend = a chat / exec target (e.g. `openai_chat`, `claude_chat`,\n`qwen_chat`, `minimax_chat`, `codex_exec`, `claude_code_exec`). See\n[`docs/guide/new-backend.md`](docs/guide/new-backend.md) for the full\ncontract; in short you add a `skillopt/model/\u003cname\u003e_backend.py` module,\nregister it in `skillopt/model/common.py` + `backend_config.py`, and wire\nit through the router in `skillopt/model/__init__.py`. `qwen_backend.py`\nand `minimax_backend.py` are good templates.\n\n### Adding a new benchmark\n\nA benchmark = a `skillopt/envs/\u003cname\u003e/` package with a `dataloader.py`, a\n`rollout.py`, and an `initial.md` seed skill. 
See\n[`docs/guide/new-benchmark.md`](docs/guide/new-benchmark.md) for the full\ncontract; the simplest reference is `skillopt/envs/searchqa/`.\n\n### WebUI\n\nLaunch the monitoring dashboard (optional):\n\n```bash\npip install -e \".[webui]\"\npython -m skillopt_webui.app\n```\n\n| Flag | Default | Description |\n|---|---|---|\n| `--port` | 7860 | Server port |\n| `--host` | `0.0.0.0` | Bind address |\n| `--share` | off | Create a public Gradio share link |\n\n---\n\n## Citation\n\n```bibtex\n@misc{yang2026skilloptexecutivestrategyselfevolving,\n      title={SkillOpt: Executive Strategy for Self-Evolving Agent Skills}, \n      author={Yifan Yang and Ziyang Gong and Weiquan Huang and Qihao Yang and Ziwei Zhou and Zisu Huang and Yan Li and Xuemei Gao and Qi Dai and Bei Liu and Kai Qiu and Yuqing Yang and Dongdong Chen and Xue Yang and Chong Luo},\n      year={2026},\n      eprint={2605.23904},\n      archivePrefix={arXiv},\n      primaryClass={cs.AI},\n      url={https://arxiv.org/abs/2605.23904}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmicrosoft%2FSkillOpt","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmicrosoft%2FSkillOpt","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmicrosoft%2FSkillOpt/lists"}