{"id":43453855,"url":"https://github.com/clay-good/spec-gen","last_synced_at":"2026-04-02T19:20:40.586Z","repository":{"id":335698719,"uuid":"1146753014","full_name":"clay-good/spec-gen","owner":"clay-good","description":"Automate the reverse-engineering of your codebase into structured OpenSpec specifications using static analysis and LLM-powered generation to extract business logic, verify architectural accuracy, and maintain a living source of truth.","archived":false,"fork":false,"pushed_at":"2026-03-26T11:44:50.000Z","size":2045,"stargazers_count":61,"open_issues_count":2,"forks_count":10,"subscribers_count":1,"default_branch":"main","last_synced_at":"2026-03-26T18:57:34.629Z","etag":null,"topics":["api-documentation","automated-documentation","claude-ai","code-analysis","code-archaeology","dependency-graph","developer-tools","documentation-generator","domain-driven-design","gemini","gpt-4","infrastructure-as-code","llm","openspec","reverse-engineering","rfc-2119","software-architecture","spec-driven-development","static-analysis","typescript"],"latest_commit_sha":null,"homepage":"https://www.npmjs.com/package/spec-gen-cli","language":"TypeScript","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/clay-good.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":"AGENTS.md","dco":null,"cla":null}},"created_at":"2026-01-31T16:15:38.000Z","updated_at":"2026-03-26T03:49:49.000Z","dependencies_parsed_at":null,"dependency_job_id":null,"html_url":"https://github.com/clay-goo
d/spec-gen","commit_stats":null,"previous_names":["clay-good/spec-gen"],"tags_count":5,"template":false,"template_full_name":null,"purl":"pkg:github/clay-good/spec-gen","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/clay-good%2Fspec-gen","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/clay-good%2Fspec-gen/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/clay-good%2Fspec-gen/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/clay-good%2Fspec-gen/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/clay-good","download_url":"https://codeload.github.com/clay-good/spec-gen/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/clay-good%2Fspec-gen/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30995508,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-26T18:07:05.776Z","status":"ssl_error","status_checked_at":"2026-03-26T18:07:05.331Z","response_time":114,"last_error":"SSL_read: unexpected eof while 
reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["api-documentation","automated-documentation","claude-ai","code-analysis","code-archaeology","dependency-graph","developer-tools","documentation-generator","domain-driven-design","gemini","gpt-4","infrastructure-as-code","llm","openspec","reverse-engineering","rfc-2119","software-architecture","spec-driven-development","static-analysis","typescript"],"created_at":"2026-02-03T02:52:51.445Z","updated_at":"2026-04-02T19:20:40.552Z","avatar_url":"https://github.com/clay-good.png","language":"TypeScript","readme":"# spec-gen\n\nReverse-engineer [OpenSpec](https://github.com/Fission-AI/OpenSpec) specifications from existing codebases, then keep them in sync as code evolves.\n\n## The Problem\n\nMost software has no specification. The code is the spec, scattered across thousands of files, tribal knowledge, and stale documentation. Tools like `openspec init` create empty scaffolding, but someone still has to write everything. By the time specs are written manually, the code has already changed.\n\nspec-gen automates this. It analyzes your codebase through static analysis, generates structured specifications using an LLM, and continuously detects when code and specs fall out of sync. 
It also exposes the analysis as an MCP server so AI agents can navigate your codebase with GraphRAG-style retrieval — semantic search combined with call-graph expansion and spec-linked peer discovery.\n\n## Quick Start\n\n```bash\n# Install from npm\nnpm install -g spec-gen-cli\n\n# Navigate to your project\ncd /path/to/your-project\n\n# Run the pipeline\nspec-gen init       # Detect project type, create config\nspec-gen analyze    # Static analysis (no API key needed)\nspec-gen generate   # Generate specs (requires API key)\nspec-gen drift      # Check for spec drift\n\n# Troubleshoot setup issues\nspec-gen doctor     # Check environment and configuration\n```\n\n\u003cdetails\u003e\n\u003csummary\u003eInstall from source\u003c/summary\u003e\n\n```bash\ngit clone https://github.com/clay-good/spec-gen\ncd spec-gen\nnpm install \u0026\u0026 npm run build \u0026\u0026 npm link\n```\n\n\u003c/details\u003e\n\n\u003cdetails\u003e\n\u003csummary\u003eNix/NixOS\u003c/summary\u003e\n\n**Run directly:**\n\n```bash\nnix run github:clay-good/spec-gen -- init\nnix run github:clay-good/spec-gen -- analyze\nnix run github:clay-good/spec-gen -- generate\n```\n\n**Temporary shell:**\n\n```bash\nnix shell github:clay-good/spec-gen\nspec-gen --version\n```\n\n**System flake integration:**\n\n```nix\n{\n  inputs.spec-gen.url = \"github:clay-good/spec-gen\";\n\n  outputs = { self, nixpkgs, spec-gen }: {\n    nixosConfigurations.myhost = nixpkgs.lib.nixosSystem {\n      modules = [{\n        environment.systemPackages = [ spec-gen.packages.x86_64-linux.default ];\n      }];\n    };\n  };\n}\n```\n\n**Development:**\n\n```bash\ngit clone https://github.com/clay-good/spec-gen\ncd spec-gen\nnix develop\nnpm run dev\n```\n\n\u003c/details\u003e\n\n## What It Does\n\n**1. 
Analyze** (no API key needed)\n\nScans your codebase using pure static analysis:\n- Walks the directory tree, respects .gitignore, scores files by significance\n- Parses imports and exports to build a dependency graph (TypeScript, JavaScript, Python, Go, Rust, Ruby, Java, C++, Swift)\n- Detects HTTP cross-language edges: matches `fetch`/`axios`/`ky`/`got` calls in JS/TS files to FastAPI/Flask/Django route definitions in Python files, creating cross-language dependency edges with `exact`, `path`, or `fuzzy` confidence\n- Resolves Python absolute imports (`from services.retriever import X`) to local files\n- Synthesizes cross-file dependency edges from call-graph data for languages without file-level imports (Swift, C++), so the dependency graph and viewer are meaningful even in single-module projects\n- Clusters related files into structural business domains automatically\n- Produces structured context that makes LLM generation more accurate\n\n**2. Generate** (API key required)\n\nSends the analysis context to an LLM to produce specifications:\n- Stage 1: Project survey and categorization\n- Stage 2: Entity extraction (core data models)\n- Stage 3: Service analysis (business logic)\n- Stage 4: API extraction (HTTP endpoints)\n- Stage 5: Architecture synthesis (overall structure)\n- Stage 6: ADR enrichment (Architecture Decision Records, with `--adr`)\n\n**3. Verify** (API key required)\n\nTests generated specs by predicting file contents from specs alone, then comparing predictions to actual code. Reports an accuracy score and identifies gaps.\n\n**4. 
Drift Detection** (no API key needed)\n\nCompares git changes against spec file mappings to find divergence:\n- **Gap**: Code changed but its spec was not updated\n- **Stale**: Spec references deleted or renamed files\n- **Uncovered**: New files with no matching spec domain\n- **Orphaned**: Spec declares files that no longer exist\n- **ADR gap**: Code changed in a domain referenced by an ADR\n- **ADR orphaned**: ADR references domains that no longer exist in specs\n\n## Architecture\n\n```mermaid\ngraph TD\n    subgraph CLI[\"CLI Layer\"]\n        CMD[spec-gen commands]\n    end\n\n    subgraph API[\"Programmatic API\"]\n        API_INIT[specGenInit]\n        API_ANALYZE[specGenAnalyze]\n        API_GENERATE[specGenGenerate]\n        API_VERIFY[specGenVerify]\n        API_DRIFT[specGenDrift]\n        API_RUN[specGenRun]\n    end\n\n    subgraph Core[\"Core Layer\"]\n        direction TB\n\n        subgraph Init[\"Init\"]\n            PD[Project Detector]\n            CM[Config Manager]\n        end\n\n        subgraph Analyze[\"Analyze -- no API key\"]\n            FW[File Walker] --\u003e SS[Significance Scorer]\n            SS --\u003e IP[Import Parser]\n            IP --\u003e DG[Dependency Graph]\n            SS --\u003e HR[HTTP Route Parser]\n            HR --\u003e|cross-language edges| DG\n            DG --\u003e RM[Repository Mapper]\n            RM --\u003e AG[Artifact Generator]\n        end\n\n        subgraph Generate[\"Generate -- API key required\"]\n            SP[Spec Pipeline] --\u003e FF[OpenSpec Formatter]\n            FF --\u003e OW[OpenSpec Writer]\n            SP --\u003e ADR[ADR Generator]\n        end\n\n        subgraph Verify[\"Verify -- API key required\"]\n            VE[Verification Engine]\n        end\n\n        subgraph Drift[\"Drift -- no API key\"]\n            GA[Git Analyzer] --\u003e SM[Spec Mapper]\n            SM --\u003e DD[Drift Detector]\n            DD -.-\u003e|optional| LE[LLM Enhancer]\n        end\n\n        LLM[LLM 
Service -- Anthropic / OpenAI / Compatible]\n    end\n\n    CMD --\u003e API_INIT \u0026 API_ANALYZE \u0026 API_GENERATE \u0026 API_VERIFY \u0026 API_DRIFT\n    API_RUN --\u003e API_INIT \u0026 API_ANALYZE \u0026 API_GENERATE\n\n    API_INIT --\u003e Init\n    API_ANALYZE --\u003e Analyze\n    API_GENERATE --\u003e Generate\n    API_VERIFY --\u003e Verify\n    API_DRIFT --\u003e Drift\n\n    Generate --\u003e LLM\n    Verify --\u003e LLM\n    LE -.-\u003e LLM\n\n    AG --\u003e|analysis artifacts| SP\n    AG --\u003e|analysis artifacts| VE\n\n    subgraph Output[\"Output\"]\n        SPECS[openspec/specs/*.md]\n        ADRS[openspec/decisions/*.md]\n        ANALYSIS[.spec-gen/analysis/]\n        REPORT[Drift Report]\n    end\n\n    OW --\u003e SPECS\n    ADR --\u003e ADRS\n    AG --\u003e ANALYSIS\n    DD --\u003e REPORT\n```\n\n## Drift Detection\n\nDrift detection is the core of ongoing spec maintenance. It runs in milliseconds, needs no API key, and works entirely from git diffs and spec file mappings.\n\n```bash\n$ spec-gen drift\n\n  Spec Drift Detection\n\n  Analyzing git changes...\n  Base ref: main\n  Branch: feature/add-notifications\n  Changed files: 12\n\n  Loading spec mappings...\n  Spec domains: 6\n  Mapped source files: 34\n\n  Detecting drift...\n\n   Issues Found: 3\n\n   [ERROR] gap: src/services/user-service.ts\n      Spec: openspec/specs/user/spec.md\n      File changed (+45/-12 lines) but spec was not updated\n\n   [WARNING] uncovered: src/services/email-queue.ts\n      New file has no matching spec domain\n\n   [INFO] adr-gap: openspec/decisions/adr-0001-jwt-auth.md\n      Code changed in domain(s) auth referenced by ADR-001\n\n   Summary:\n     Gaps: 1\n     Uncovered: 1\n     ADR gaps: 1\n```\n\n### ADR Drift Detection\n\nWhen `openspec/decisions/` contains Architecture Decision Records, drift detection automatically checks whether code changes affect domains referenced by ADRs. 
ADR issues are reported at `info` severity since code changes rarely invalidate architectural decisions. Superseded and deprecated ADRs are excluded.\n\n### LLM-Enhanced Mode\n\nStatic drift detection catches structural changes but cannot tell whether a change actually affects spec-documented behavior. A variable rename triggers the same alert as a genuine behavior change.\n\n`--use-llm` post-processes gap issues by sending each file's diff and its matching spec to the LLM. The LLM classifies each gap as relevant (keeps the alert) or not relevant (downgrades to info). This reduces false positives.\n\n```bash\nspec-gen drift              # Static mode: fast, deterministic\nspec-gen drift --use-llm    # LLM-enhanced: fewer false positives\n```\n\n## CI/CD Integration\n\nspec-gen is designed to run in automated pipelines. The deterministic commands (`init`, `analyze`, `drift`) need no API key and produce consistent results.\n\n### Pre-Commit Hook\n\n```bash\nspec-gen drift --install-hook     # Install\nspec-gen drift --uninstall-hook   # Remove\n```\n\nThe hook runs in static mode (fast, no API key needed) and blocks commits when drift is detected at warning level or above.\n\n### GitHub Actions / CI Pipelines\n\n```yaml\n# .github/workflows/spec-drift.yml\nname: Spec Drift Check\non: [pull_request]\njobs:\n  drift:\n    runs-on: ubuntu-latest\n    steps:\n      - uses: actions/checkout@v4\n        with:\n          fetch-depth: 0    # Full history needed for git diff\n      - uses: actions/setup-node@v4\n        with:\n          node-version: '20'\n      - run: npm install -g spec-gen-cli\n      - run: spec-gen drift --fail-on error --json\n```\n\n```bash\n# Or in any CI script\nspec-gen drift --fail-on error --json    # JSON output, fail on errors only\nspec-gen drift --fail-on warning         # Fail on warnings too\nspec-gen drift --domains auth,user       # Check specific domains\nspec-gen drift --no-color                # Plain output for CI logs\n```\n\n### 
Deterministic vs. LLM-Enhanced\n\n| | Deterministic (Default) | LLM-Enhanced |\n|---|---|---|\n| **API key** | No | Yes |\n| **Speed** | Milliseconds | Seconds per LLM call |\n| **Commands** | `analyze`, `drift`, `init` | `generate`, `verify`, `drift --use-llm` |\n| **Reproducibility** | Identical every run | May vary |\n| **Best for** | CI, pre-commit hooks, quick checks | Initial generation, reducing false positives |\n\n## LLM Providers\n\nspec-gen supports eight providers. The default is Anthropic Claude.\n\n| Provider | `provider` value | API key env var | Default model |\n|----------|-----------------|-----------------|---------------|\n| Anthropic Claude | `anthropic` *(default)* | `ANTHROPIC_API_KEY` | `claude-sonnet-4-20250514` |\n| OpenAI | `openai` | `OPENAI_API_KEY` | `gpt-4o` |\n| OpenAI-compatible *(Mistral, Groq, Ollama...)* | `openai-compat` | `OPENAI_COMPAT_API_KEY` | `mistral-large-latest` |\n| GitHub Copilot *(via copilot-api proxy)* | `copilot` | *(none)* | `gpt-4o` |\n| Google Gemini | `gemini` | `GEMINI_API_KEY` | `gemini-2.0-flash` |\n| Gemini CLI | `gemini-cli` | *(none)* | *(CLI default)* |\n| Claude Code | `claude-code` | *(none)* | *(CLI default)* |\n| Mistral Vibe | `mistral-vibe` | *(none)* | *(CLI default)* |\n\n### Selecting a provider\n\nSet `provider` (and optionally `model`) in the `generation` block of `.spec-gen/config.json`:\n\n```json\n{\n  \"generation\": {\n    \"provider\": \"openai\",\n    \"model\": \"gpt-4o-mini\",\n    \"domains\": \"auto\"\n  }\n}\n```\n\nOverride the model for a single run:\n```bash\nspec-gen generate --model claude-opus-4-20250514\n```\n\n### OpenAI-compatible servers (Ollama, Mistral, Groq, LM Studio, vLLM...)\n\nUse `provider: \"openai-compat\"` with a base URL and API key:\n\n**Environment variables:**\n```bash\nexport OPENAI_COMPAT_BASE_URL=http://localhost:11434/v1   # Ollama, LM Studio, local servers\nexport OPENAI_COMPAT_API_KEY=ollama                       # any non-empty value for local 
servers\n                                                          # use your real API key for cloud providers (Mistral, Groq...)\n```\n\n**Config file** (per-project):\n```json\n{\n  \"generation\": {\n    \"provider\": \"openai-compat\",\n    \"model\": \"llama3.2\",\n    \"openaiCompatBaseUrl\": \"http://localhost:11434/v1\",\n    \"domains\": \"auto\"\n  }\n}\n```\n\n**Self-signed certificates** (internal servers, VPN endpoints):\n```bash\nspec-gen generate --insecure\n```\nOr in `config.json`:\n```json\n{\n  \"generation\": {\n    \"provider\": \"openai-compat\",\n    \"openaiCompatBaseUrl\": \"https://internal-llm.corp.net/v1\",\n    \"skipSslVerify\": true,\n    \"domains\": \"auto\"\n  }\n}\n```\n\nWorks with: Ollama, LM Studio, Mistral AI, Groq, Together AI, LiteLLM, vLLM,\ntext-generation-inference, LocalAI, Azure OpenAI, and any `/v1/chat/completions` server.\n\n### GitHub Copilot (via copilot-api proxy)\n\nUse `provider: \"copilot\"` to generate specs using your GitHub Copilot subscription via the\n[copilot-api](https://github.com/ericc-ch/copilot-api) proxy, which exposes an OpenAI-compatible\nendpoint from your Copilot credentials.\n\n**Setup:**\n1. Install and start the copilot-api proxy:\n   ```bash\n   npx copilot-api\n   ```\n   By default it listens on `http://localhost:4141`.\n\n2. Configure spec-gen:\n   ```json\n   {\n     \"generation\": {\n       \"provider\": \"copilot\",\n       \"model\": \"gpt-4o\",\n       \"domains\": \"auto\"\n     }\n   }\n   ```\n\n**Environment variables** (optional):\n```bash\nexport COPILOT_API_BASE_URL=http://localhost:4141/v1   # default\nexport COPILOT_API_KEY=copilot                         # default, only needed if proxy requires auth\n```\n\nNo API key is required — the copilot-api proxy handles authentication via your GitHub Copilot session.\n\n### CLI-based providers (no API key)\n\nThree providers route LLM calls through local CLI tools instead of HTTP APIs. 
No API key or configuration is needed — just have the CLI installed and on your PATH.\n\n| Provider | CLI binary | Install |\n|----------|-----------|---------|\n| `claude-code` | `claude` | [Claude Code](https://docs.anthropic.com/en/docs/claude-code) (requires Claude Max/Pro subscription) |\n| `gemini-cli` | `gemini` | [Gemini CLI](https://github.com/google-gemini/gemini-cli) (free tier with Google account) |\n| `mistral-vibe` | `vibe` | [Mistral Vibe](https://github.com/mistralai/mistral-vibe) (standalone binary) |\n\n```json\n{\n  \"generation\": {\n    \"provider\": \"claude-code\",\n    \"domains\": \"auto\"\n  }\n}\n```\n\n### Custom base URL for Anthropic or OpenAI\n\nTo redirect the built-in Anthropic or OpenAI provider to a proxy or self-hosted endpoint:\n\n```bash\n# CLI (one-off)\nspec-gen generate --api-base https://my-proxy.corp.net/v1\n\n# Environment variable\nexport ANTHROPIC_API_BASE=https://my-proxy.corp.net/v1\nexport OPENAI_API_BASE=https://my-proxy.corp.net/v1\n```\n\nOr in `config.json` under the `llm` block:\n```json\n{\n  \"llm\": {\n    \"apiBase\": \"https://my-proxy.corp.net/v1\",\n    \"sslVerify\": false\n  }\n}\n```\n\n`sslVerify: false` disables TLS certificate validation -- use only for internal servers with self-signed certificates.\n\nPriority: CLI flags \u003e environment variables \u003e config file \u003e provider defaults.\n\n## Commands\n\n| Command | Description | API Key |\n|---------|-------------|---------|\n| `spec-gen init` | Initialize configuration | No |\n| `spec-gen analyze` | Run static analysis | No |\n| `spec-gen generate` | Generate specs from analysis | Yes |\n| `spec-gen generate --adr` | Also generate Architecture Decision Records | Yes |\n| `spec-gen verify` | Verify spec accuracy | Yes |\n| `spec-gen drift` | Detect spec drift (static) | No |\n| `spec-gen drift --use-llm` | Detect spec drift (LLM-enhanced) | Yes |\n| `spec-gen run` | Full pipeline: init, analyze, generate | Yes |\n| `spec-gen view` | Launch 
interactive graph \u0026 spec viewer in the browser | No |\n| `spec-gen mcp` | Start MCP server (stdio, for Cline / Claude Code) | No |\n| `spec-gen doctor` | Check environment and configuration for common issues | No |\n| `spec-gen refresh-stories` | Re-annotate story files with stale risk context | No |\n\n### Global Options\n\n```bash\n--api-base \u003curl\u003e       # Custom LLM API base URL (proxy / self-hosted)\n--insecure             # Disable SSL certificate verification\n--config \u003cpath\u003e        # Config file path (default: .spec-gen/config.json)\n-q, --quiet            # Errors only\n-v, --verbose          # Debug output\n--no-color             # Plain text output (enables timestamps)\n```\n\nGenerate-specific options:\n```bash\n--model \u003cname\u003e         # Override LLM model (e.g. gpt-4o-mini, llama3.2)\n```\n\n### Drift Options\n\n```bash\nspec-gen drift [options]\n  --base \u003cref\u003e           # Git ref to compare against (default: auto-detect)\n  --files \u003cpaths\u003e        # Specific files to check (comma-separated)\n  --domains \u003clist\u003e       # Only check specific domains\n  --use-llm              # LLM semantic analysis\n  --json                 # JSON output\n  --fail-on \u003cseverity\u003e   # Exit non-zero threshold: error, warning, info\n  --max-files \u003cn\u003e        # Max changed files to analyze (default: 100)\n  --verbose              # Show detailed issue information\n  --install-hook         # Install pre-commit hook\n  --uninstall-hook       # Remove pre-commit hook\n```\n\n### Generate Options\n\n```bash\nspec-gen generate [options]\n  --model \u003cname\u003e         # LLM model to use\n  --dry-run              # Preview without writing\n  --domains \u003clist\u003e       # Only generate specific domains\n  --merge                # Merge with existing specs\n  --no-overwrite         # Skip existing files\n  --adr                  # Also generate ADRs\n  --adr-only             # Generate only ADRs\n 
 --reanalyze            # Force fresh analysis even if recent exists\n  --analysis \u003cpath\u003e      # Path to existing analysis directory\n  --output-dir \u003cpath\u003e    # Override openspec output location\n  -y, --yes              # Skip confirmation prompts\n```\n\n### Run Options\n\n```bash\nspec-gen run [options]\n  --force                # Reinitialize even if config exists\n  --reanalyze            # Force fresh analysis even if recent exists\n  --model \u003cname\u003e         # LLM model to use for generation\n  --dry-run              # Show what would be done without making changes\n  -y, --yes              # Skip all confirmation prompts\n  --max-files \u003cn\u003e        # Maximum files to analyze (default: 500)\n  --adr                  # Also generate Architecture Decision Records\n```\n\n### Analyze Options\n\n```bash\nspec-gen analyze [options]\n  --output \u003cpath\u003e        # Output directory (default: .spec-gen/analysis/)\n  --max-files \u003cn\u003e        # Max files (default: 500)\n  --include \u003cglob\u003e       # Additional include patterns\n  --exclude \u003cglob\u003e       # Additional exclude patterns\n  --force                # Force re-analysis (bypass 1-hour cache)\n  --no-embed             # Skip building the semantic vector index (index is built by default when embedding is configured)\n  --reindex-specs        # Re-index OpenSpec specs into the vector index without re-running full analysis\n```\n\n### Verify Options\n\n```bash\nspec-gen verify [options]\n  --samples \u003cn\u003e          # Files to verify (default: 5)\n  --threshold \u003c0-1\u003e      # Minimum score to pass (default: 0.7)\n  --files \u003cpaths\u003e        # Specific files to verify\n  --domains \u003clist\u003e       # Only verify specific domains\n  --verbose              # Show detailed prediction vs actual comparison\n  --json                 # JSON output\n```\n\n### Doctor\n\n`spec-gen doctor` runs a self-diagnostic and surfaces 
actionable fixes when something is misconfigured or missing:\n\n```bash\nspec-gen doctor          # Run all checks\nspec-gen doctor --json   # JSON output for scripting\n```\n\nChecks performed:\n\n| Check | What it looks for |\n|-------|------------------|\n| Node.js version | ≥ 20 required |\n| Git repository | `.git` directory and `git` binary on PATH |\n| spec-gen config | `.spec-gen/config.json` exists and is parseable |\n| Analysis artifacts | `repo-structure.json` freshness (warns if \u003e24h old) |\n| OpenSpec directory | `openspec/specs/` exists |\n| LLM provider | API key or `claude` CLI detected |\n| Disk space | Warns \u003c 500 MB, fails \u003c 200 MB |\n\nRun `spec-gen doctor` whenever setup instructions aren't working — it tells you exactly what to fix and how.\n\n## Agent Setup\n\n### Passive context vs active tools\n\nAgents have two ways to acquire knowledge about your codebase:\n\n- **Passive (zero friction):** files referenced in `CLAUDE.md` / `.clinerules` are read automatically at session start, before the agent even processes your first message. No decision required, no tool call to make.\n- **Active (friction):** MCP tools must be consciously selected from a list, called, and their output integrated into context. Even when the information would help, agents often skip this step and read files directly instead — it's always the safe fallback.\n\nThis means architectural context delivered passively is far more reliably absorbed than context behind a tool call. 
`spec-gen analyze` generates `.spec-gen/analysis/CODEBASE.md` specifically for this purpose: a compact, agent-optimized digest (~100 lines) that agents absorb at session start without any decision cost.\n\n### What CODEBASE.md contains\n\nGenerated from the static analysis artifacts, it surfaces what an agent most needs before touching code:\n\n- **Entry points** — functions with no internal callers (where execution starts)\n- **Critical hubs** — highest fan-in functions (most risky to modify)\n- **Spec domains** — which `openspec/specs/` domains exist and what they cover\n- **Most coupled files** — high in-degree in the dependency graph (touch with care)\n- **God functions / oversized orchestrators** — complexity hotspots\n- **Layer violations** — if any\n\nThis is structural signal, not prose. It complements `openspec/specs/overview/spec.md`, which provides the functional view (what the system does, what domains exist). Together they give agents both the architectural topology and the business intent.\n\n### Setup\n\nAfter running `spec-gen analyze`, wire the generated digest into your agent's context:\n\n**Claude Code** — add to `CLAUDE.md`:\n\n```markdown\n@.spec-gen/analysis/CODEBASE.md\n@openspec/specs/overview/spec.md\n\n## spec-gen MCP tools — when to use them\n\n| Situation | Tool |\n|-----------|------|\n| Starting any new task | `orient` — returns functions, files, specs, call paths, and insertion points in one call |\n| Don't know which file/function handles a concept | `search_code` |\n| Need call topology across many files | `get_subgraph` / `analyze_impact` |\n| Planning where to add a feature | `suggest_insertion_points` |\n| Reading a spec before writing code | `get_spec` |\n| Checking if code still matches spec | `check_spec_drift` |\n| Finding spec requirements by meaning | `search_specs` |\n\nFor all other cases (reading a file, grepping, listing files) use native tools directly.\n```\n\n**Cline / Roo Code / Kilocode** — create 
`.clinerules/spec-gen.md`:\n\n```markdown\n# spec-gen\n\nspec-gen provides static analysis artifacts and MCP tools to help you navigate this codebase.\nAlways use these before writing or modifying code.\n\n## Before starting any task\n\n- Read `.spec-gen/analysis/CODEBASE.md` — architectural digest: entry points, critical hubs,\n  god functions, most-coupled files, and available spec domains. Generated locally by `spec-gen analyze`.\n- Read `openspec/specs/overview/spec.md` — functional domain map: what the system does,\n  which domains exist, data-flow requirements.\n\n## MCP tools — use these instead of grep/read when exploring\n\n- **Orient first**: call `orient` at the start of every task — it returns relevant functions,\n  files, specs, call paths, and insertion points in one shot.\n- **Finding code**: use `search_code` when you don't know which file or function handles a concept.\n- **Call topology**: use `get_subgraph` or `analyze_impact` when you need to understand\n  how calls flow across multiple files (not just a single file).\n- **Adding a feature**: call `suggest_insertion_points` before deciding where to add code —\n  it accounts for the dependency graph, not just filenames.\n- **Reading specs**: call `get_spec \u003cdomain\u003e` before writing code in that domain;\n  use `search_specs` to find requirements by meaning when you don't know the domain name.\n- **Checking drift**: call `check_spec_drift` after modifying a file to verify it still\n  matches its spec — do not skip this step before opening a PR.\n\nUse native tools (Read, Grep, Glob) only for cases not covered above.\n```\n\n`CODEBASE.md` gives the agent passive architectural context. `overview/spec.md` gives the functional domain map. 
The table tells it when the passive context isn't enough and an active MCP tool call is warranted.\n\n\u003e **Tip:** `spec-gen analyze` prints these snippets after every run as a reminder.\n\n\u003e **Note:** `.spec-gen/analysis/` is git-ignored — each developer generates it locally. Re-run `spec-gen analyze` after significant structural changes to keep the digest current.\n\n---\n\n## MCP Server\n\n`spec-gen mcp` starts spec-gen as a [Model Context Protocol](https://modelcontextprotocol.io/) server over stdio, exposing static analysis as tools that any MCP-compatible AI agent (Cline, Roo Code, Kilocode, Claude Code, Cursor...) can call directly -- no API key required.\n\n### Setup\n\n**Claude Code** -- add a `.mcp.json` at your project root:\n\n```json\n{\n  \"mcpServers\": {\n    \"spec-gen\": {\n      \"command\": \"spec-gen\",\n      \"args\": [\"mcp\"]\n    }\n  }\n}\n```\n\nor for local development:\n\n```json\n{\n  \"mcpServers\": {\n    \"spec-gen\": {\n      \"command\": \"node\",\n      \"args\": [\"/absolute/path/to/spec-gen/dist/cli/index.js\", \"mcp\"]\n    }\n  }\n}\n```\n\n**Cline / Roo Code / Kilocode** -- add the same block under `mcpServers` in the MCP settings JSON of your editor.\n\n### Watch mode (keep search_code and orient fresh)\n\nBy default the MCP server reads `llm-context.json` from the last `analyze` run. With `--watch-auto`, it also watches source files for changes and incrementally re-indexes signatures so `search_code` and `orient` reflect your latest edits without waiting for the next commit.\n\nAdd `--watch-auto` to your MCP config args:\n\n```json\n{\n  \"mcpServers\": {\n    \"spec-gen\": {\n      \"command\": \"spec-gen\",\n      \"args\": [\"mcp\", \"--watch-auto\"]\n    }\n  }\n}\n```\n\nThe watcher starts automatically on the first tool call — no hardcoded path needed. It re-extracts signatures for any changed source file and patches `llm-context.json` within ~500 ms of a save. 
If an embedding server is reachable, it also re-embeds changed functions into the vector index automatically. The call graph is not rebuilt on every change; it stays current via the [post-commit hook](#cicd-integration) (`spec-gen analyze --force`).\n\n| Option | Default | Description |\n|---|---|---|\n| `--watch-auto` | off | Auto-detect project root from first tool call |\n| `--watch \u003cdir\u003e` | — | Watch a fixed directory (alternative to `--watch-auto`) |\n| `--watch-debounce \u003cms\u003e` | 400 | Delay before re-indexing after a file change |\n\n### Cline / Roo Code / Kilocode\n\nFor editors with MCP support, after adding the `mcpServers` block to your settings, download the slash command workflows:\n\n```bash\nmkdir -p .clinerules/workflows\ncurl -sL https://raw.githubusercontent.com/clay-good/spec-gen/main/examples/cline-workflows/spec-gen-analyze-codebase.md -o .clinerules/workflows/spec-gen-analyze-codebase.md\ncurl -sL https://raw.githubusercontent.com/clay-good/spec-gen/main/examples/cline-workflows/spec-gen-check-spec-drift.md -o .clinerules/workflows/spec-gen-check-spec-drift.md\ncurl -sL https://raw.githubusercontent.com/clay-good/spec-gen/main/examples/cline-workflows/spec-gen-plan-refactor.md -o .clinerules/workflows/spec-gen-plan-refactor.md\ncurl -sL https://raw.githubusercontent.com/clay-good/spec-gen/main/examples/cline-workflows/spec-gen-execute-refactor.md -o .clinerules/workflows/spec-gen-execute-refactor.md\ncurl -sL https://raw.githubusercontent.com/clay-good/spec-gen/main/examples/cline-workflows/spec-gen-implement-feature.md -o .clinerules/workflows/spec-gen-implement-feature.md\ncurl -sL https://raw.githubusercontent.com/clay-good/spec-gen/main/examples/cline-workflows/spec-gen-refactor-codebase.md -o .clinerules/workflows/spec-gen-refactor-codebase.md\n```\n\nAvailable commands:\n\n| Command | What it does |\n|---------|-------------|\n| `/spec-gen-analyze-codebase` | Runs `analyze_codebase`, summarises the results (project 
type, file count, top 3 refactor issues, detected domains), shows the call graph highlights, and suggests next steps. |\n| `/spec-gen-check-spec-drift` | Runs `check_spec_drift`, presents issues by severity (gap / stale / uncovered / orphaned-spec), shows per-kind remediation commands, and optionally drills into affected file signatures. |\n| `/spec-gen-plan-refactor` | Runs static analysis, picks the highest-priority target with coverage gate, assesses impact and call graph, then writes a detailed plan to `.spec-gen/refactor-plan.md`. No code changes. |\n| `/spec-gen-execute-refactor` | Reads `.spec-gen/refactor-plan.md`, establishes a green baseline, and applies each planned change one at a time -- with diff verification and test run after every step. Optional final step covers dead-code detection and naming alignment (requires `spec-gen generate`). |\n| `/spec-gen-implement-feature` | Plans and implements a new feature with full architectural context: architecture overview, OpenSpec requirements, insertion points, implementation, and drift check. |\n| `/spec-gen-refactor-codebase` | Convenience redirect that runs `/spec-gen-plan-refactor` followed by `/spec-gen-execute-refactor`. |\n\nAll six commands ask which directory to use, call the MCP tools directly, and guide you through the results without leaving the editor. 
They work in any editor that supports the `.clinerules/workflows/` convention.\n\n### Claude Skills\n\nFor Claude Code, copy the skill files to `.claude/skills/` in your project:\n\n```bash\nmkdir -p .claude/skills\ncurl -sL https://raw.githubusercontent.com/clay-good/spec-gen/main/skills/claude-spec-gen.md -o .claude/skills/claude-spec-gen.md\ncurl -sL https://raw.githubusercontent.com/clay-good/spec-gen/main/skills/openspec-skill.md -o .claude/skills/openspec-skill.md\n```\n\n**Spec-Gen Skill** (`claude-spec-gen.md`) — Code archaeology skill that guides Claude through:\n- Project type detection and domain identification\n- Entity extraction, service analysis, API extraction\n- Architecture synthesis and OpenSpec spec generation\n\n**OpenSpec Skill** (`openspec-skill.md`) — Skill for working with OpenSpec specifications:\n- Semantic spec search with `search_specs`\n- List domains with `list_spec_domains`\n- Navigate requirements and scenarios\n\n### Tools\n\nAll tools run on **pure static analysis** -- no LLM quota consumed.\n\n**Analysis**\n\n| Tool | Description | Requires prior analysis |\n|------|-------------|:---:|\n| `analyze_codebase` | Run full static analysis: repo structure, dependency graph, call graph (hub functions, entry points, layer violations), and top refactoring priorities. Results cached for 1 hour (`force: true` to bypass). | No |\n| `get_call_graph` | Hub functions (high fan-in), entry points (no internal callers), and architectural layer violations. Supports TypeScript, JavaScript, Python, Go, Rust, Ruby, Java, C++, Swift. | Yes |\n| `get_signatures` | Compact function/class signatures per file. Filter by path substring with `filePattern`. Useful for understanding a module's public API without reading full source. | Yes |\n| `get_duplicate_report` | Detect duplicate code: Type 1 (exact clones), Type 2 (structural -- renamed variables), Type 3 (near-clones with Jaccard similarity \u003e= 0.7). Groups sorted by impact. 
| Yes |\n\n**Refactoring**\n\n| Tool | Description | Requires prior analysis |\n|------|-------------|:---:|\n| `get_refactor_report` | Prioritized list of functions with structural issues: unreachable code, hub overload (high fan-in), god functions (high fan-out), SRP violations, cyclic dependencies. | Yes |\n| `analyze_impact` | Deep impact analysis for a specific function: fan-in/fan-out, upstream call chain, downstream critical path, risk score (0-100), blast radius, and recommended strategy. | Yes |\n| `get_low_risk_refactor_candidates` | Safest functions to refactor first: low fan-in, low fan-out, not a hub, no cyclic involvement. Best starting point for incremental, low-risk sessions. | Yes |\n| `get_leaf_functions` | Functions that make no internal calls (leaves of the call graph). Zero downstream blast radius. Sorted by fan-in by default -- most-called leaves have the best unit-test ROI. | Yes |\n| `get_critical_hubs` | Highest-impact hub functions ranked by criticality. Each hub gets a stability score (0-100) and a recommended approach: extract, split, facade, or delegate. | Yes |\n| `get_god_functions` | Detect god functions (high fan-out, likely orchestrators) in the project or in a specific file, and return their call-graph neighborhood. Use this to identify which functions need to be refactored and understand what logical blocks to extract. | Yes |\n\n**Navigation**\n\n| Tool | Description | Requires prior analysis |\n|------|-------------|:---:|\n| `orient` | **Single entry point for any new task.** Given a natural-language task description, returns in one call: relevant functions, source files, spec domains, call neighbourhoods, insertion-point candidates, and matching spec sections. Start here. | Yes (+ embedding) |\n| `get_subgraph` | Depth-limited subgraph centred on a function. Direction: `downstream` (what it calls), `upstream` (who calls it), or `both`. Output as JSON or Mermaid diagram. 
| Yes |\n| `get_architecture_overview` | High-level cluster map: roles (entry layer, orchestrator, core utilities, API layer, internal), inter-cluster dependencies, global entry points, and critical hubs. No LLM required. | Yes |\n| `get_function_skeleton` | Noise-stripped view of a source file: logs, inline comments, and non-JSDoc block comments removed. Signatures, control flow, return/throw, and call expressions preserved. Returns reduction %. | No |\n| `get_function_body` | Return the exact source code of a named function in a file. | No |\n| `get_file_dependencies` | Return the file-level import dependencies for a given source file (imports, imported-by, or both). | Yes |\n| `trace_execution_path` | Find all call-graph paths between two functions (DFS, configurable depth/max-paths). Use this when debugging: \"how does request X reach function Y?\" Returns shortest path, all paths sorted by hops, and a step-by-step chain per path. | Yes |\n| `suggest_insertion_points` | Semantic search over the vector index to find the best existing functions to extend or hook into when implementing a new feature. Returns ranked candidates with role and strategy. Falls back to BM25 keyword search when no embedding server is configured. | Yes (+ embedding) |\n| `search_code` | Natural-language semantic search over indexed functions. Returns the closest matches by meaning with similarity score, call-graph neighbourhood enrichment, and spec-linked peer functions. Falls back to BM25 keyword search when no embedding server is configured. | Yes (+ embedding) |\n\n**Specs**\n\n| Tool | Description | Requires prior analysis |\n|------|-------------|:---:|\n| `get_spec` | Read the full content of an OpenSpec domain spec by domain name. | Yes (generate) |\n| `get_mapping` | Requirement-\u003efunction mapping produced by `spec-gen generate`. Shows which functions implement which spec requirements, confidence level, and orphan functions with no spec coverage. 
| Yes (generate) |\n| `get_decisions` | List or search Architecture Decision Records (ADRs) stored in `openspec/decisions/`. Optional keyword query. | Yes (generate) |\n| `check_spec_drift` | Detect code changes not reflected in OpenSpec specs. Compares git-changed files against spec coverage maps. Issues: gap / stale / uncovered / orphaned-spec / adr-gap. | Yes (generate) |\n| `search_specs` | Semantic search over OpenSpec specifications to find requirements, design notes, and architecture decisions by meaning. Returns linked source files for graph highlighting. Use this when asked \"which spec covers X?\" or \"where should we implement Z?\". Requires a spec index built with `spec-gen analyze` or `--reindex-specs`. | Yes (generate) |\n| `list_spec_domains` | List all OpenSpec domains available in this project. Use this to discover what domains exist before doing a targeted `search_specs` call. | Yes (generate) |\n| `generate_change_proposal` | Given a story/task description, chains `orient` + `search_specs` + `analyze_impact` to produce a structured change proposal at `openspec/changes/{slug}/proposal.md` with affected domains, risk scores, and insertion points. | Yes |\n| `annotate_story` | Reads a story/task markdown file, runs structural analysis, and patches a `## Risk Context` section in-place with domains, risk scores, blocking refactors, and insertion points. | Yes |\n\n### Parameters\n\n**`orient`**\n```\ndirectory  string   Absolute path to the project directory\ntask       string   Natural-language description of the task, e.g. 
\"add rate limiting to the API\"\nlimit      number   Max relevant functions to return (default: 5, max: 20)\n```\n\n**`analyze_codebase`**\n```\ndirectory  string   Absolute path to the project directory\nforce      boolean  Force re-analysis even if cache is fresh (default: false)\n```\n\n**`get_refactor_report`**, **`get_call_graph`**\n```\ndirectory  string   Absolute path to the project directory\n```\n\n**`get_signatures`**\n```\ndirectory    string   Absolute path to the project directory\nfilePattern  string   Optional path substring filter (e.g. \"services\", \".py\")\n```\n\n**`get_subgraph`**\n```\ndirectory     string   Absolute path to the project directory\nfunctionName  string   Function name to centre on (case-insensitive partial match)\ndirection     string   \"downstream\" | \"upstream\" | \"both\"  (default: \"downstream\")\nmaxDepth      number   BFS traversal depth limit  (default: 3)\nformat        string   \"json\" | \"mermaid\"  (default: \"json\")\n```\n\n*Note: If no exact name match is found, `get_subgraph` falls back to semantic search (when a vector index is available) to find the most similar function.*\n\n**`get_mapping`**\n```\ndirectory    string    Absolute path to the project directory\ndomain       string    Optional domain filter (e.g. 
\"auth\", \"crawler\")\norphansOnly  boolean   Return only orphan functions (default: false)\n```\n\n**`get_duplicate_report`**\n```\ndirectory  string   Absolute path to the project directory\n```\n\n**`check_spec_drift`**\n```\ndirectory  string    Absolute path to the project directory\nbase       string    Git ref to compare against (default: auto-detect main/master)\nfiles      string[]  Specific files to check (default: all changed files)\ndomains    string[]  Only check these spec domains (default: all)\nfailOn     string    Minimum severity to report: \"error\" | \"warning\" | \"info\" (default: \"warning\")\nmaxFiles   number    Max changed files to analyze (default: 100)\n```\n\n**`list_spec_domains`**\n```\ndirectory  string   Absolute path to the project directory\n```\n\n**`analyze_impact`**\n```\ndirectory  string   Absolute path to the project directory\nsymbol     string   Function or method name (exact or partial match)\ndepth      number   Traversal depth for upstream/downstream chains (default: 2)\n```\n\n*Note: If no exact name match is found, `analyze_impact` falls back to semantic search (when a vector index is available) to find the most similar function.*\n\n**`get_low_risk_refactor_candidates`**\n```\ndirectory    string   Absolute path to the project directory\nlimit        number   Max candidates to return (default: 5)\nfilePattern  string   Optional path substring filter (e.g. 
\"services\", \".py\")\n```\n\n**`get_leaf_functions`**\n```\ndirectory    string   Absolute path to the project directory\nlimit        number   Max results to return (default: 20)\nfilePattern  string   Optional path substring filter\nsortBy       string   \"fanIn\" (default) | \"name\" | \"file\"\n```\n\n**`get_critical_hubs`**\n```\ndirectory  string   Absolute path to the project directory\nlimit      number   Max hubs to return (default: 10)\nminFanIn   number   Minimum fan-in threshold to be considered a hub (default: 3)\n```\n\n**`get_architecture_overview`**\n```\ndirectory  string   Absolute path to the project directory\n```\n\n**`get_function_skeleton`**\n```\ndirectory  string   Absolute path to the project directory\nfilePath   string   Path to the file, relative to the project directory\n```\n\n**`get_function_body`**\n```\ndirectory     string   Absolute path to the project directory\nfilePath      string   Path to the file, relative to the project directory\nfunctionName  string   Name of the function to extract\n```\n\n**`get_file_dependencies`**\n```\ndirectory  string   Absolute path to the project directory\nfilePath   string   Path to the file, relative to the project directory\ndirection  string   \"imports\" | \"importedBy\" | \"both\"  (default: \"both\")\n```\n\n**`trace_execution_path`**\n```\ndirectory       string   Absolute path to the project directory\nentryFunction   string   Name of the starting function (case-insensitive partial match)\ntargetFunction  string   Name of the target function (case-insensitive partial match)\nmaxDepth        number   Maximum path length in hops (default: 6)\nmaxPaths        number   Maximum number of paths to return (default: 10, max: 50)\n```\n\n**`get_decisions`**\n```\ndirectory  string   Absolute path to the project directory\nquery      string   Optional keyword to filter ADRs by title or content\n```\n\n**`get_spec`**\n```\ndirectory  string   Absolute path to the project directory\ndomain     
string   Domain name (e.g. \"auth\", \"user\", \"api\")\n```\n\n**`get_god_functions`**\n```\ndirectory        string   Absolute path to the project directory\nfilePath         string   Optional: restrict search to this file (relative path)\nfanOutThreshold  number   Minimum fan-out to be considered a god function (default: 8)\n```\n\n**`suggest_insertion_points`**\n```\ndirectory    string   Absolute path to the project directory\ndescription  string   Natural-language description of the feature to implement\nlimit        number   Max candidates to return (default: 5)\nlanguage     string   Filter by language: \"TypeScript\" | \"Python\" | \"Go\" | ...\n```\n\n**`search_code`**\n```\ndirectory  string   Absolute path to the project directory\nquery      string   Natural-language query, e.g. \"authenticate user with JWT\"\nlimit      number   Max results (default: 10)\nlanguage   string   Filter by language: \"TypeScript\" | \"Python\" | \"Go\" | ...\nminFanIn   number   Only return functions with at least this many callers\n```\n\n**`search_specs`**\n```\ndirectory  string   Absolute path to the project directory\nquery      string   Natural language query, e.g. \"email validation workflow\"\nlimit      number   Maximum number of results to return (default: 10)\ndomain     string   Filter by domain name (e.g. \"auth\", \"analyzer\")\nsection    string   Filter by section type: \"requirements\" | \"purpose\" | \"design\" | \"architecture\" | \"entities\"\n```\n\n**`generate_change_proposal`**\n```\ndirectory     string   Absolute path to the project directory\ndescription   string   Natural-language description of the change (story, intent, or spec delta)\nslug          string   URL-safe identifier for the proposal (e.g. 
\"add-payment-retry\")\nstoryContent  string   Optional full story markdown to embed in the proposal\n```\n\n**`annotate_story`**\n```\ndirectory      string   Absolute path to the project directory\nstoryFilePath  string   Path to the story file (relative to project root or absolute)\ndescription    string   Natural-language summary of the story for structural analysis\n```\n\n### Typical workflow\n\n**Scenario A -- Initial exploration**\n```\n1. analyze_codebase({ directory })                    # repo structure + call graph + top issues\n2. get_call_graph({ directory })                      # hub functions + layer violations\n3. get_duplicate_report({ directory })                # clone groups to consolidate\n4. get_refactor_report({ directory })                 # prioritized refactoring candidates\n```\n\n**Scenario B -- Targeted refactoring**\n```\n1. analyze_impact({ directory, symbol: \"myFunction\" })       # risk score + blast radius + strategy\n2. get_subgraph({ directory, functionName: \"myFunction\",     # Mermaid call neighbourhood\n                  direction: \"both\", format: \"mermaid\" })\n3. get_low_risk_refactor_candidates({ directory,             # safe entry points to extract first\n                                      filePattern: \"myFile\" })\n4. get_leaf_functions({ directory, filePattern: \"myFile\" })  # zero-risk extraction targets\n```\n\n**Scenario C -- Spec maintenance**\n```\n1. check_spec_drift({ directory })                    # code changes not reflected in specs\n2. get_mapping({ directory, orphansOnly: true })      # functions with no spec coverage\n```\n\n**Scenario D -- Starting a new task (fastest orientation)**\n```\n1. 
orient({ directory, task: \"add rate limiting to the API\" })\n   # Returns in one call:\n   #   - relevant functions (semantic search or BM25 fallback)\n   #   - source files and spec domains that cover them\n   #   - call-graph neighbourhood for each top function\n   #   - best insertion-point candidates\n   #   - spec-linked peer functions (cross-graph traversal)\n   #   - matching spec sections\n2. get_spec({ directory, domain: \"...\" })             # read full spec before writing code\n3. check_spec_drift({ directory })                    # verify after implementation\n```\n\n**Scenario E -- Agentic workflow (risk-gated implementation)**\n```\n1. generate_change_proposal({ directory, description, slug })  # proposal.md with risk + domains\n2. annotate_story({ directory, storyFilePath, description })   # patch story with risk_context\n3. orient({ directory, task: \"...\" })                          # scope functions + insertion points\n4. analyze_impact({ directory, symbol: \"topFunction\" })        # confirm risk before coding\n5. 
check_spec_drift({ directory })                             # verify after implementation\n```\n\nSee [docs/agentic-workflows/](docs/agentic-workflows/) for full integration patterns with BMAD, Mistral Vibe, spec-kit, and Claude Code.\n\n## Interactive Graph Viewer\n\n`spec-gen view` launches a local React app that visualises your codebase analysis and lets you explore spec requirements side-by-side with the dependency graph.\n\n```bash\n# Run analysis first (if not already done)\nspec-gen analyze\n\n# Launch the viewer (opens browser automatically)\nspec-gen view\n\n# Options\nspec-gen view --port 4000          # custom port (default: 5173)\nspec-gen view --host 0.0.0.0       # expose on LAN\nspec-gen view --no-open            # don't open browser automatically\nspec-gen view --analysis \u003cpath\u003e    # custom analysis dir (default: .spec-gen/analysis/)\nspec-gen view --spec \u003cpath\u003e        # custom spec dir (default: ./openspec/specs/)\n```\n\n### Views\n\n| View | Description |\n|------|-------------|\n| **Clusters** | Colour-coded architectural clusters with expandable member nodes. Falls back to directory clusters for languages without import edges (Swift, C++) |\n| **Flat** | Force-directed dependency graph (all nodes). Import edges are solid; call edges (Swift/C++ synthesised, or HTTP cross-language) are cyan dashed |\n| **Classes** | Class/struct inheritance and call graph. Nodes coloured by language or connected component; isolated nodes hidden. Component-aware force layout keeps related classes together |\n| **Architecture** | High-level cluster map: role-coloured boxes, inter-cluster dependency arrows |\n\n### Diagram Chat\n\nThe right sidebar includes a **Diagram Chat** panel powered by an LLM agent. 
The chat can access all analysis tools and interact with the graph:\n\n- Ask questions about your codebase in natural language\n- Functions and requirements mentioned in answers are automatically highlighted in the graph\n- Clusters containing highlighted nodes auto-expand to reveal the nodes\n- Select a node in the chat to view its details tab\n\nExample queries:\n- \"What are the most critical functions?\"\n- \"Where would I add a new API endpoint?\"\n- \"Show me the impact of changing the authentication service\"\n\nThe chat requires an LLM API key (same provider configuration as `spec-gen generate`). Viewer-only operations like graph browsing, skeleton view, and search do not require an API key.\n\n### Right panel tabs (select a node to activate)\n\n| Tab | Content |\n|-----|---------|\n| **Node** | File metadata: exports, language, score |\n| **Links** | Direct callers and callees |\n| **Blast** | Downstream impact radius |\n| **Spec** | Requirements linked to the selected file -- body, domain, confidence |\n| **Skeleton** | Noise-stripped source: logs and comments removed, structure preserved |\n| **Info** | Global stats and top-ranked files |\n\n### Search\n\nThe search bar filters all three views simultaneously (text match on name, path, exports, tags). If a vector index was built with `--embed`, typing \u003e= 3 characters also queries the semantic index and shows the top 5 function matches in a dropdown.\n\n### Automatic data loading\n\nThe viewer auto-loads all available data on startup:\n\n| Endpoint | Source | Required? 
|\n|----------|--------|-----------|\n| `/api/dependency-graph` | `.spec-gen/analysis/dependency-graph.json` | Yes |\n| `/api/llm-context` | `.spec-gen/analysis/llm-context.json` | No |\n| `/api/refactor-priorities` | `.spec-gen/analysis/refactor-priorities.json` | No |\n| `/api/mapping` | `.spec-gen/analysis/mapping.json` | No |\n| `/api/spec-requirements` | `openspec/specs/**/*.md` + `mapping.json` | No |\n| `/api/skeleton?file=` | Source file on disk | No |\n| `/api/search?q=` | `.spec-gen/analysis/vector-index/` | No (`--embed`) |\n\nRun `spec-gen generate` to produce `mapping.json` and the spec files. Once present, the **Spec** tab shows the full requirement body for each selected file.\n\n### View Options\n\n```bash\nspec-gen view [options]\n  --analysis \u003cpath\u003e    Analysis directory (default: .spec-gen/analysis/)\n  --spec \u003cpath\u003e        Spec files directory (default: ./openspec/specs/)\n  --port \u003cn\u003e           Port (default: 5173)\n  --host \u003chost\u003e        Bind host (default: 127.0.0.1; use 0.0.0.0 for LAN)\n  --no-open            Skip automatic browser open\n```\n\n## Output\n\nspec-gen writes to the OpenSpec directory structure:\n\n```\nopenspec/\n  config.yaml                # Project metadata\n  specs/\n    overview/spec.md         # System overview\n    architecture/spec.md     # Architecture\n    auth/spec.md             # Domain: Authentication\n    user/spec.md             # Domain: User management\n    api/spec.md              # API specification\n  decisions/                 # With --adr flag\n    index.md                 # ADR index\n    adr-0001-*.md            # Individual decisions\n```\n\nEach spec uses RFC 2119 keywords (SHALL, MUST, SHOULD), Given/When/Then scenarios, and technical notes linking to implementation files.\n\n### Analysis Artifacts\n\nStatic analysis output is stored in `.spec-gen/analysis/`:\n\n| File | Description |\n|------|-------------|\n| `repo-structure.json` | Project structure and 
metadata |\n| `dependency-graph.json` | Import/export relationships, HTTP cross-language edges (JS/TS → Python), and synthesised call edges for Swift/C++ |\n| `llm-context.json` | Context prepared for LLM (signatures, call graph) |\n| `dependencies.mermaid` | Visual dependency graph |\n| `SUMMARY.md` | Human-readable analysis summary |\n| `call-graph.json` | Function-level call graph (8 languages: TS/JS, Python, Go, Rust, Ruby, Java, C++, Swift) |\n| `refactor-priorities.json` | Refactoring issues by file and function |\n| `mapping.json` | Requirement-\u003efunction mapping (produced by `generate`) |\n| `vector-index/` | LanceDB semantic index (produced by `--embed`) |\n\n`spec-gen analyze` also writes **`ARCHITECTURE.md`** to your project root -- a Markdown overview of module clusters, entry points, and critical hubs, refreshed on every run.\n\n## Semantic Search \u0026 GraphRAG\n\n`spec-gen analyze` builds a vector index over all functions in the call graph, enabling natural-language search via the `search_code`, `orient`, and `suggest_insertion_points` MCP tools, and the search bar in the viewer.\n\n### GraphRAG retrieval expansion\n\nSemantic search is only the starting point. spec-gen combines three retrieval layers into every search result — this is what makes it genuinely useful for AI agents navigating unfamiliar codebases:\n\n1. **Semantic seed** — dense vector search (or BM25 keyword fallback) finds the top-N functions closest in meaning to the query.\n2. **Call-graph expansion** — BFS up to depth 2 follows callee edges from every seed function, pulling in the files those functions depend on. During `generate`, this ensures the LLM sees the full call neighbourhood, not just the most obvious files.\n3. **Spec-linked peer functions** — each seed function's spec domain is looked up in the requirement→function mapping. Functions from the same spec domain that live in *different files* are surfaced as `specLinkedFunctions`. 
This crosses the call-graph boundary: implementations that share a spec requirement but are not directly connected by calls are retrieved automatically.\n\nThe result: a single `orient` or `search_code` call returns not just \"functions that mention this concept\" but the interconnected cluster of code and specs that collectively implement it. Agents spend less time chasing cross-file references manually and more time making changes with confidence.\n\n### Embedding configuration\n\nProvide an OpenAI-compatible embedding endpoint (Ollama, OpenAI, Mistral, etc.) via environment variables or `.spec-gen/config.json`:\n\n**Environment variables:**\n```bash\nEMBED_BASE_URL=https://api.openai.com/v1\nEMBED_MODEL=text-embedding-3-small\nEMBED_API_KEY=sk-...         # optional for local servers\n\n# Then run (embedding is automatic when configured):\nspec-gen analyze\n```\n\n**Config file (`.spec-gen/config.json`):**\n```json\n{\n  \"embedding\": {\n    \"baseUrl\": \"http://localhost:11434/v1\",\n    \"model\": \"nomic-embed-text\",\n    \"batchSize\": 64\n  }\n}\n```\n\n- `batchSize`: Number of texts to embed per API call (default: 64)\n\nThe index is stored in `.spec-gen/analysis/vector-index/` and is automatically used by the viewer's search bar and the `search_code` / `suggest_insertion_points` MCP tools.\n\n## Configuration\n\n`spec-gen init` creates `.spec-gen/config.json`:\n\n```json\n{\n  \"version\": \"1.0.0\",\n  \"projectType\": \"nodejs\",\n  \"openspecPath\": \"./openspec\",\n  \"analysis\": {\n    \"maxFiles\": 500,\n    \"includePatterns\": [],\n    \"excludePatterns\": []\n  },\n  \"generation\": {\n    \"model\": \"claude-sonnet-4-20250514\",\n    \"domains\": \"auto\"\n  }\n}\n```\n\n### Environment Variables\n\n| Variable | Provider | Description |\n|----------|----------|-------------|\n| `ANTHROPIC_API_KEY` | `anthropic` | Anthropic API key |\n| `ANTHROPIC_API_BASE` | `anthropic` | Custom base URL (proxy / self-hosted) |\n| `OPENAI_API_KEY` | `openai` 
| OpenAI API key |\n| `OPENAI_API_BASE` | `openai` | Custom base URL (Azure, proxy...) |\n| `OPENAI_COMPAT_API_KEY` | `openai-compat` | API key for OpenAI-compatible server |\n| `OPENAI_COMPAT_BASE_URL` | `openai-compat` | Base URL, e.g. `https://api.mistral.ai/v1` |\n| `GEMINI_API_KEY` | `gemini` | Google Gemini API key |\n| `COPILOT_API_BASE_URL` | `copilot` | Base URL of the copilot-api proxy (default: `http://localhost:4141/v1`) |\n| `COPILOT_API_KEY` | `copilot` | API key if the proxy requires auth (default: `copilot`) |\n| `EMBED_BASE_URL` | embedding | Base URL for the embedding API (e.g. `http://localhost:11434/v1`) |\n| `EMBED_MODEL` | embedding | Embedding model name (e.g. `nomic-embed-text`) |\n| `EMBED_API_KEY` | embedding | API key for the embedding service (defaults to `OPENAI_API_KEY`) |\n| `DEBUG` | -- | Enable stack traces on errors |\n| `CI` | -- | Auto-detected; enables timestamps in output |\n\n## Requirements\n\n- Node.js 20+\n- API key for `generate`, `verify`, and `drift --use-llm` -- set the env var for your chosen provider:\n  ```bash\n  export ANTHROPIC_API_KEY=sk-ant-...       # Anthropic (default)\n  export OPENAI_API_KEY=sk-...              # OpenAI\n  export OPENAI_COMPAT_API_KEY=ollama       # OpenAI-compatible local server\n  export GEMINI_API_KEY=...                 
# Google Gemini\n  ```\n  Or use a CLI-based provider (`claude-code`, `gemini-cli`, `mistral-vibe`) which requires no API key — just the CLI tool on your PATH.\n- `analyze`, `drift`, and `init` require no API key\n\n## Supported Languages\n\n| Language | Signatures | Call Graph | Notes |\n|----------|-----------|------------|-------|\n| TypeScript / JavaScript | Full | Full | Best results — rich type info |\n| Python | Full | Full | `self`/`cls` intra-class + import resolution |\n| Go | Full | Full | |\n| Rust | Full | Full | |\n| Ruby | Full | Full | |\n| Java | Full | Full | |\n| C++ | Full | Full | No intra-module imports — cross-file edges synthesised from call graph |\n| Swift | Full | Full | No intra-module imports — cross-file edges synthesised from call graph |\n\n## Usage Options\n\n**CLI Tool** (recommended):\n```bash\nspec-gen init \u0026\u0026 spec-gen analyze \u0026\u0026 spec-gen generate \u0026\u0026 spec-gen drift --install-hook\n```\n\n**Claude Code Skill**: Copy `skills/claude-spec-gen.md` to `.claude/skills/` in your project.\n\n**OpenSpec Skill**: Copy `skills/openspec-skill.md` to your OpenSpec skills directory.\n\n**Direct LLM Prompting**: Use `AGENTS.md` as a system prompt for any LLM.\n\n**Programmatic API**: Import spec-gen as a library in your own tools.\n\n## Programmatic API\n\nspec-gen exposes a typed Node.js API for integration into other tools (like [OpenSpec CLI](https://github.com/Fission-AI/OpenSpec)). 
Every CLI command has a corresponding API function that returns structured results instead of printing to the console.\n\n```bash\nnpm install spec-gen-cli\n```\n\n```typescript\nimport { specGenAnalyze, specGenDrift, specGenRun } from 'spec-gen-cli';\n\n// Run the full pipeline\nconst result = await specGenRun({\n  rootPath: '/path/to/project',\n  adr: true,\n  onProgress: (event) =\u003e console.log(`[${event.phase}] ${event.step}`),\n});\nconsole.log(`Generated ${result.generation.report.filesWritten.length} specs`);\n\n// Check for drift\nconst drift = await specGenDrift({\n  rootPath: '/path/to/project',\n  failOn: 'warning',\n});\nif (drift.hasDrift) {\n  console.warn(`${drift.summary.total} drift issues found`);\n}\n\n// Static analysis only (no API key needed)\nconst analysis = await specGenAnalyze({\n  rootPath: '/path/to/project',\n  maxFiles: 1000,\n});\nconsole.log(`Analyzed ${analysis.repoMap.summary.analyzedFiles} files`);\n```\n\n### API Functions\n\n| Function | Description | API Key |\n|----------|-------------|---------|\n| `specGenInit(options?)` | Initialize config and openspec directory | No |\n| `specGenAnalyze(options?)` | Run static analysis | No |\n| `specGenGenerate(options?)` | Generate specs from analysis | Yes |\n| `specGenVerify(options?)` | Verify spec accuracy | Yes |\n| `specGenDrift(options?)` | Detect spec-to-code drift | No* |\n| `specGenRun(options?)` | Full pipeline: init + analyze + generate | Yes |\n| `specGenGetSpecRequirements(options?)` | Read requirement blocks from generated specs | No |\n\n\\* `specGenDrift` requires an API key only when `llmEnhanced: true`.\n\nAll functions accept an optional `onProgress` callback for status updates and throw errors instead of calling `process.exit`. See [src/api/types.ts](src/api/types.ts) for full option and result type definitions.\n\n### Error handling\n\nAll API functions throw `Error` on failure. 
Wrap calls in try-catch for production use:\n\n```typescript\nimport { specGenRun } from 'spec-gen-cli';\n\ntry {\n  const result = await specGenRun({ rootPath: '/path/to/project' });\n  console.log(`Done — ${result.generation.report.filesWritten.length} specs written`);\n} catch (err) {\n  if ((err as Error).message.includes('API key')) {\n    console.error('Set ANTHROPIC_API_KEY or OPENAI_API_KEY');\n  } else {\n    console.error('spec-gen failed:', (err as Error).message);\n  }\n}\n```\n\n### Reading generated spec requirements\n\nAfter running `specGenGenerate`, you can programmatically query the requirement-to-function mapping:\n\n```typescript\nimport { specGenGetSpecRequirements } from 'spec-gen-cli';\n\nconst { requirements } = await specGenGetSpecRequirements({ rootPath: '/path/to/project' });\nfor (const [key, req] of Object.entries(requirements)) {\n  console.log(`${key}: ${req.title} (${req.specFile})`);\n}\n```\n\n## Examples\n\n| Example | Description |\n|---------|-------------|\n| [examples/openspec-analysis/](examples/openspec-analysis/) | Static analysis output from `spec-gen analyze` |\n| [examples/openspec-cli/](examples/openspec-cli/) | Specifications generated with `spec-gen generate` |\n| [examples/drift-demo/](examples/drift-demo/) | Sample project configured for drift detection |\n| [examples/bmad/](examples/bmad/) | BMAD Method integration — agents, tasks, templates for brownfield codebases |\n| [examples/mistral-vibe/](examples/mistral-vibe/) | Mistral Vibe skills for analyze, generate, plan-refactor, implement-story |\n| [examples/spec-kit/](examples/spec-kit/) | spec-kit extension with `before_implement` / `after_implement` hooks |\n| [examples/gsd/](examples/gsd/) | Get-Shit-Done Claude Code commands for orient + drift |\n\n## Development\n\n```bash\nnpm install          # Install dependencies\nnpm run dev          # Development mode (watch)\nnpm run build        # Build\nnpm run test:run     # Run tests (2200+ unit tests)\nnpm run 
typecheck    # Type check\n```\n\n2200+ unit tests covering static analysis, call graph (including Swift), refactor analysis, spec mapping, drift detection, LLM enhancement, ADR generation, MCP handlers, change proposals, cross-file edge synthesis, and the full CLI.\n\n## Links\n\n- [OpenSpec](https://github.com/Fission-AI/OpenSpec) - Spec-driven development framework\n- [Architecture](docs/ARCHITECTURE.md) - Internal design and module organization\n- [Algorithms](docs/ALGORITHMS.md) - Analysis algorithms\n- [OpenSpec Integration](docs/OPENSPEC-INTEGRATION.md) - How spec-gen integrates with OpenSpec\n- [OpenSpec Format](docs/OPENSPEC-FORMAT.md) - Spec format reference\n- [Philosophy](docs/PHILOSOPHY.md) - \"Archaeology over Creativity\"\n- [Agentic Workflows](docs/agentic-workflows/) - Integration patterns for BMAD, Mistral Vibe, spec-kit, GSD\n- [Troubleshooting](docs/TROUBLESHOOTING.md) - Common issues and solutions\n- [AGENTS.md](AGENTS.md) - LLM system prompt for direct prompting\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fclay-good%2Fspec-gen","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fclay-good%2Fspec-gen","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fclay-good%2Fspec-gen/lists"}