https://github.com/hec-ovi/research-skill
Persistent project-scoped knowledge base for Claude Code and other SKILL.md-compatible code CLIs: deep research findings with progressive disclosure and contrarian-pass investigation. Installs via git clone, npx skills add, or /plugin install
agent-skill agent-skills agentskills claude-code claude-skill deep-research knowledge-base memory npx-skills progressive-disclosure research skill-md subagent
Last synced: about 2 months ago
- Host: GitHub
- URL: https://github.com/hec-ovi/research-skill
- Owner: hec-ovi
- License: mit
- Created: 2026-04-25T18:47:26.000Z (about 2 months ago)
- Default Branch: main
- Last Pushed: 2026-04-25T20:09:02.000Z (about 2 months ago)
- Last Synced: 2026-04-25T20:32:53.147Z (about 2 months ago)
- Topics: agent-skill, agent-skills, agentskills, claude-code, claude-skill, deep-research, knowledge-base, memory, npx-skills, progressive-disclosure, research, skill-md, subagent
- Size: 28.3 KB
- Stars: 1
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
research-skill
Persistent project-scoped store for deep research findings, with progressive disclosure and contrarian-pass investigation.
---
## What this is
A Claude Code skill that gives you a persistent, project-scoped store for deep research findings.
Stop re-researching the same topics across sessions. Stop polluting conversation context with raw web search dumps. The skill maintains a structured local knowledge base under `.research/` at the project root, looks it up before fetching the web, and uses progressive disclosure to load only what's actually needed.
---
## Built for compaction and large-research recall
Long Claude Code sessions run out of context. `/compact` summarizes older turns and drops the rest, so findings from a deep research thread evaporate and the next question re-triggers the same web searches.
This skill makes the data layer outlive the chat. Research written today survives `/compact`, `/clear`, IDE restarts, and machine moves. The next session reads `INDEX.md` first (a tiny dispatcher), matches the topic, and pulls only the matched entry's `## Summary` section into context. The full body stays on disk until you actually need it.
Loading tiers, cheapest first:
| Tier | Loads | Approx tokens | When |
|---|---|---|---|
| 1 | `INDEX.md` | 100 to 500 | Every retrieval |
| 2 | Entry's `## Summary` only | 50 to 200 | When INDEX shows a match |
| 3 | Full `FINDINGS.md` body | 500 to 3000 | When the summary doesn't cover it |
Heavy research artifacts become cheap to recall: you only pay for the tier you need.
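The tier order above can be sketched in a few lines of Python. This is an illustration, not the skill's actual implementation (the skill is a set of instructions the agent follows, not a library). The function names `extract_section` and `lookup` are hypothetical, and the sketch assumes that `INDEX.md` lists each topic by its directory name.

```python
from pathlib import Path
from typing import Optional


def extract_section(markdown: str, heading: str) -> str:
    """Return the text under a level-2 heading, up to the next level-2 heading."""
    body, inside = [], False
    for line in markdown.splitlines():
        if line.startswith("## "):
            if inside:
                break  # reached the next section
            inside = line[3:].strip() == heading
            continue
        if inside:
            body.append(line)
    return "\n".join(body).strip()


def lookup(store: Path, topic: str, need_full: bool = False) -> Optional[str]:
    """Walk the tiers cheapest first and return only what the caller needs."""
    index = (store / "INDEX.md").read_text(encoding="utf-8")  # tier 1
    if topic not in index:  # assumption: INDEX.md names each topic directory
        return None  # no match: nothing else is loaded
    entry = (store / topic / "FINDINGS.md").read_text(encoding="utf-8")
    if need_full:
        return entry  # tier 3: full body
    return extract_section(entry, "Summary")  # tier 2: summary only
```

The tiers describe what enters the agent's context, not what is read from disk: the sketch reads the whole file but hands back only the summary unless the full body is requested.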
---
## What's distinctive
- **Project-scoped, not global.** Each repo has its own research store, kept private (gitignored by default).
- **Progressive disclosure.** Index, then summary, then full body, in that order. Most lookups never load the full entry.
- **Conflict-handling history.** When findings change, old claims move to a `## Discarded approaches` table with the reason they were dropped; they are never silently overwritten. This keeps later sessions from re-trying refuted approaches.
- **Subagent-isolated investigation.** Heavy WebSearch / WebFetch traffic runs in a separate `general-purpose` subagent (Opus 4.7 by default). Your main context stays clean.
- **Async, non-blocking.** The investigation subagent runs in background mode (`run_in_background: true`); your conversation with Claude Code stays interactive while research happens. Findings save and announce themselves on the completion notification. No frozen CLI.
- **Cognitive phases.** Decompose, Gather, Validate, **Contrarian pass**, Synthesize. The contrarian pass actively searches for "why this is wrong" rather than confirming. It earns its keep.
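The conflict-handling rule in the list above can be pictured as a small text transformation. The sketch below is illustrative only: `discard_claim` is a hypothetical helper, and the two-column table layout (`Claim`, `Reason`) is an assumption, since the real columns are defined in `SKILL.md`.

```python
def discard_claim(findings: str, claim: str, reason: str) -> str:
    """Record a superseded claim in the Discarded approaches table."""
    heading = "## Discarded approaches"
    header = ["| Claim | Reason |", "|---|---|"]  # assumed column layout
    row = f"| {claim} | {reason} |"  # note: a literal "|" in claim would need escaping
    lines = findings.splitlines()

    if heading not in lines:
        # No such section yet: append it with a fresh table.
        out = lines + ["", heading, *header, row]
        return "\n".join(out).rstrip("\n") + "\n"

    start = lines.index(heading)
    # The section runs until the next level-2 heading or the end of the file.
    end = next(
        (i for i in range(start + 1, len(lines)) if lines[i].startswith("## ")),
        len(lines),
    )
    section = [line for line in lines[start + 1:end] if line.strip()]
    if not any(line.startswith("|") for line in section):
        section += header  # section exists but has no table yet
    section.append(row)
    out = lines[:start + 1] + section + [""] + lines[end:]
    return "\n".join(out).rstrip("\n") + "\n"
```

The point of the design is that the old claim and its refutation stay visible next to the current findings, so a later session can see what was already tried.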
---
## How findings reach your conversation
When the Investigation subagent finishes, its full structured return (Summary, Findings, contrarian objection, sources) is injected into the main agent's context as a task-notification message. No file round-trip, no tail-the-log polling. The main agent parses the return directly and writes the data layer.
Why this matters:
- **No raw web-search dump pollution.** The main agent sees only the subagent's clean synthesized output, never the raw web search results or fetch responses. Those live in a separate transcript file the main agent is forbidden to read.
- **Storage is deterministic.** The required output format maps 1:1 to the FINDINGS.md schema. Parsing is mechanical, not interpretive.
- **Conversation stays interactive.** The subagent runs in background mode (`run_in_background: true`), so you keep working while it researches; the structured output arrives as a notification when the agent completes.
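To illustrate what "mechanical, not interpretive" parsing could look like, here is a minimal sketch that splits a structured return into sections and reports any that are missing or empty. The section names in `REQUIRED` are hypothetical stand-ins based on the list above (the real required format lives in `SKILL.md`), as are the function names.

```python
from typing import Dict, List

# Hypothetical section names; the real required format is defined in SKILL.md.
REQUIRED = ("Summary", "Findings", "Contrarian objection", "Sources")


def split_sections(report: str) -> Dict[str, str]:
    """Map each level-2 heading in the report to the text beneath it."""
    sections: Dict[str, List[str]] = {}
    current = None
    for line in report.splitlines():
        if line.startswith("## "):
            current = line[3:].strip()
            sections[current] = []
        elif current is not None:
            sections[current].append(line)
    return {name: "\n".join(body).strip() for name, body in sections.items()}


def missing_sections(report: str) -> List[str]:
    """List required sections that are absent or empty."""
    found = split_sections(report)
    return [name for name in REQUIRED if not found.get(name)]
```

A check like this would let the storing step refuse a malformed return instead of guessing at its structure.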
---
## Install
Three install routes, all global. No registration, approval, or login required.
### 1. `npx skills add` (cross-tool, any code CLI that implements the open SKILL.md format)
```bash
npx skills add hec-ovi/research-skill
```
### 2. Claude Code plugin marketplace
```
/plugin marketplace add hec-ovi/research-skill
/plugin install research@research-skill
```
This uses Claude Code's built-in marketplace mechanism to install the plugin from the maintainer's GitHub repo. It is not Anthropic's first-party catalog.
### 3. Direct git clone (simplest, works anywhere)
```bash
# Personal (across all your projects)
git clone https://github.com/hec-ovi/research-skill ~/.claude/skills/research
# Or project-only (run from the project root)
git clone https://github.com/hec-ovi/research-skill .claude/skills/research
```
Claude Code picks up new skills live, no restart needed.
---
## Usage
The skill auto-activates when you ask a research-style question. You can also invoke it explicitly:
```
/research
```
Examples:
- *"What's the latest TypeScript ORM for edge runtime in 2026?"*
- *"Compare Bun vs Node cold-starts for serverless"*
- *"/research drizzle-type-generation"*
---
## Data layout
The skill writes to your project, not your home dir:
```
.research/
├── INDEX.md          # dispatcher: topic table, scanned first
└── <topic>/          # one directory per research topic (placeholder name)
    ├── FINDINGS.md   # entry: frontmatter + summary + findings + history
    └── raw/          # optional: pasted PDFs, whitepapers, etc.
```
`INDEX.md` is the dispatcher, equivalent to `RESOLVER.md` in the GBrain pattern. The agent reads it first, then loads only the matched entry's `## Summary` section. Full entries and raw documents only load on demand.
`.research/` is deliberately kept outside `.claude/` to dodge Claude Code's hard-coded sensitive-path guard, which prompts on every read or write to anything under `.claude/` regardless of `permissions.allow` settings.
---
## When NOT to use
- Plan-stage notes
- Small facts or one-line preferences
- Code-level decisions tied to one file
- Casual lookups answerable from a single source
- A substitute for a single WebSearch / WebFetch
If the question fits in one search plus 1 to 2 sentences, you don't need this skill.
---
## Influences and citations
Built explicitly on three open patterns; credit where due.
### Agent Skills specification
Frontmatter and folder layout follow the open [Agent Skills specification](https://agentskills.io/specification) (Apache 2.0 / CC-BY-4.0). Portable across Claude Code and any other code CLI that implements the SKILL.md format.
### Grok deep-research multi-agent pattern (xAI)
The Investigation phase walks a 5-step cognitive workflow (Decompose, Gather, Validate, Contrarian pass, Synthesize) adapted from xAI's published [Multi-Agent architecture](https://docs.x.ai/developers/model-capabilities/text/multi-agent) and the [DeepSearch announcement](https://x.ai/news/grok-3). xAI ships 4 specialized agents (Captain, Harper, Benjamin, Lucas) on a shared backbone; this skill condenses those into cognitive phases a single subagent walks, since the Claude Code harness does not currently expose subagent continuation (`SendMessage` unavailable as of April 2026).
The Contrarian pass (phase 4) is the standout borrowed element: actively searching for "why this is wrong" rather than confirming. In an A/B test on a celebrity-fronted AI tool legitimacy question, the contrarian pass surfaced significant controversy that a minimal-brief baseline missed.
### GBrain RESOLVER pattern (Garry Tan)
`INDEX.md` acts as a dispatcher in the same role as [`RESOLVER.md`](https://github.com/garrytan/gbrain/blob/master/skills/RESOLVER.md) in [GBrain](https://github.com/garrytan/gbrain). The INDEX is scanned first; full entries load only on match. Progressive disclosure tiers borrow GBrain's "thin harness, fat skills" philosophy ([`THIN_HARNESS_FAT_SKILLS.md`](https://github.com/garrytan/gbrain/blob/master/docs/ethos/THIN_HARNESS_FAT_SKILLS.md)).
---
## Schema
Every entry's `FINDINGS.md` has structured frontmatter (`topic`, `created`, `last_verified`, `status`, `related`, `sources`, `raw`) and a body with `## Summary`, `## Findings`, `## Discarded approaches`, `## Open questions`, `## Timeline`. See [`SKILL.md`](SKILL.md) for the full schema and rules.
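As a rough picture of that layout, the sketch below generates an empty entry skeleton from the field and section names listed above. The value formats are assumptions (ISO dates, a `draft` status, empty inline lists); `SKILL.md` remains the authoritative schema, and `new_entry` is a hypothetical helper.

```python
from datetime import date

FRONTMATTER_FIELDS = ("topic", "created", "last_verified", "status",
                      "related", "sources", "raw")
BODY_SECTIONS = ("Summary", "Findings", "Discarded approaches",
                 "Open questions", "Timeline")


def new_entry(topic: str, today: date) -> str:
    """Build an empty FINDINGS.md skeleton for a new topic."""
    values = {
        "topic": topic,
        "created": today.isoformat(),        # assumed date format
        "last_verified": today.isoformat(),
        "status": "draft",                   # hypothetical status value
        "related": "[]",
        "sources": "[]",
        "raw": "[]",
    }
    front = ["---"] + [f"{key}: {values[key]}" for key in FRONTMATTER_FIELDS] + ["---"]
    body = []
    for section in BODY_SECTIONS:
        body += ["", f"## {section}"]
    return "\n".join(front + body) + "\n"
```

Calling `new_entry("drizzle-type-generation", date.today())` would return frontmatter followed by the five empty body headings, in the order the schema lists them.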
---
## Requirements
- A code CLI that implements the [SKILL.md format](https://agentskills.io/specification) (Claude Code, or any other compatible client)
- For the Investigation phase: an Opus-class model accessible to the spawning agent (the skill defaults to spawning subagents at `model: "opus"`)
---
## Recommended setup
### Pin subagents to Opus
The Investigation phase needs reasoning depth. The skill spawns subagents with `model: "opus"`, but the calling agent has to actually pass that parameter on every spawn. To make it systematic across all your Claude Code sessions, configure your environment to default subagents to Opus.
Two practical approaches:
- **Hook (strongest)**: add a `PreToolUse` hook on the `Agent` tool in `~/.claude/settings.json` that blocks any spawn whose `model` field is not `opus`. The hook runs before the tool dispatches, so a non-opus spawn never reaches the API.
- **Convention (lightest)**: add a one-line note to your `~/.claude/CLAUDE.md`: *"Every Agent tool call MUST pass `model: \"opus\"`."* Claude reads CLAUDE.md every session.
Smaller models work fine for the main conversation. The contrarian pass and synthesis steps in Investigation specifically depend on Opus-class reasoning depth; smaller models tend to skip the contrarian phase or produce shallow syntheses.
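A hook script for the first approach might look like the sketch below. It assumes the hook receives the pending tool call as JSON on stdin with `tool_name` and `tool_input` fields, and that exiting with code 2 blocks the call and returns stderr to the model; verify both against the current Claude Code hooks documentation before relying on it. The tool name `Agent` follows this README, and `check_spawn` is a hypothetical helper.

```python
import json
import sys

REQUIRED_MODEL = "opus"


def check_spawn(payload: dict) -> tuple:
    """Return (exit_code, message) for one PreToolUse payload."""
    if payload.get("tool_name") != "Agent":  # tool name as given in this README
        return 0, ""  # not a subagent spawn: allow
    model = payload.get("tool_input", {}).get("model")
    if model == REQUIRED_MODEL:
        return 0, ""
    # Assumption: exit code 2 blocks the call and surfaces stderr to the model.
    return 2, f'Blocked: subagent spawn requested model {model!r}; pass model: "{REQUIRED_MODEL}".'


def main() -> int:
    """Read the hook payload from stdin and report the decision."""
    code, message = check_spawn(json.load(sys.stdin))
    if message:
        print(message, file=sys.stderr)
    return code

# In the actual hook script, finish with: sys.exit(main())
```

Keeping the decision in `check_spawn` makes the rule easy to test without piping JSON through stdin.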
---
## Roadmap
### Current activation footprint: ~5,500 tokens, on the heavier side
When the skill activates, the full `SKILL.md` body loads into the main agent's context. As of v0.2.2 the activation cost is approximately 4,500 to 5,500 tokens (depending on tokenizer). The skill registration metadata (frontmatter only, always loaded) is a separate ~130 tokens.
Comparison points:
- Most Vercel `skills.sh` reference skills sit at 500 to 2,000 tokens
- Anthropic reference skills typically run 1,500 to 3,000 tokens
- At ~4,500 to 5,500 tokens, this skill is roughly double the typical Anthropic reference skill
The reason is that the Investigation phase is a substantive procedure (5 cognitive phases verbatim, brief checklist, citation rules, required output format, gap handling) and the Storage phase has its own validation rules. The didactic content is real; it earns its keep when Investigation actually runs. But on Retrieval-only calls (the most common path), the agent loads all of it just to get to the loading-hierarchy and lookup-procedure sections.
### Planned: thin-dispatcher refactor (deferred)
The clean architectural answer is to apply progressive disclosure recursively, the same pattern the skill already applies to research data. Specifically: split the single `SKILL.md` into a thin dispatcher plus phase-specific procedure files that load only when their phase is active.
Target structure:
```
research/
├── SKILL.md              # thin dispatcher: when, where, which procedure to load
└── procedures/
    ├── retrieval.md      # loading hierarchy, lookup procedure, INDEX patterns
    ├── investigation.md  # cognitive phases, brief checklist, citation rules, output format
    └── storage.md        # FINDINGS schema, Review-before-storing, conflict handling
```
How it would work mechanically:
1. The agent activates the skill and reads `SKILL.md` (cheap, ~1,500 to 2,000 tokens).
2. `SKILL.md` names which procedure file to load for each phase: *"For Retrieval, read `procedures/retrieval.md`. For Investigation, read `procedures/investigation.md` AFTER deciding mode in Retrieval phase 5. For Storage, read `procedures/storage.md` after Investigation returns."*
3. The agent uses the `Read` tool to pull only the procedure file relevant to the current phase. A pure Retrieval call (read INDEX, extract the `## Summary` section with `sed`, answer) never touches `investigation.md` and never pays for it.
This is the same *"thin harness, fat skills"* pattern GBrain uses (`RESOLVER.md` as the dispatcher, individual skill files loaded on demand). Applied here, it's a natural fit because the three phases (Retrieval, Investigation, Storage) are already cleanly separated in the workflow, and the heaviest procedure (Investigation) is also the least-frequent path. Most calls are Retrieval-only.
Expected post-refactor footprint:
- `SKILL.md` (always loaded on activation): ~1,500 to 2,000 tokens
- `procedures/retrieval.md` (loaded on every research-style question): ~800 to 1,200 tokens
- `procedures/investigation.md` (loaded only when fresh research is needed): ~1,500 to 2,000 tokens
- `procedures/storage.md` (loaded only when actually writing): ~800 to 1,200 tokens
A typical Retrieval-only call would pay ~2,300 to 3,200 tokens (SKILL.md + retrieval.md), down from the current ~4,500 to 5,500. An Investigation call would pay roughly the same as today (~4,600 to 6,400 tokens with all three procedure files loaded), but at least the cost would be honest: you pay for what you use.
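The per-call figures follow from summing the per-file estimates listed above. A small sketch of that arithmetic, with a hypothetical phase-to-file mapping mirroring the target structure:

```python
# (low, high) token estimates per file, taken from the list above.
FOOTPRINT = {
    "SKILL.md": (1500, 2000),
    "procedures/retrieval.md": (800, 1200),
    "procedures/investigation.md": (1500, 2000),
    "procedures/storage.md": (800, 1200),
}

# Which procedure file each phase would load (mirrors the target structure).
PHASE_FILE = {
    "retrieval": "procedures/retrieval.md",
    "investigation": "procedures/investigation.md",
    "storage": "procedures/storage.md",
}


def call_cost(phases):
    """Sum the low and high estimates for SKILL.md plus each active phase."""
    files = ["SKILL.md"] + [PHASE_FILE[phase] for phase in phases]
    low = sum(FOOTPRINT[name][0] for name in files)
    high = sum(FOOTPRINT[name][1] for name in files)
    return low, high


print(call_cost(["retrieval"]))                              # (2300, 3200)
print(call_cost(["retrieval", "investigation", "storage"]))  # (4600, 6400)
```

The second line shows why a full Investigation call lands close to today's single-file cost: all four files end up loaded.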
### Why not yet
The skill is working, the size is heavy but not blocking, and no user has asked for the change yet. The repo launched on 2026-04-25; the only confirmed user is the maintainer, who has not hit context-budget pressure on this skill in real workflows. The refactor is a real restructure:
- Probably half a day of careful editing
- End-to-end testing on every phase (Retrieval new entry, Retrieval merge, Investigation new entry mode, Investigation merge mode, Storage paste path)
- Updating the install routes: the symlink at `skills/research/SKILL.md` would need to extend to the procedures folder, and the `marketplace.json` plugin descriptor would need to be checked to confirm that subdirectory loading is honored
- Rewriting CHANGELOG and bumping to a minor version (likely `v0.3.0`)
The refactor will be triggered when any of these signals lands:
- A user reports the activation cost as a real friction in their context budget
- A new feature pushes `SKILL.md` past 6,000 tokens
- The procedure content grows organically to the point where the dispatcher spine already feels redundant
- Someone files an issue or PR proposing the split
Until one of those triggers, the single-file structure is the right call: everything is in one place, the file is readable end-to-end, and the load-bearing optimization (progressive disclosure of the actual research data via INDEX, then Summary, then full body) already works as designed. Optimizing the SKILL.md itself before there is felt friction is premature engineering.
If you want to discuss the refactor or volunteer feedback on activation cost in your own usage, open an issue at [github.com/hec-ovi/research-skill/issues](https://github.com/hec-ovi/research-skill/issues).
---
## License
[MIT](LICENSE). Free to use, modify, fork, distribute. Attribution appreciated, not required.