https://github.com/k-kolomeitsev/data-structure-protocol
Graph-based long-term memory skill for AI (LLM) coding agents — faster context, fewer tokens, safer refactors
agent-skills ai calude claude-code code-generation code-review codex codex-skills cursor cursor-ai gpt openai prompt-engineering skills structured-data
- Host: GitHub
- URL: https://github.com/k-kolomeitsev/data-structure-protocol
- Owner: k-kolomeitsev
- License: apache-2.0
- Created: 2026-02-20T12:54:49.000Z (3 months ago)
- Default Branch: main
- Last Pushed: 2026-03-22T15:40:28.000Z (about 2 months ago)
- Last Synced: 2026-03-23T06:38:51.851Z (about 2 months ago)
- Topics: agent-skills, ai, calude, claude-code, code-generation, code-review, codex, codex-skills, cursor, cursor-ai, gpt, openai, prompt-engineering, skills, structured-data
- Language: Python
- Homepage:
- Size: 91.8 KB
- Stars: 23
- Watchers: 0
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- awesome-agent-skills - k-kolomeitsev/data-structure-protocol - Graph-based long-term memory skill for AI (LLM) coding agents — faster context, fewer tokens, safer refactors (Sponsors ❤️ / Community Skills)
- awesome-openclaw-skills - k-kolomeitsev/data-structure-protocol - based long-term memory skill for AI (LLM) coding agents — faster context, fewer tokens, safer ... | - | (Code & Developer Tools)
README
# Data Structure Protocol (DSP)
**The missing memory layer for AI-assisted development**
---
## The problem
Your agent re-reads the same codebase every session. **DSP fixes that.**
Every time you start a new task, your AI coding agent spends the first 5–15 minutes "getting oriented" — scanning files, tracing imports, figuring out what depends on what. On large projects this becomes a constant tax on tokens and attention. Context is rebuilt from scratch, every single time.
DSP is a graph-based long-term structural memory stored in `.dsp/`. It gives agents a persistent, versionable map of your codebase — entities, dependencies, public APIs, and the *reasons* behind every connection — so they can pick up exactly where they left off.
> **DSP is not another workflow framework.** It's the persistent structural memory layer that's missing from every AI coding workflow.
---
## Install
**macOS / Linux:**
```bash
curl -fsSL https://raw.githubusercontent.com/k-kolomeitsev/data-structure-protocol/main/install.sh | bash
```
**Windows:**
```powershell
irm https://raw.githubusercontent.com/k-kolomeitsev/data-structure-protocol/main/install.ps1 | iex
```
**Codex:**
```
$skill-installer install https://github.com/k-kolomeitsev/data-structure-protocol/tree/main/skills/data-structure-protocol
```
---
## What you get
- **Agent stops re-learning your project every session** — structural context persists across tasks, sessions, and even team members
- **Dependency discovery in seconds, not minutes** — graph traversal replaces full-repo scanning
- **Impact analysis before refactors** — know what breaks before you touch it
- **Safer changes on brownfield codebases** — hidden couplings become visible edges in the graph
- **Works with Claude Code, Cursor, Codex — no lock-in** — DSP is an agent skill, not a platform
- **Git-native and versionable** — `.dsp/` is plain text, diffs cleanly, reviews like code
> **Honest trade-off:** bootstrapping DSP on a large project takes real effort (time, tokens, discipline). It pays back over the project lifetime through lower per-task token usage, faster discovery, and more predictable agent behavior.
---
## How it works
```
┌──────────────────────┐
│       Codebase       │
│   (files + assets)   │
└──────────┬───────────┘
           │ create/update graph as you work
           ▼
┌──────────────────────┐
│  DSP Builder / CLI   │
│     (dsp-cli.py)     │
└──────────┬───────────┘
           │ writes
           ▼
┌──────────────────────┐
│        .dsp/         │
│ entity graph + whys  │
└──────────┬───────────┘
           │ reads/searches/traverses
           ▼
┌──────────────────────┐
│   LLM Orchestrator   │
│ (your agent + skill) │
└──────────────────────┘
```
As you work, DSP builds a lightweight graph of your codebase: modules, functions, dependencies, and public APIs. Each connection carries a `why` — the reason it exists. Your agent reads this graph instead of re-scanning the repo, navigates structure through graph traversal, and keeps the graph updated as code evolves.
The graph lives in `.dsp/` — plain text files that commit, diff, and merge like any other source artifact.
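
To make the "edge with a `why`" idea concrete, here is a minimal in-memory sketch. It is not the DSP implementation: the `Graph` class and its field names are hypothetical, chosen only to mirror the `imports` (outgoing) and `exports` (reverse index) concepts described in this README.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Graph:
    """Toy in-memory stand-in for a DSP-style graph (not the real storage)."""
    # importer UID -> {imported UID: why}
    imports: dict = field(default_factory=lambda: defaultdict(dict))
    # imported UID -> {importer UID: why}  (reverse index)
    exports: dict = field(default_factory=lambda: defaultdict(dict))

    def add_import(self, importer: str, imported: str, why: str) -> None:
        """Record one edge in both directions so either lookup is a dict read."""
        self.imports[importer][imported] = why
        self.exports[imported][importer] = why


graph = Graph()
graph.add_import("obj-a1b2c3d4", "obj-deadbeef", "HTTP routing")

# Who uses obj-deadbeef, and why?
print(dict(graph.exports["obj-deadbeef"]))
# → {'obj-a1b2c3d4': 'HTTP routing'}
```

Storing each edge twice trades a little write cost for cheap "who depends on this?" queries, which is the question impact analysis asks most often.
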
---
## Quick start
### Option A: Start from the boilerplate (fastest)
[**dsp-boilerplate**](https://github.com/k-kolomeitsev/dsp-boilerplate) is a production-ready fullstack starter — **NestJS 11 + React 19 + Vite 7** in Docker Compose, with a **fully initialized DSP graph**, pre-configured skills for all agents, Cursor rules, git hooks, and CI.
```bash
git clone https://github.com/k-kolomeitsev/dsp-boilerplate.git my-project
cd my-project
docker-compose up -d
```
Everything is wired: `.dsp/` graph with two roots (backend + frontend), `@dsp` markers in all source files, DSP skills for Cursor, Claude Code, and Codex. You can start coding and the agent already knows the entire project structure.
### Option B: Add DSP to any project
#### 1. Initialize
```bash
python dsp-cli.py --root . init
```
#### 2. Create entities
```bash
python dsp-cli.py --root . create-object "src/app.ts" "Main application entrypoint"
# → obj-a1b2c3d4
python dsp-cli.py --root . create-function "src/app.ts#start" "Starts the HTTP server" --owner obj-a1b2c3d4
# → func-7f3a9c12
python dsp-cli.py --root . add-import obj-a1b2c3d4 obj-deadbeef "HTTP routing"
```
#### 3. Navigate
```bash
python dsp-cli.py --root . search "authentication"
python dsp-cli.py --root . find-by-source "src/auth/index.ts"
python dsp-cli.py --root . get-children obj-a1b2c3d4 --depth 2
```
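
If you want to drive these commands from a script, a thin wrapper is usually enough. The sketch below only assembles the same arguments shown above; `dsp_command` and `run_dsp` are hypothetical helper names, and the CLI's output format is not documented here, so `run_dsp` returns raw text rather than parsing it.

```python
import subprocess
import sys


def dsp_command(*args: str, root: str = ".", cli: str = "dsp-cli.py") -> list[str]:
    """Build the argv for one dsp-cli.py call, mirroring the commands above."""
    return [sys.executable, cli, "--root", root, *args]


def run_dsp(*args: str, root: str = ".") -> str:
    """Run the CLI and return its stdout; raises CalledProcessError on failure."""
    result = subprocess.run(
        dsp_command(*args, root=root), capture_output=True, text=True, check=True
    )
    return result.stdout


# Building the command does not require the CLI to be present.
cmd = dsp_command("get-children", "obj-a1b2c3d4", "--depth", "2")
print(cmd[1:])
# → ['dsp-cli.py', '--root', '.', 'get-children', 'obj-a1b2c3d4', '--depth', '2']
```
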
#### 4. Impact analysis
```bash
python dsp-cli.py --root . get-parents obj-a1b2c3d4 --depth inf
python dsp-cli.py --root . get-recipients obj-a1b2c3d4
```
> Before any refactor, run `get-parents` or `get-recipients` to see everything that depends on the entity you're about to change.
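
Conceptually, an unbounded-depth parent query is a transitive walk over incoming edges. The sketch below assumes that reading of `get-parents --depth inf`; it is not taken from `dsp-cli.py`. The `importers` mapping and the readable UIDs are made up for illustration.

```python
from collections import deque


def transitive_parents(importers: dict[str, set[str]], uid: str) -> set[str]:
    """Return every entity that depends on `uid`, directly or indirectly."""
    seen: set[str] = set()
    queue = deque([uid])
    while queue:
        current = queue.popleft()
        for parent in importers.get(current, ()):
            # Skip already-visited nodes so cycles cannot loop forever.
            if parent not in seen and parent != uid:
                seen.add(parent)
                queue.append(parent)
    return seen


# Hypothetical reverse index: entity -> entities that import it.
importers = {
    "obj-db": {"obj-repo"},
    "obj-repo": {"obj-service"},
    "obj-service": {"obj-api", "obj-worker"},
}

print(sorted(transitive_parents(importers, "obj-db")))
# → ['obj-api', 'obj-repo', 'obj-service', 'obj-worker']
```

Everything in the returned set is a candidate for breakage when `obj-db` changes, which is why the check belongs before the refactor rather than after it.
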
---
## Supported agents
DSP installs as a skill for your agent. Pick your agent and scope.
Don't have a coding agent yet? Install one first:
| Agent | Install |
|---|---|
| **Claude Code** | `npm i -g @anthropic-ai/claude-code` — [docs](https://docs.anthropic.com/en/docs/claude-code/setup) |
| **Cursor** | [cursor.com/downloads](https://www.cursor.com/downloads) — [docs](https://docs.cursor.com) |
| **Codex CLI** | `npm i -g @openai/codex` — [docs](https://developers.openai.com/codex/cli) \| [github](https://github.com/openai/codex) |
### macOS / Linux
| Agent | Project Install | Global Install |
|---|---|---|
| **Cursor** | `curl -fsSL https://raw.githubusercontent.com/k-kolomeitsev/data-structure-protocol/main/install.sh \| bash -s -- cursor` | `curl -fsSL https://raw.githubusercontent.com/k-kolomeitsev/data-structure-protocol/main/install.sh \| bash -s -- --global cursor` |
| **Claude Code** | `curl -fsSL https://raw.githubusercontent.com/k-kolomeitsev/data-structure-protocol/main/install.sh \| bash -s -- claude` | `curl -fsSL https://raw.githubusercontent.com/k-kolomeitsev/data-structure-protocol/main/install.sh \| bash -s -- --global claude` |
| **Codex** | `curl -fsSL https://raw.githubusercontent.com/k-kolomeitsev/data-structure-protocol/main/install.sh \| bash -s -- codex` | `curl -fsSL https://raw.githubusercontent.com/k-kolomeitsev/data-structure-protocol/main/install.sh \| bash -s -- --global codex` |
### Windows
```powershell
# Project-level (current directory)
irm https://raw.githubusercontent.com/k-kolomeitsev/data-structure-protocol/main/install.ps1 | iex
# With a specific agent (run from a directory that contains install.ps1)
powershell -ExecutionPolicy Bypass -File install.ps1 -Agent cursor
powershell -ExecutionPolicy Bypass -File install.ps1 -Agent claude
powershell -ExecutionPolicy Bypass -File install.ps1 -Agent codex
# Global (user-level)
powershell -ExecutionPolicy Bypass -File install.ps1 -Agent cursor -Global
```
### Codex (alternative)
```
$skill-installer install https://github.com/k-kolomeitsev/data-structure-protocol/tree/main/skills/data-structure-protocol
```
> **Project install** puts the skill in your repo (`.cursor/skills/`, `.claude/skills/`, `.codex/skills/`).
> **Global install** puts it in your home directory so it's available across all projects.
---
## DSP vs alternatives
Modern agents already know how to plan, write tests, verify, and ship. They don't need process wrappers. What they lack is **memory**.
| | **DSP** | **GSD** | **Superpowers** |
|---|---|---|---|
| **Core idea** | Persistent structural memory | Process/confidence wrapper | Engineering discipline (TDD) |
| **What it solves** | Agent has no memory of project between sessions | Agent doesn't follow structured workflow | Agent might skip tests/planning |
| **Is the problem real?** | Yes — no model has built-in project memory | Diminishing — modern models plan and verify natively | Diminishing — modern models know TDD when prompted |
| **Persistent memory** | Full graph across sessions | None | None |
| **Impact analysis** | Built-in (graph traversal) | No | No |
| **Brownfield** | First-class | One-time scan | No explicit support |
| **Overhead** | Low | Medium | Medium |
> Modern agents are smarter than most mid-level engineers. They plan, they test, they verify. They just can't remember your project. DSP is the fix. [Detailed comparison with GSD](./docs/comparisons/dsp-vs-gsd.md) | [Detailed comparison with Superpowers](./docs/comparisons/dsp-vs-superpowers.md)
---
## Core concepts
| Concept | What it is |
|---|---|
| **Entity** | A node in the graph. Either an **Object** (module/file/class/config/external dep) or a **Function** (function/method/handler) |
| **UID** | Stable identifier (`obj-<8hex>`, `func-<8hex>`). File paths are attributes, not identity — entities survive renames and moves |
| **imports** | Outgoing edges — what this entity uses, with a `why` for each connection |
| **shared** | Public API of an object — what it exposes to consumers |
| **exports/** | Reverse index — who imports this entity and why (incoming edges) |
| **TOC** | Per-entrypoint table of contents listing all reachable entities from a root |
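
A small sketch of why UID-based identity survives moves. How the real CLI mints UIDs is not stated in this README, so random hex via `secrets.token_hex` is an assumption; `new_uid` and the `entity` dict are hypothetical.

```python
import re
import secrets

# The documented shapes: obj-<8hex> and func-<8hex>.
UID_PATTERN = re.compile(r"^(obj|func)-[0-9a-f]{8}$")


def new_uid(kind: str) -> str:
    """Mint a random UID in the documented obj-<8hex> / func-<8hex> shape."""
    if kind not in ("obj", "func"):
        raise ValueError(f"unknown entity kind: {kind!r}")
    return f"{kind}-{secrets.token_hex(4)}"  # 4 random bytes -> 8 hex characters


# The path is an attribute on the entity, so a move only rewrites that field.
entity = {"uid": new_uid("obj"), "source": "src/app.ts"}
uid_before = entity["uid"]
entity["source"] = "src/server/app.ts"  # file moved; identity is unchanged

print(UID_PATTERN.match(entity["uid"]) is not None, entity["uid"] == uid_before)
# → True True
```
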
UID markers anchor identity in source code:
```ts
// @dsp func-7f3a9c12
export function calculateTotal(items: Item[]): number { /* ... */ }
```
```python
# @dsp func-3c19ab8e
def process_payment(order):
...
```
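
Markers like these are easy to pick up with a line-based scan. The sketch below is one way to do it, assuming markers always sit on their own `//` or `#` comment line as in the two examples above; `find_markers` is a hypothetical helper, not part of the DSP CLI.

```python
import re

# Matches "// @dsp <uid>" and "# @dsp <uid>" comment markers.
MARKER = re.compile(r"^\s*(?://|#)\s*@dsp\s+((?:obj|func)-[0-9a-f]{8})\b")


def find_markers(source: str) -> list[tuple[int, str]]:
    """Return (line_number, uid) for each marker line, 1-indexed."""
    found = []
    for number, line in enumerate(source.splitlines(), start=1):
        match = MARKER.match(line)
        if match:
            found.append((number, match.group(1)))
    return found


sample = """\
# @dsp func-3c19ab8e
def process_payment(order):
    ...
"""
print(find_markers(sample))
# → [(1, 'func-3c19ab8e')]
```
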
---
## Storage format
`.dsp/` is plain text in a deterministic directory layout:
```
.dsp/
├── TOC                     # Table of contents (single root)
├── TOC-                    # One TOC per root (multi-root projects)
├── obj-a1b2c3d4/           # Object entity
│   ├── description         # source, kind, purpose
│   ├── imports             # imported UIDs (one per line)
│   ├── shared              # exported/shared UIDs (one per line)
│   └── exports/            # reverse index
│       ├──                 # why the whole object is imported
│       └── /               # per shared entity
│           ├── description # what is exported
│           └──             # why this shared is imported
└── func-7f3a9c12/          # Function entity
    ├── description
    ├── imports
    └── exports/
        └──                 # ownership link
```
Full specification: [`ARCHITECTURE.md`](./ARCHITECTURE.md)
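
Because the layout is plain files, reading an entity needs nothing beyond the standard library. The sketch below relies only on what the tree above states (`imports` and `shared` hold one UID per line) and treats `description` as opaque text, since its exact format is defined in `ARCHITECTURE.md` and not reproduced here. `read_uid_list` and `load_entity` are hypothetical names.

```python
import tempfile
from pathlib import Path


def read_uid_list(path: Path) -> list[str]:
    """Read a one-UID-per-line file, skipping blank lines."""
    if not path.exists():
        return []
    return [line.strip() for line in path.read_text().splitlines() if line.strip()]


def load_entity(dsp_dir: Path, uid: str) -> dict:
    """Collect the plain-text pieces of one entity directory."""
    entity_dir = dsp_dir / uid
    description = entity_dir / "description"
    return {
        "uid": uid,
        "description": description.read_text().strip() if description.exists() else "",
        "imports": read_uid_list(entity_dir / "imports"),
        "shared": read_uid_list(entity_dir / "shared"),
    }


# Build a throwaway .dsp/ tree so the sketch runs without a real project.
with tempfile.TemporaryDirectory() as tmp:
    dsp_dir = Path(tmp) / ".dsp"
    (dsp_dir / "obj-a1b2c3d4").mkdir(parents=True)
    (dsp_dir / "obj-a1b2c3d4" / "description").write_text("Main application entrypoint\n")
    (dsp_dir / "obj-a1b2c3d4" / "imports").write_text("obj-deadbeef\n")
    loaded = load_entity(dsp_dir, "obj-a1b2c3d4")

print(loaded["imports"])
# → ['obj-deadbeef']
```
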
---
## Git hooks & CI
DSP ships with hooks that keep the graph in sync with your code:
| Hook | What it does | LLM required |
|---|---|---|
| **pre-commit** | Checks staged files against DSP graph — flags new files without entities, deleted files still referenced, orphans | No |
| **pre-push** | Full graph integrity — orphan detection, cycle detection, stats summary | No |
| **Agent-assisted review** | Deep semantic analysis of changes against DSP entities, dependency impact | Yes |
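
Cycle detection needs no LLM because it is ordinary graph work. The sketch below shows one standard approach, a depth-first search with three-colour marking. It is not the code shipped in `hooks/`, and the `imports` mapping with readable UIDs is hypothetical.

```python
from typing import Optional


def find_cycle(imports: dict[str, list[str]]) -> Optional[list[str]]:
    """Return one import cycle as a list of UIDs, or None if the graph is acyclic."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {node: WHITE for node in imports}
    stack: list[str] = []

    def visit(node: str) -> Optional[list[str]]:
        color[node] = GREY
        stack.append(node)
        for target in imports.get(node, []):
            state = color.get(target, WHITE)
            if state == GREY:  # back edge: target is already on the current path
                return stack[stack.index(target):] + [target]
            if state == WHITE:
                cycle = visit(target)
                if cycle:
                    return cycle
        stack.pop()
        color[node] = BLACK
        return None

    for node in list(imports):
        if color[node] == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


cyclic = {"obj-a": ["obj-b"], "obj-b": ["obj-c"], "obj-c": ["obj-a"]}
print(find_cycle(cyclic))
# → ['obj-a', 'obj-b', 'obj-c', 'obj-a']
```

The recursive form keeps the sketch short; a very deep graph would call for an explicit stack instead.
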
Install hooks:
```bash
./hooks/install-hooks.sh # macOS/Linux
.\hooks\install-hooks.ps1 # Windows
```
See [`hooks/`](./hooks/) for configuration, standalone scripts, and GitHub Actions integration.
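
The pre-commit check described in the table boils down to set arithmetic between staged paths and the source paths the graph already knows. The sketch below illustrates that idea under assumptions: `check_staged` and its report keys are hypothetical, and the shipped hook may work differently. The staged lists could come from, for example, `git diff --cached --name-only --diff-filter=A` (added) and `--diff-filter=D` (deleted).

```python
def check_staged(staged_added, staged_deleted, tracked_sources):
    """Compare staged paths with the source paths the graph already knows."""
    tracked = set(tracked_sources)
    return {
        # New files that no entity points at yet.
        "missing_entities": sorted(set(staged_added) - tracked),
        # Deleted files that an entity still references.
        "stale_references": sorted(set(staged_deleted) & tracked),
    }


report = check_staged(
    staged_added=["src/new.ts", "src/app.ts"],
    staged_deleted=["src/old.ts"],
    tracked_sources={"src/app.ts", "src/old.ts"},
)
print(report)
# → {'missing_entities': ['src/new.ts'], 'stale_references': ['src/old.ts']}
```
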
---
## Integration packs
Ready-made configurations for each supported agent:
| Agent | Skill location |
|---|---|
| **Cursor** | `.cursor/skills/data-structure-protocol/` |
| **Claude Code** | `.claude/skills/data-structure-protocol/` |
| **Codex** | `.codex/skills/data-structure-protocol/` |
Each integration includes the skill instructions (`SKILL.md`), CLI (`dsp-cli.py`), and reference docs. See [`integrations/`](./integrations/) for agent-specific setup guides.
---
## Documentation
| Document | Description |
|---|---|
| [**dsp-boilerplate**](https://github.com/k-kolomeitsev/dsp-boilerplate) | Fullstack boilerplate (NestJS + React + Docker Compose) with DSP pre-initialized — the fastest way to start |
| [**GETTING_STARTED.md**](./GETTING_STARTED.md) | Step-by-step guide from install to first impact analysis |
| [**ARCHITECTURE.md**](./ARCHITECTURE.md) | Full protocol specification — entity model, storage format, operations |
| [**docs/comparisons/**](./docs/comparisons/) | Detailed comparisons with GSD, Superpowers, and other tools |
| [**docs/workflows/**](./docs/workflows/) | Workflow guides — bootstrap, brownfield adoption, team usage |
| [**integrations/**](./integrations/) | Agent-specific integration guides and configurations |
---
## Contributing
Contributions are welcome. Areas where help is most valuable:
- **Architecture spec** — improving [`ARCHITECTURE.md`](./ARCHITECTURE.md)
- **CLI** — keeping `dsp-cli.py` aligned with the spec
- **Skill instructions** — refining [`SKILL.md`](./skills/data-structure-protocol/SKILL.md) for agent clarity
- **New integrations** — adding support for more agents and editors
- **Documentation** — examples, workflow guides, comparisons
Please keep changes minimal, explicit, and consistent with the "minimal sufficient context" philosophy.
---
## License
Apache License 2.0 — see [`LICENSE`](./LICENSE).