Data Structure Protocol (DSP)
Graph-based long-term memory skill for AI (LLM) coding agents — faster context, fewer tokens, safer refactors
The missing memory layer for AI-assisted development
The problem
Your agent re-reads the same codebase every session. DSP fixes that.
Every time you start a new task, your AI coding agent spends the first 5–15 minutes "getting oriented" — scanning files, tracing imports, figuring out what depends on what. On large projects this becomes a constant tax on tokens and attention. Context is rebuilt from scratch, every single time.
DSP is a graph-based long-term structural memory stored in .dsp/. It gives agents a persistent, versionable map of your codebase — entities, dependencies, public APIs, and the reasons behind every connection — so they can pick up exactly where they left off.
DSP is not another workflow framework. It's the persistent structural memory layer that's missing from every AI coding workflow.
Install
macOS / Linux:
curl -fsSL https://raw.githubusercontent.com/k-kolomeitsev/data-structure-protocol/main/install.sh | bash
Windows:
irm https://raw.githubusercontent.com/k-kolomeitsev/data-structure-protocol/main/install.ps1 | iex
Codex:
$skill-installer install https://github.com/k-kolomeitsev/data-structure-protocol/tree/main/skills/data-structure-protocol
What you get
- Agent stops re-learning your project every session — structural context persists across tasks, sessions, and even team members
- Dependency discovery in seconds, not minutes — graph traversal replaces full-repo scanning
- Impact analysis before refactors — know what breaks before you touch it
- Safer changes on brownfield codebases — hidden couplings become visible edges in the graph
- Works with Claude Code, Cursor, Codex — no lock-in — DSP is an agent skill, not a platform
- Git-native and versionable: .dsp/ is plain text, diffs cleanly, reviews like code
Honest trade-off: bootstrapping DSP on a large project takes real effort (time, tokens, discipline). It pays back over the project lifetime through lower per-task token usage, faster discovery, and more predictable agent behavior.
How it works
┌──────────────────────┐
│ Codebase │
│ (files + assets) │
└──────────┬───────────┘
│ create/update graph as you work
▼
┌──────────────────────┐
│ DSP Builder / CLI │
│ (dsp-cli.py) │
└──────────┬───────────┘
│ writes
▼
┌──────────────────────┐
│ .dsp/ │
│ entity graph + whys │
└──────────┬───────────┘
│ reads/searches/traverses
▼
┌──────────────────────┐
│ LLM Orchestrator │
│ (your agent + skill) │
└──────────────────────┘
As you work, DSP builds a lightweight graph of your codebase: modules, functions, dependencies, and public APIs. Each connection carries a why — the reason it exists. Your agent reads this graph instead of re-scanning the repo, navigates structure through graph traversal, and keeps the graph updated as code evolves.
The graph lives in .dsp/ — plain text files that commit, diff, and merge like any other source artifact.
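To make the "edges carry a why" idea concrete, here is a minimal in-memory sketch. It is not the DSP implementation: the `Entity` class, the `reverse_index` helper, and the `src/router.ts` path are hypothetical names chosen for illustration, and only the UIDs and the "HTTP routing" reason come from the examples in this README.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """A graph node: an object (module/file) or a function."""
    uid: str
    source: str
    purpose: str
    # Outgoing edges: imported UID -> the reason ("why") for the dependency.
    imports: dict[str, str] = field(default_factory=dict)


def reverse_index(entities: dict[str, Entity]) -> dict[str, dict[str, str]]:
    """Derive incoming edges: for each imported UID, who imports it and why."""
    incoming: dict[str, dict[str, str]] = {}
    for entity in entities.values():
        for target_uid, why in entity.imports.items():
            incoming.setdefault(target_uid, {})[entity.uid] = why
    return incoming


# Hypothetical two-node graph: the app entrypoint imports a router module.
app = Entity("obj-a1b2c3d4", "src/app.ts", "Main application entrypoint")
router = Entity("obj-deadbeef", "src/router.ts", "HTTP routing table")
app.imports[router.uid] = "HTTP routing"

graph = {e.uid: e for e in (app, router)}
incoming = reverse_index(graph)
print(incoming["obj-deadbeef"])  # {'obj-a1b2c3d4': 'HTTP routing'}
```

Storing the reason on the edge rather than on the node is what lets an agent answer "why does A depend on B" without opening either file.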
Quick start
Option A: Start from the boilerplate (fastest)
dsp-boilerplate is a production-ready fullstack starter — NestJS 11 + React 19 + Vite 7 in Docker Compose, with a fully initialized DSP graph, pre-configured skills for all agents, Cursor rules, git hooks, and CI.
git clone https://github.com/k-kolomeitsev/dsp-boilerplate.git my-project
cd my-project
docker-compose up -d
Everything is wired: .dsp/ graph with two roots (backend + frontend), @dsp markers in all source files, DSP skills for Cursor, Claude Code, and Codex. You can start coding and the agent already knows the entire project structure.
Option B: Add DSP to any project
1. Initialize
python dsp-cli.py --root . init
2. Create entities
python dsp-cli.py --root . create-object "src/app.ts" "Main application entrypoint"
# → obj-a1b2c3d4
python dsp-cli.py --root . create-function "src/app.ts#start" "Starts the HTTP server" --owner obj-a1b2c3d4
# → func-7f3a9c12
python dsp-cli.py --root . add-import obj-a1b2c3d4 obj-deadbeef "HTTP routing"
3. Navigate
python dsp-cli.py --root . search "authentication"
python dsp-cli.py --root . find-by-source "src/auth/index.ts"
python dsp-cli.py --root . get-children obj-a1b2c3d4 --depth 2
4. Impact analysis
python dsp-cli.py --root . get-parents obj-a1b2c3d4 --depth inf
python dsp-cli.py --root . get-recipients obj-a1b2c3d4
Before any refactor, run get-parents or get-recipients to see everything that depends on the entity you're about to change.
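The idea behind an unbounded parent lookup can be sketched as a breadth-first walk over inverted import edges. This assumes that a "parent" is any entity whose imports include the target; the function name `transitive_parents` and the third UID below are hypothetical, and the real CLI may define parents differently.

```python
from collections import deque


def transitive_parents(imports: dict[str, set[str]], target: str) -> set[str]:
    """Return every UID that depends on `target`, directly or transitively."""
    # Invert the import edges once: imported UID -> set of importers.
    importers: dict[str, set[str]] = {}
    for uid, deps in imports.items():
        for dep in deps:
            importers.setdefault(dep, set()).add(uid)

    seen: set[str] = set()
    queue = deque([target])
    while queue:
        current = queue.popleft()
        for parent in importers.get(current, ()):
            # Skip already-visited nodes so import cycles terminate.
            if parent not in seen and parent != target:
                seen.add(parent)
                queue.append(parent)
    return seen


# Hypothetical chain: a1b2c3d4 imports deadbeef, which imports 0000aaaa.
imports = {
    "obj-a1b2c3d4": {"obj-deadbeef"},
    "obj-deadbeef": {"obj-0000aaaa"},
    "obj-0000aaaa": set(),
}
affected = transitive_parents(imports, "obj-0000aaaa")
print(sorted(affected))  # ['obj-a1b2c3d4', 'obj-deadbeef']
```

Changing the leaf entity touches both of its ancestors, which is the list you would review before a refactor.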
Supported agents
DSP installs as a skill for your agent. Pick your agent and scope.
Don't have a coding agent yet? Install one first:
| Agent | Install |
|---|---|
| Claude Code | npm i -g @anthropic-ai/claude-code (docs) |
| Cursor | cursor.com/downloads (docs) |
| Codex CLI | npm i -g @openai/codex (docs, github) |
macOS / Linux
| Agent | Project Install | Global Install |
|---|---|---|
| Cursor | curl -fsSL https://raw.githubusercontent.com/k-kolomeitsev/data-structure-protocol/main/install.sh \| bash -s -- cursor | curl -fsSL https://raw.githubusercontent.com/k-kolomeitsev/data-structure-protocol/main/install.sh \| bash -s -- --global cursor |
| Claude Code | curl -fsSL https://raw.githubusercontent.com/k-kolomeitsev/data-structure-protocol/main/install.sh \| bash -s -- claude | curl -fsSL https://raw.githubusercontent.com/k-kolomeitsev/data-structure-protocol/main/install.sh \| bash -s -- --global claude |
| Codex | curl -fsSL https://raw.githubusercontent.com/k-kolomeitsev/data-structure-protocol/main/install.sh \| bash -s -- codex | curl -fsSL https://raw.githubusercontent.com/k-kolomeitsev/data-structure-protocol/main/install.sh \| bash -s -- --global codex |
Windows
# Project-level (current directory)
irm https://raw.githubusercontent.com/k-kolomeitsev/data-structure-protocol/main/install.ps1 | iex
# With specific agent
powershell -ExecutionPolicy Bypass -File install.ps1 -Agent cursor
powershell -ExecutionPolicy Bypass -File install.ps1 -Agent claude
powershell -ExecutionPolicy Bypass -File install.ps1 -Agent codex
# Global (user-level)
powershell -ExecutionPolicy Bypass -File install.ps1 -Agent cursor -Global
Codex (alternative)
$skill-installer install https://github.com/k-kolomeitsev/data-structure-protocol/tree/main/skills/data-structure-protocol
Project install puts the skill in your repo (.cursor/skills/, .claude/skills/, .codex/skills/). Global install puts it in your home directory so it's available across all projects.
DSP vs alternatives
Modern agents already know how to plan, write tests, verify, and ship. They don't need process wrappers. What they lack is memory.
| | DSP | GSD | Superpowers |
|---|---|---|---|
| Core idea | Persistent structural memory | Process/confidence wrapper | Engineering discipline (TDD) |
| What it solves | Agent has no memory of project between sessions | Agent doesn't follow structured workflow | Agent might skip tests/planning |
| Is the problem real? | Yes — no model has built-in project memory | Diminishing — modern models plan and verify natively | Diminishing — modern models know TDD when prompted |
| Persistent memory | Full graph across sessions | None | None |
| Impact analysis | Built-in (graph traversal) | No | No |
| Brownfield | First-class | One-time scan | No explicit support |
| Overhead | Low | Medium | Medium |
Modern agents are smarter than most mid-level engineers. They plan, they test, they verify. They just can't remember your project. DSP is the fix.
Detailed comparison with GSD | Detailed comparison with Superpowers
Core concepts
| Concept | What it is |
|---|---|
| Entity | A node in the graph. Either an Object (module/file/class/config/external dep) or a Function (function/method/handler) |
| UID | Stable identifier (obj-<8hex>, func-<8hex>). File paths are attributes, not identity — entities survive renames and moves |
| imports | Outgoing edges — what this entity uses, with a why for each connection |
| shared | Public API of an object — what it exposes to consumers |
| exports/ | Reverse index — who imports this entity and why (incoming edges) |
| TOC | Per-entrypoint table of contents listing all reachable entities from a root |
UID markers anchor identity in source code:
// @dsp func-7f3a9c12
export function calculateTotal(items: Item[]): number { /* ... */ }
# @dsp func-3c19ab8e
def process_payment(order):
    ...
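A tool that wants to map markers back to entities could scan source text with a regular expression. The sketch below assumes markers sit in `//` or `#` comments and that UIDs use lowercase hex, as in the examples above; `MARKER` and `find_markers` are hypothetical names, not part of dsp-cli.py.

```python
import re

# Matches "@dsp obj-<8hex>" or "@dsp func-<8hex>" after a // or # comment prefix.
MARKER = re.compile(r"(?://|#)\s*@dsp\s+((?:obj|func)-[0-9a-f]{8})\b")


def find_markers(source: str) -> list[tuple[int, str]]:
    """Return (line_number, uid) pairs for every @dsp marker in the text."""
    hits = []
    for number, line in enumerate(source.splitlines(), start=1):
        match = MARKER.search(line)
        if match:
            hits.append((number, match.group(1)))
    return hits


sample = """\
// @dsp func-7f3a9c12
export function calculateTotal(items: Item[]): number { /* ... */ }
# @dsp func-3c19ab8e
def process_payment(order):
    ...
"""
print(find_markers(sample))
# [(1, 'func-7f3a9c12'), (3, 'func-3c19ab8e')]
```

Because the marker carries the UID, a file can be renamed or moved and the scan still resolves to the same entity.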
Storage format
.dsp/ is plain text in a deterministic directory layout:
.dsp/
├── TOC # Table of contents (single root)
├── TOC-<rootUid> # One TOC per root (multi-root projects)
├── obj-a1b2c3d4/ # Object entity
│ ├── description # source, kind, purpose
│ ├── imports # imported UIDs (one per line)
│ ├── shared # exported/shared UIDs (one per line)
│ └── exports/ # reverse index
│ ├── <importer_uid> # why the whole object is imported
│ └── <shared_uid>/ # per shared entity
│ ├── description # what is exported
│ └── <importer_uid> # why this shared is imported
└── func-7f3a9c12/ # Function entity
├── description
├── imports
└── exports/
└── <owner_uid> # ownership link
Full specification: ARCHITECTURE.md
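Since the layout is plain directories and one-UID-per-line files, reading it takes very little code. The following is a sketch under stated assumptions, not the reference reader: `load_imports` is a hypothetical helper, it only looks at the `imports` files described above, and the content written to the TOC file is invented for the demo (ARCHITECTURE.md defines the real formats).

```python
from pathlib import Path
import tempfile


def load_imports(dsp_dir: Path) -> dict[str, list[str]]:
    """Map each entity UID to the UIDs listed in its `imports` file."""
    graph: dict[str, list[str]] = {}
    for entity_dir in sorted(dsp_dir.iterdir()):
        # Entity directories are named obj-<8hex> or func-<8hex>; skip TOC files.
        if not entity_dir.is_dir() or not entity_dir.name.startswith(("obj-", "func-")):
            continue
        imports_file = entity_dir / "imports"
        lines = imports_file.read_text().splitlines() if imports_file.exists() else []
        graph[entity_dir.name] = [line.strip() for line in lines if line.strip()]
    return graph


# Build a throwaway .dsp/ tree so the sketch runs without a real project.
with tempfile.TemporaryDirectory() as tmp:
    dsp = Path(tmp) / ".dsp"
    (dsp / "obj-a1b2c3d4").mkdir(parents=True)
    (dsp / "obj-deadbeef").mkdir()
    (dsp / "TOC").write_text("obj-a1b2c3d4\n")  # TOC content here is an assumption
    (dsp / "obj-a1b2c3d4" / "imports").write_text("obj-deadbeef\n")
    loaded = load_imports(dsp)

print(loaded)
# {'obj-a1b2c3d4': ['obj-deadbeef'], 'obj-deadbeef': []}
```

The same directory walk is what makes the graph easy to diff: adding a dependency shows up as a one-line change in a single `imports` file.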
Git hooks & CI
DSP ships with hooks that keep the graph in sync with your code:
| Hook | What it does | LLM required |
|---|---|---|
| pre-commit | Checks staged files against DSP graph — flags new files without entities, deleted files still referenced, orphans | No |
| pre-push | Full graph integrity — orphan detection, cycle detection, stats summary | No |
| Agent-assisted review | Deep semantic analysis of changes against DSP entities, dependency impact | Yes |
Install hooks:
./hooks/install-hooks.sh # macOS/Linux
.\hooks\install-hooks.ps1 # Windows
See hooks/ for configuration, standalone scripts, and GitHub Actions integration.
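The integrity checks named in the table (cycle detection, orphan detection) need no LLM because they are ordinary graph algorithms. The sketch below shows one possible shape, assuming an orphan is an entity unreachable from any root through import edges; the shipped hooks may define both checks differently, and `find_cycle` and `find_orphans` are hypothetical names.

```python
from typing import Optional


def find_cycle(imports: dict[str, list[str]]) -> Optional[list[str]]:
    """Return one import cycle as a list of UIDs, or None if there is none."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {uid: WHITE for uid in imports}
    path: list[str] = []

    def visit(uid: str) -> Optional[list[str]]:
        color[uid] = GREY
        path.append(uid)
        for dep in imports.get(uid, []):
            state = color.get(dep, WHITE)
            if state == GREY:
                # Back edge: the cycle runs from dep to the current node.
                return path[path.index(dep):] + [dep]
            if state == WHITE:
                found = visit(dep)
                if found:
                    return found
        path.pop()
        color[uid] = BLACK
        return None

    for uid in imports:
        if color[uid] == WHITE:
            found = visit(uid)
            if found:
                return found
    return None


def find_orphans(imports: dict[str, list[str]], roots: list[str]) -> list[str]:
    """Return entities that no root can reach by following imports."""
    reachable: set[str] = set()
    stack = list(roots)
    while stack:
        uid = stack.pop()
        if uid not in reachable:
            reachable.add(uid)
            stack.extend(imports.get(uid, []))
    return sorted(set(imports) - reachable)


# Hypothetical graph with a two-node cycle and one unreachable entity.
demo = {
    "obj-a1b2c3d4": ["obj-deadbeef"],
    "obj-deadbeef": ["obj-a1b2c3d4"],
    "obj-0000aaaa": [],
}
print(find_cycle(demo))  # ['obj-a1b2c3d4', 'obj-deadbeef', 'obj-a1b2c3d4']
print(find_orphans(demo, ["obj-a1b2c3d4"]))  # ['obj-0000aaaa']
```

The recursive visit is fine for a sketch; a very deep graph would call for an explicit stack instead.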
Integration packs
Ready-made configurations for each supported agent:
| Agent | Skill location |
|---|---|
| Cursor | .cursor/skills/data-structure-protocol/ |
| Claude Code | .claude/skills/data-structure-protocol/ |
| Codex | .codex/skills/data-structure-protocol/ |
Each integration includes the skill instructions (SKILL.md), CLI (dsp-cli.py), and reference docs. See integrations/ for agent-specific setup guides.
Documentation
| Document | Description |
|---|---|
| dsp-boilerplate | Fullstack boilerplate (NestJS + React + Docker Compose) with DSP pre-initialized — the fastest way to start |
| GETTING_STARTED.md | Step-by-step guide from install to first impact analysis |
| ARCHITECTURE.md | Full protocol specification — entity model, storage format, operations |
| docs/comparisons/ | Detailed comparisons with GSD, Superpowers, and other tools |
| docs/workflows/ | Workflow guides — bootstrap, brownfield adoption, team usage |
| integrations/ | Agent-specific integration guides and configurations |
Contributing
Contributions are welcome. Areas where help is most valuable:
- Architecture spec: improving ARCHITECTURE.md
- CLI: keeping dsp-cli.py aligned with the spec
- Skill instructions: refining SKILL.md for agent clarity
- New integrations: adding support for more agents and editors
- Documentation: examples, workflow guides, comparisons
Please keep changes minimal, explicit, and consistent with the "minimal sufficient context" philosophy.
License
Apache License 2.0 — see LICENSE.