Orchestrate MCP Server

OrchestrateMCP is the planning brain for agentic workflows.

OrchestrateMCP

An evidence-backed workflow-design advisor for AI agents. Connect it to ChatGPT, Claude (web), Cursor, or Claude Desktop and it plans safer AI workflows grounded in a registry of tested components, edges, and golden-path playbooks. It is read-only and stateless, and it holds no secrets.

Status: registry of 47 components, 78 edges, 1 stack, 5 routes, 5 playbooks; 16 tools; available over stdio and as a free hosted endpoint (https://mcp.orchestratemcp.dev/mcp).

What it does

OrchestrateMCP exposes a structured registry of:

components  →  the building blocks of AI workflows
edges       →  tested relations between components (requires, safer_with, conflicts_with, …)
stacks      →  opinionated technology choices for different deployment contexts
routes      →  tested paths through the component graph
playbooks   →  golden-path routes with full implementation guidance
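As a rough mental model, components and edges can be pictured as a small typed graph. The sketch below is illustrative only: the field names (`id`, `capabilities`, `from`, `to`, `tested`) and the `approval_gate` id are assumptions, not the actual schemas in src/registry/. Only the three relation names are taken from the list above, which elides the rest.

```typescript
// Hypothetical shapes for illustration; the real schemas live in src/registry/.
// The README lists more relations than these three ("…"), so this union is partial.
type Relation = "requires" | "safer_with" | "conflicts_with";

interface Component {
  id: string;
  capabilities: string[]; // assumed field
}

interface Edge {
  from: string;
  to: string;
  relation: Relation;
  tested: boolean; // assumed field
}

// Return the ids of components that `id` directly requires.
function requiredBy(id: string, edges: Edge[]): string[] {
  return edges
    .filter((e) => e.from === id && e.relation === "requires")
    .map((e) => e.to);
}

// Sample data: ids are placeholders chosen for the example.
const sampleEdges: Edge[] = [
  { from: "crm_note_write", to: "approval_gate", relation: "requires", tested: true },
  { from: "crm_note_write", to: "audit_log", relation: "safer_with", tested: false },
];

console.log(requiredBy("crm_note_write", sampleEdges));
```

Keeping relations as a closed union means an unknown relation name in the data fails type-checking instead of being silently ignored during traversal.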

When a user describes a workflow goal, the MCP can:

Match the goal to required capabilities and components.
Traverse tested component relationships.
Reuse sections of known golden-path playbooks.
Compose a candidate route when no exact playbook exists.
Score route confidence (coverage, tested edges, stack fit, safety, simplicity).
Return the route as structured implementation context for Cursor or Claude.
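To make the scoring step more concrete, here is a minimal sketch of a deterministic 0-100 score with a per-factor breakdown. The five factor names come from the list above; the weights and the `scoreRoute` function are invented for this example and are not the project's actual scoring logic.

```typescript
// Factor names follow the README; everything else here is an assumption.
const FACTORS = ["coverage", "testedEdges", "stackFit", "safety", "simplicity"] as const;
type Factor = (typeof FACTORS)[number];
type Factors = Record<Factor, number>;

// ASSUMPTION: illustrative weights that sum to 100.
const WEIGHTS: Factors = {
  coverage: 30,
  testedEdges: 25,
  stackFit: 15,
  safety: 20,
  simplicity: 10,
};

// Each input is a fraction in [0, 1]; out-of-range values are clamped.
function scoreRoute(input: Factors): { total: number; breakdown: Factors } {
  const breakdown = {} as Factors;
  let total = 0;
  for (const key of FACTORS) {
    const fraction = Math.min(1, Math.max(0, input[key]));
    breakdown[key] = Math.round(fraction * WEIGHTS[key]);
    total += breakdown[key];
  }
  return { total, breakdown };
}

const example = scoreRoute({
  coverage: 0.5,
  testedEdges: 1,
  stackFit: 1,
  safety: 1,
  simplicity: 1,
});
console.log(example.total, example.breakdown);
```

Because the function uses no randomness and no model calls, the same inputs always yield the same score, and the breakdown shows which factor pulled a route down.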

What works right now

The MCP server runs on stdio (Cursor, Claude Desktop) and over Streamable HTTP on a Cloudflare Worker (ChatGPT, claude.ai), with 16 registered tools.
health_check returns { name, version, registry: { component_count, edge_count, stack_count, route_count, playbook_count, untested_edge_pct } }.
Registry loaded from YAML: 47 components, 78 edges, 1 stack, 5 routes, 5 playbooks.
pnpm verify (typecheck + lint + tests) passes from a clean clone and install.
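The health_check payload described above can be written down as a TypeScript type with a small runtime guard. This is a sketch under assumptions: the field names come from this README, but treating every registry field as a number is a guess, and the `name` and `version` values in the sample are placeholders.

```typescript
// Shape taken from the field list in this README; numeric types are assumed.
interface HealthCheckResult {
  name: string;
  version: string;
  registry: {
    component_count: number;
    edge_count: number;
    stack_count: number;
    route_count: number;
    playbook_count: number;
    untested_edge_pct: number;
  };
}

const REGISTRY_KEYS = [
  "component_count",
  "edge_count",
  "stack_count",
  "route_count",
  "playbook_count",
  "untested_edge_pct",
] as const;

// Runtime check that an unknown value has the expected shape.
function isHealthCheckResult(value: unknown): value is HealthCheckResult {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  if (typeof v.name !== "string" || typeof v.version !== "string") return false;
  if (typeof v.registry !== "object" || v.registry === null) return false;
  const reg = v.registry as Record<string, unknown>;
  return REGISTRY_KEYS.every((key) => typeof reg[key] === "number");
}

// Counts match the registry status above; name, version, and
// untested_edge_pct are placeholder values.
const sampleHealth: unknown = {
  name: "example-name",
  version: "0.0.0",
  registry: {
    component_count: 47,
    edge_count: 78,
    stack_count: 1,
    route_count: 5,
    playbook_count: 5,
    untested_edge_pct: 0,
  },
};

console.log(isHealthCheckResult(sampleHealth));
```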

Requirements

Node.js ≥ 20
pnpm

Local setup

cd orchestratekit-mcp
pnpm install
pnpm verify        # typecheck + tests — must pass before anything else
pnpm dev           # starts the MCP server on stdio

The server reads from stdin and writes JSON-RPC to stdout. All log output goes to stderr.
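The stdout/stderr split matters because any stray print on stdout would corrupt the JSON-RPC stream. A stderr-only logger might look like the sketch below; it is not the project's actual src/lib/logger.ts, and the `formatLogLine` and `log` names are hypothetical.

```typescript
type Level = "info" | "warn" | "error";

// Pure formatter, kept separate so it is easy to test.
function formatLogLine(level: Level, message: string): string {
  return `[${level}] ${message}`;
}

// console.error writes to stderr in Node.js, so stdout stays
// reserved for the transport's JSON-RPC messages.
function log(level: Level, message: string): void {
  console.error(formatLogLine(level, message));
}

log("info", "registry loaded");
```

Avoiding `console.log` anywhere in server code is the key design choice here, since in Node.js it writes to stdout.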

Connect from Cursor

Copy the contents of examples/cursor-mcp.json into your Cursor workspace MCP config at .cursor/mcp.json, then replace the cwd value with the absolute path to this directory.

{
  "mcpServers": {
    "orchestratekit": {
      "command": "npx",
      "args": ["tsx", "src/server.ts"],
      "cwd": "/absolute/path/to/orchestratekit-mcp"
    }
  }
}

Connect from Claude Desktop

macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json

{
  "mcpServers": {
    "orchestratekit": {
      "command": "npx",
      "args": ["tsx", "src/server.ts"],
      "cwd": "/absolute/path/to/orchestratekit-mcp"
    }
  }
}

Connect from ChatGPT or claude.ai (hosted)

No install, no terminal — point your AI client at the free hosted endpoint:

https://mcp.orchestratemcp.dev/mcp

Full walkthrough (ChatGPT Developer-Mode connector + claude.ai): docs/CHATGPT_USAGE.md.
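For readers curious what a client sends to the hosted endpoint, the sketch below builds the JSON-RPC message for an MCP `tools/call` request to health_check. It only constructs the message and makes no network call. In practice a client library also performs the MCP initialize handshake first. That health_check takes no arguments is an assumption, and `buildToolCall` is a hypothetical helper.

```typescript
// Hosted endpoint from this README.
const ENDPOINT = "https://mcp.orchestratemcp.dev/mcp";

interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Build an MCP tools/call message; `args` defaults to an empty object.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown> = {},
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Streamable HTTP clients POST JSON and accept either JSON or an SSE stream.
const HEADERS = {
  "Content-Type": "application/json",
  Accept: "application/json, text/event-stream",
};

// ASSUMPTION: health_check is called with no arguments.
const requestBody = JSON.stringify(buildToolCall(1, "health_check"));

// A client would POST `requestBody` to ENDPOINT with HEADERS.
console.log(ENDPOINT, HEADERS, requestBody);
```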

Connecting your workflow's services

Connecting the MCP to your client takes no auth (it's a read-only advisor). But the workflows it plans need your credentials for Gmail, Slack, Stripe, your CRM, and so on. For how to provision those safely — least-privilege scopes, secret managers, and managed-auth brokers — see docs/CONNECTION_SETUP.md. OrchestrateMCP never holds a credential.

Scripts

Script	Description
`pnpm dev`	Run server directly with tsx (no build step)
`pnpm build`	Compile to `dist/` with tsup
`pnpm typecheck`	TypeScript type-check only (no emit)
`pnpm test`	Run unit tests with vitest
`pnpm verify`	Run `typecheck` then `test`

Project structure

orchestratekit-mcp/
  src/
    server.ts               Entry point — wires MCP server to stdio transport
    config.ts               Server name and version constants
    tools/
      index.ts              Tool registration (16 tools: health_check + 15 graph/advisor tools)
      composeWorkflowRoute.ts
      listGraphComponents.ts / getGraphComponent.ts
      listGraphEdges.ts / getGraphEdge.ts
      getStackRecommendation.ts
      listKnownRoutes.ts / getRoute.ts
    registry/
      registryLoader.ts     YAML loader with validation, status filtering, cross-ref checks
      componentSchema.ts / edgeSchema.ts / stackSchema.ts / routeSchema.ts / playbookSchema.ts
      registryTypes.ts / registryValidation.ts
    graph/
      capabilityMatcher.ts  Keyword + token matching: goal text → components
      routeComposer.ts      Orchestrates all graph modules into a composed route
      routeScoring.ts       Deterministic 0-100 score with breakdown
      routeOrdering.ts      Topological sort via Kahn's algorithm
      safetyAugmenter.ts    Auto-adds approval gates and audit log
      playbookOverlap.ts    Detects overlap with known playbooks/routes
    docs-index/             Supplementary docs loader (future)
    lib/
      errors.ts             McpToolError class and toErrorResult helper
      logger.ts             Stderr-only logger (stdout reserved for transport)

  registry/
    components/             component YAML files (47 active)
    edges/                  edge/relation YAML files (78 active)
    stacks/                 stack YAML files
    routes/                 route YAML files (5 validated)
    playbooks/              golden-path playbook YAML files (5)

  docs-index/               Supplementary context documents
  examples/
    cursor-mcp.json         Example Cursor MCP config
    claude-desktop-config.json  Example Claude Desktop config
  tests/
    health-check.test.ts
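The tree above describes routeOrdering.ts as a topological sort via Kahn's algorithm. For readers unfamiliar with it, here is a minimal standalone sketch. It is not the project's code: the `topoSort` name and the component ids other than crm_note_write are hypothetical, and an edge `[a, b]` is taken to mean "a must come before b".

```typescript
// Kahn's algorithm: repeatedly emit nodes that have no remaining incoming edges.
// Assumes every edge endpoint appears in `nodes`.
function topoSort(nodes: string[], edges: Array<[string, string]>): string[] {
  const indegree = new Map<string, number>();
  const outgoing = new Map<string, string[]>();
  for (const node of nodes) {
    indegree.set(node, 0);
    outgoing.set(node, []);
  }
  for (const [from, to] of edges) {
    outgoing.get(from)!.push(to);
    indegree.set(to, indegree.get(to)! + 1);
  }

  // Start with every node that nothing depends on coming first.
  const queue = nodes.filter((node) => indegree.get(node) === 0);
  const order: string[] = [];
  while (queue.length > 0) {
    const node = queue.shift()!;
    order.push(node);
    for (const next of outgoing.get(node)!) {
      const remaining = indegree.get(next)! - 1;
      indegree.set(next, remaining);
      if (remaining === 0) queue.push(next);
    }
  }

  // Nodes left unemitted still have incoming edges, which means a cycle.
  if (order.length !== nodes.length) throw new Error("cycle detected");
  return order;
}

const ordered = topoSort(
  ["crm_note_write", "approval_gate", "lead_extract"],
  [
    ["lead_extract", "approval_gate"],
    ["approval_gate", "crm_note_write"],
  ],
);
console.log(ordered);
```

Seeding the queue in input order keeps the result deterministic for a given registry, which fits a route composer that must return the same plan for the same goal.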

Non-goals (by design)

No first-party credential storage — it recommends secret managers / managed-auth brokers, never holds a secret
No auth / OAuth / accounts — the hosted endpoint is read-only and stateless, nothing to log into
No vector database
No graph database (Neo4j etc.)
No automatic registry updates
No LLM API calls inside MCP tools
No SaaS dashboard
No dependency on OrchestrateLab at runtime

Build order

MAR-35  ✅  Scaffold — done
MAR-37  ✅  Graph registry schemas: components, edges, stacks, routes, playbooks
MAR-38  ✅  Seed workflow graph: 30 components, 47 edges, 1 stack, 5 playbooks
MAR-77  ✅  Graph lookup tools: list/get components, edges, stacks, routes
MAR-78  ✅  compose_workflow_route — deterministic route composer
MAR-49  ✅  Benchmark setup — see docs/BENCHMARKING.md
MAR-88  ✅  Domain-gated capability matcher — eliminates cross-domain false positives
MAR-92  ✅  Registry lint + untested_edge_pct in health_check
MAR-95  ✅  crm_note_write component + research→content bridge edge
MAR-96  ✅  Benchmark protocol v2 — rubric, prompts-v2.yaml, PROTOCOL.md
MAR-97  ✅  Docs truth pass — registry counts, tool count, verify path

Benchmarking

To validate that the workflow graph improves planning quality over vanilla Cursor/Claude, run the manual benchmark described in docs/BENCHMARKING.md.

Quick start:

# v2 protocol — print session guide for all 7 prompts
pnpm tsx scripts/benchmark-template.ts --prompts benchmarks/prompts-v2.yaml --all

# v2 — single prompt
pnpm tsx scripts/benchmark-template.ts --prompts benchmarks/prompts-v2.yaml --prompt p6_email_lead_crm

# v1 (legacy)
pnpm tsx scripts/benchmark-template.ts

Results go in benchmarks/results-YYYY-MM-DD.md.