AgentDesk MCP

Adversarial AI quality review for LLM pipelines. Dual-reviewer consensus with anti-gaming protection. BYOK — works with Claude Code, Claude Desktop, and any MCP client.

  • 29.5% of teams do NO evaluation of AI outputs. (LangChain Survey)
  • Knowledge workers spend 4.3 hours/week fact-checking AI outputs. (Microsoft 2025)

AgentDesk MCP fixes this. Add independent adversarial review to any AI pipeline in 30 seconds.

Quick Start

npm (recommended)

npx agentdesk-mcp

Claude Code

claude mcp add agentdesk-mcp -- npx agentdesk-mcp

Claude Desktop

{
  "mcpServers": {
    "agentdesk-mcp": {
      "command": "npx",
      "args": ["-y", "agentdesk-mcp"],
      "env": { "ANTHROPIC_API_KEY": "sk-ant-..." }
    }
  }
}

Install from GitHub (alternative)

npm install github:Rih0z/agentdesk-mcp

Requirements

  • ANTHROPIC_API_KEY environment variable (uses your own key — BYOK)

Tools

review_output

Adversarial quality review of any AI-generated output. An independent reviewer assumes the author made mistakes and actively looks for problems.

Input:

  • output (required): The AI-generated output to review
  • criteria (optional): Custom review criteria
  • review_type (optional): Category: code, content, factual, translation, etc.
  • model (optional): Reviewer model (default: claude-sonnet-4-6)

Output:

{
  "verdict": "PASS | FAIL | CONDITIONAL_PASS",
  "score": 82,
  "issues": [
    {
      "severity": "high",
      "category": "accuracy",
      "description": "Claim about X is unsupported",
      "suggestion": "Add citation or remove claim"
    }
  ],
  "checklist": [
    {
      "item": "Factual accuracy",
      "status": "pass",
      "evidence": "All statistics match cited sources"
    }
  ],
  "summary": "Overall assessment...",
  "reviewer_model": "claude-sonnet-4-6"
}
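Pipelines can gate on this structured result instead of eyeballing prose feedback. A minimal sketch of such a gate, with types mirroring the JSON above; the score threshold of 70 is an illustrative choice, not part of the tool's contract:

```typescript
// Sketch: gating a pipeline step on the structured review result.
// Field shapes mirror the JSON above; the threshold (70) is an
// illustrative assumption, not part of the tool's contract.
type Verdict = "PASS" | "FAIL" | "CONDITIONAL_PASS";

interface Issue {
  severity: string;
  category: string;
  description: string;
  suggestion: string;
}

interface ReviewResult {
  verdict: Verdict;
  score: number;
  issues: Issue[];
  summary: string;
  reviewer_model: string;
}

// Accept the output only on a clean PASS at or above the threshold.
function accept(review: ReviewResult, minScore = 70): boolean {
  return review.verdict === "PASS" && review.score >= minScore;
}
```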

review_dual

Dual adversarial review — two independent reviewers assess the output from different angles, then a merge agent combines findings.

  • If either reviewer finds a critical issue → merged verdict is FAIL
  • Takes the lower score
  • Combines and deduplicates all issues

Use for high-stakes outputs where quality is critical.

Same parameters as review_output.
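The merge rules above can be sketched as follows. This is a re-implementation for illustration, not the package's actual source; deduplicating by exact description match and letting the stricter verdict win when neither issue set is critical are assumptions on my part.

```typescript
// Sketch of the dual-review merge rules (illustrative, not the
// package's source). Assumptions: dedupe by exact description match;
// when no critical issue exists, the stricter verdict wins.
type Verdict = "FAIL" | "CONDITIONAL_PASS" | "PASS";
interface Issue { severity: string; category: string; description: string }
interface Review { verdict: Verdict; score: number; issues: Issue[] }

const STRICTNESS: Verdict[] = ["FAIL", "CONDITIONAL_PASS", "PASS"];

function mergeReviews(a: Review, b: Review): Review {
  // Combine and deduplicate issues from both reviewers.
  const seen = new Set<string>();
  const issues = [...a.issues, ...b.issues].filter(i => {
    if (seen.has(i.description)) return false;
    seen.add(i.description);
    return true;
  });
  // Either reviewer finding a critical issue forces a FAIL.
  const hasCritical = issues.some(i => i.severity === "critical");
  const stricter =
    STRICTNESS[Math.min(STRICTNESS.indexOf(a.verdict), STRICTNESS.indexOf(b.verdict))];
  return {
    verdict: hasCritical ? "FAIL" : stricter,
    score: Math.min(a.score, b.score), // take the lower score
    issues,
  };
}
```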

How It Works

  1. Adversarial prompting: The reviewer is instructed to assume mistakes were made. No benefit of the doubt.
  2. Evidence-based checklist: Every PASS item requires specific evidence. Items without evidence are automatically downgraded to FAIL.
  3. Anti-gaming validation: If >30% of checklist items lack evidence, the entire review is forced to FAIL with a capped score of 50.
  4. Structured output: Verdict + numeric score + categorized issues + checklist (not just "looks good").
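The anti-gaming rule in step 3 can be sketched as a post-processing pass over the review. This is an illustrative re-implementation of the stated rule, not the package's actual code:

```typescript
// Sketch of step 3 above: if more than 30% of checklist items lack
// evidence, force the review to FAIL and cap the score at 50.
// (A re-implementation for illustration, not the package's code.)
interface ChecklistItem { item: string; status: "pass" | "fail"; evidence: string }
interface Review { verdict: string; score: number; checklist: ChecklistItem[] }

function enforceEvidence(review: Review): Review {
  const total = review.checklist.length;
  if (total === 0) return review;
  const missing = review.checklist.filter(c => c.evidence.trim() === "").length;
  if (missing / total > 0.3) {
    return { ...review, verdict: "FAIL", score: Math.min(review.score, 50) };
  }
  return review;
}
```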

Use Cases

  • Code review: Check for bugs, security issues, performance problems
  • Content review: Verify accuracy, readability, SEO, audience fit
  • Factual verification: Validate claims in AI-generated text
  • Translation quality: Check accuracy and naturalness
  • Data extraction: Verify completeness and correctness
  • Any AI output: Summaries, reports, proposals, emails, etc.

Why Not Just Ask the Same AI to Review?

Self-review has systematic leniency bias. An LLM reviewing its own output shares the same blind spots that created the errors. Research shows models are 34% more likely to use confident language when hallucinating.

AgentDesk uses a separate reviewer invocation with adversarial prompting — fundamentally different from self-review.

Comparison

  Feature                  AgentDesk MCP   Manual prompt   Braintrust   DeepEval
  One-tool setup           Yes             No              No           No
  Adversarial review       Yes             DIY             No           No
  Dual reviewer            Yes             DIY             No           No
  Anti-gaming validation   Yes             No              No           No
  No SDK required          Yes             Yes             No           No
  MCP native               Yes             No              No           No

Limitations

  • Prompt injection: Like all LLM-as-judge systems, adversarial inputs could attempt to manipulate reviewer verdicts. The anti-gaming validation layer mitigates superficial gaming, but determined adversarial inputs remain a challenge. For high-stakes use cases, combine with deterministic validation.
  • BYOK cost: Each review_output call makes 1 LLM API call; review_dual makes 3. Factor this into your pipeline costs.
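Pairing deterministic validation with the LLM review, as the prompt-injection note recommends, can be as simple as a cheap pre-check that runs before spending an API call. The specific checks below (non-empty, size bound, JSON well-formedness) are illustrative; real pipelines would substitute domain-specific ones:

```typescript
// Sketch: deterministic pre-checks to run before review_output.
// The specific checks (non-empty, size bound, JSON well-formedness)
// are illustrative assumptions; pick checks that fit your domain.
function deterministicChecks(output: string): string[] {
  const problems: string[] = [];
  if (output.trim().length === 0) problems.push("empty output");
  if (output.length > 100_000) problems.push("output exceeds size bound");
  if (output.trimStart().startsWith("{")) {
    try {
      JSON.parse(output); // only attempted when the output looks like JSON
    } catch {
      problems.push("malformed JSON");
    }
  }
  return problems;
}
```

When deterministic checks already fail, the LLM review can be skipped entirely, which also reduces BYOK cost.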

Hosted API (Separate Product)

For teams that prefer HTTP integration, a hosted REST API with additional features (agent marketplace, context learning, workflows) is available at agentdesk-blue.vercel.app.

Development

git clone https://github.com/Rih0z/agentdesk-mcp.git
cd agentdesk-mcp
npm install
npm test        # 35 tests
npm run build

License

MIT


Built by EZARK Consulting
