ctxai

A version-aware MCP server that prevents AI coding hallucinations by validating suggestions against your actual installed packages.

ctxai is a Model Context Protocol (MCP) server that makes AI coding assistants version-aware. It reads your actual installed packages, injects that context into the LLM, and validates every code suggestion against your real environment — catching hallucinated imports, non-existent method calls, and dangerous phantom packages before they reach your editor.


The problem it solves

AI coding assistants hallucinate in three specific ways that are hard to catch:

  1. Package hallucinations — suggesting import helmet from 'helmet' when helmet isn't in your package.json
  2. Method hallucinations — calling prisma.user.findFirstOrThrow() when you're on Prisma v3, where that method doesn't exist yet
  3. Phantom packages — inventing package names like express-mongoose or react-query-utils that don't exist on npm, which threat actors can register as typosquats

All three look like valid code. All three fail at runtime — or worse, install malware. ctxai catches them at suggestion time.


How it works

ctxai exposes four MCP tools that an LLM client (Claude Desktop, Cursor, Kiro, etc.) calls automatically:

get_project_context   →  scan project       →  return version fingerprint
validate_suggestion   →  check code         →  return hallucination warnings
check_package_safety  →  check new packages →  return safety issues
get_package_docs      →  fetch registry     →  return real API info

Tool 1 — get_project_context

Scans your project root and returns a structured fingerprint of every installed package and its exact version:

node: [email protected]
node: @prisma/[email protected]
python: [email protected]
python: [email protected]

This fingerprint is injected into the LLM context before every response, constraining it to only suggest APIs that exist in your installed versions. Results are cached for 5 minutes so repeated calls within a session are instant.
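The 5-minute cache can be as simple as a timestamped map. A minimal sketch of the idea (names are illustrative, not the actual sessionCache.ts implementation):

```typescript
// Minimal in-memory TTL cache — illustrative sketch, not the real sessionCache.ts.
class TTLCache<T> {
  private store = new Map<string, { value: T; expiresAt: number }>();

  constructor(private ttlMs: number = 5 * 60 * 1000) {} // 5-minute default

  get(key: string): T | undefined {
    const entry = this.store.get(key);
    if (!entry) return undefined;
    if (Date.now() > entry.expiresAt) {
      this.store.delete(key); // expired — evict lazily on read
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: T): void {
    this.store.set(key, { value, expiresAt: Date.now() + this.ttlMs });
  }
}

// Usage: cache a fingerprint keyed by project path.
const cache = new TTLCache<string>();
cache.set("/my/project", "node: [email protected]");
```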

Tool 2 — validate_suggestion

Takes AI-generated code and the fingerprint from Tool 1, then runs three validation layers:

Layer   What it checks                                            Warning type
1       Is every imported package in your dependencies?           MISSING_PACKAGE
2       Does every method call exist in your installed version?   HALLUCINATED_METHOD
3       What's the closest real alternative?                      Suggestion in warning

Returns human-readable output with the exact offending identifier, severity, and a corrected install command or method suggestion.

Example output:

⚠️  Found 1 issue in the suggested code:

🔴 [Missing package] 'helmet' is not listed in your project dependencies.
   → Run 'npm install helmet' to add it, or check if the package name has changed.

The code above cannot run as-is. Fix the missing packages before using it.

Tool 3 — check_package_safety

Checks every new package the AI suggests installing against three safety layers:

Layer   What it checks                                            Issue type
1       Does this package exist on npm / PyPI?                    PHANTOM_PACKAGE
2       Is it suspiciously similar to a popular package?          LIKELY_TYPOSQUAT / LIKELY_CONFLATION
3       Is it brand new, has no repo, or very few versions?       LOW_TRUST_PACKAGE

Packages already in your fingerprint are skipped — you've already made that trust decision.

Example output:

🚨 Found 1 critical issue across 1 new package.

📦 expres
   🚨 [Likely typosquat] 'expres' exists on the registry but is suspiciously
      similar to 'express' (edit distance: 1). This is a known typosquatting pattern.
      → Verify you meant 'express'. If you intentionally want 'expres', inspect
        its source code and maintainers before installing.

🛑 Do NOT install the flagged packages without manual verification.
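The typosquat layer is built on Levenshtein edit distance against a list of popular package names. A minimal sketch of the core idea — the real typosquatDetector.ts presumably uses a much larger curated list and extra heuristics:

```typescript
// Sketch of Levenshtein-based typosquat detection. Illustrative only;
// POPULAR is a heavily abbreviated stand-in for a real popularity list.

function levenshtein(a: string, b: string): number {
  const dp = Array.from({ length: a.length + 1 }, (_, i) =>
    Array.from({ length: b.length + 1 }, (_, j) => (i === 0 ? j : j === 0 ? i : 0)),
  );
  for (let i = 1; i <= a.length; i++) {
    for (let j = 1; j <= b.length; j++) {
      dp[i][j] = Math.min(
        dp[i - 1][j] + 1,                                     // deletion
        dp[i][j - 1] + 1,                                     // insertion
        dp[i - 1][j - 1] + (a[i - 1] === b[j - 1] ? 0 : 1),   // substitution
      );
    }
  }
  return dp[a.length][b.length];
}

const POPULAR = ["express", "lodash", "react", "axios"];

function likelyTyposquat(name: string): { similarTo: string; editDistance: number } | null {
  for (const popular of POPULAR) {
    const d = levenshtein(name, popular);
    // d === 0 is the package itself; small positive distances are suspicious.
    if (d > 0 && d <= 2) return { similarTo: popular, editDistance: d };
  }
  return null;
}
```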

Tool 4 — get_package_docs

Fetches live metadata from npm or PyPI for a specific package version. Used by the LLM to self-correct after a hallucination is detected — finds the correct method name for the version you actually have installed.
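Both registries expose public, version-pinned JSON endpoints, so the lookup reduces to building the right URL. A sketch of that piece (the fetch and response parsing in the real registry clients are omitted):

```typescript
// Sketch of registry URL construction for version-pinned metadata lookups.
// The endpoints are the public npm and PyPI JSON APIs.

function registryUrl(registry: "npm" | "pypi", name: string, version: string): string {
  if (registry === "npm") {
    // npm serves per-version metadata at /<name>/<version>.
    return `https://registry.npmjs.org/${name}/${version}`;
  }
  // PyPI's JSON API: /pypi/<name>/<version>/json.
  return `https://pypi.org/pypi/${name}/${version}/json`;
}
```

A caller would fetch this URL and inspect the returned metadata (note that scoped npm names like @prisma/client need URL-encoding, which this sketch skips).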


Architecture

ctxai/
├── src/
│   ├── index.ts                      # MCP server — registers all 4 tools
│   ├── formatters.ts                 # Converts typed results → readable MCP strings
│   ├── tools/
│   │   ├── getProjectContext.ts      # Tool 1: scan project + build fingerprint
│   │   ├── validateSuggestion.ts     # Tool 2: 3-layer hallucination validator
│   │   └── getPackageDocs.ts         # Tool 4: live registry metadata
│   ├── utils/
│   │   ├── checkPackageSafety.ts     # Tool 3: phantom/typosquat/trust checker
│   │   ├── registryClient.ts         # Typed npm + PyPI registry clients
│   │   ├── typosquatDetector.ts      # Levenshtein-based typosquat detection
│   │   ├── fuzzy.ts                  # Closest-match suggestions
│   │   ├── npmRegistry.ts            # npm metadata client (used by getPackageDocs)
│   │   └── pypiRegistry.ts           # PyPI metadata client (used by getPackageDocs)
│   ├── parser/
│   │   ├── responseParser.ts         # Extracts imports + method calls from code
│   │   └── fingerprintBuilder.ts     # Formats detected packages into fingerprint
│   ├── detectors/
│   │   ├── index.ts                  # Orchestrates Node + Python detection
│   │   ├── node.ts                   # Reads package.json + TypeScript API surface
│   │   └── python.ts                 # Reads requirements.txt + Python API surface
│   └── cache/
│       └── sessionCache.ts           # In-memory TTL cache (5 min)
└── benchmark/
    ├── run.ts                        # Benchmark runner with hallucination metrics
    └── prompts/                      # 28 test cases (JSON)

Installation

Prerequisites

  • Node.js 18+
  • TypeScript 5+
  • Python 3 (optional, for Python project validation)

Build

cd ctxai
npm install
npm run build

Run

npm start
# or in dev mode (no build step)
npm run dev

The server communicates over stdio, which is the standard MCP transport.


MCP client configuration

Kiro

Add to .kiro/settings/mcp.json in your workspace:

{
  "mcpServers": {
    "ctxai": {
      "command": "node",
      "args": ["/absolute/path/to/ctxai/build/index.js"],
      "disabled": false,
      "autoApprove": [
        "get_project_context",
        "validate_suggestion",
        "check_package_safety",
        "get_package_docs"
      ]
    }
  }
}

Windows + fnm/nvm users: node may not resolve when Kiro launches the server outside your shell session. Use the full path to node.exe instead:

"command": "C:\\Users\\YOU\\AppData\\Roaming\\fnm\\node-versions\\v20.0.0\\installation\\node.exe"

Find your path with: Get-Command node | Select-Object -ExpandProperty Source (PowerShell)

Claude Desktop

Edit ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows):

{
  "mcpServers": {
    "ctxai": {
      "command": "node",
      "args": ["/absolute/path/to/ctxai/build/index.js"]
    }
  }
}

Cursor / VS Code (Cline)

Edit your cline_mcp_settings.json:

{
  "mcpServers": {
    "ctxai": {
      "command": "node",
      "args": ["/absolute/path/to/ctxai/build/index.js"],
      "disabled": false,
      "alwaysAllow": []
    }
  }
}

After adding the config, reload/reconnect MCP servers from the command palette. You should see ctxai with 4 tools listed.


Testing the integration

1. Smoke test — does the server start?

npm run build
node build/index.js
# Expected: ctxai MCP server v0.1.0 running on stdio

2. Manual tool calls via stdio

Test each tool by piping JSON directly to the server:

Tool 1 — scan your project:

echo '{"jsonrpc":"2.0","id":1,"method":"tools/call","params":{"name":"get_project_context","arguments":{"path":"/your/project/path"}}}' \
  | node build/index.js

Tool 2 — catch a missing package:

echo '{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"validate_suggestion","arguments":{"code":"import helmet from \"helmet\";","contextFingerprint":"node: [email protected]"}}}' \
  | node build/index.js
# Expected: 🔴 [Missing package] 'helmet' is not listed in your project dependencies.

Tool 3 — catch a typosquat:

echo '{"jsonrpc":"2.0","id":3,"method":"tools/call","params":{"name":"check_package_safety","arguments":{"code":"import expres from \"expres\";","contextFingerprint":"node: [email protected]"}}}' \
  | node build/index.js
# Expected: 🚨 [Likely typosquat] 'expres' is suspiciously similar to 'express'

Tool 4 — fetch live docs:

echo '{"jsonrpc":"2.0","id":4,"method":"tools/call","params":{"name":"get_package_docs","arguments":{"packageName":"express","version":"4.18.2","registry":"npm"}}}' \
  | node build/index.js

3. Run the benchmark

npm run benchmark

Expected output:

════════════════════════════════════════════════════════════
  HALLUCINATION REDUCTION METRICS
════════════════════════════════════════════════════════════

  Benchmark accuracy     100.0%  (28/28 tests match expected)

  Detection rate         100.0%
  False positives            0
  Precision              100.0%
  F1 Score               100.0%

Benchmark

ctxai ships with 28 test cases covering every validation scenario. Results include hallucination reduction metrics — detection rate, precision, and F1 score — so you can measure the impact of any changes.

What the benchmark covers

Category              Prompts   What's tested
Happy path                 11   Valid code against correct fingerprint — zero false positives
Node packages               6   Single and multiple missing npm imports
Python packages             5   Missing pip packages, hyphen/underscore normalisation, pip name correction
Method hallucination        2   Methods that don't exist in the installed version (via mock API surface)
Multi-package               2   Stress tests with 3–5 hallucinated packages at once
Edge cases                  2   Empty fingerprint, prose + code blocks

Adding a test case

Create a JSON file in benchmark/prompts/:

{
  "name": "My Test Case",
  "projectFingerprint": "node: [email protected]",
  "aiGeneratedCode": "import helmet from 'helmet';\nconst app = require('express')();",
  "expectedViolations": 1
}

For method hallucination tests, inject a mock API surface so the test doesn't require real node_modules:

{
  "name": "Prisma - Method Hallucination",
  "projectFingerprint": "node: @prisma/[email protected]",
  "aiGeneratedCode": "const prisma = new PrismaClient();\nawait prisma.user.findFirstOrThrow({ where: { id: 1 } });",
  "apiSurfaceOverrides": {
    "@prisma/client": ["findFirst", "findMany", "create", "update", "delete"]
  },
  "expectedViolations": 1,
  "_note": "findFirstOrThrow was added in Prisma v4 — should be caught on v3"
}

Fields:

Field                 Required   Description
name                  yes        Human-readable test name
projectFingerprint    yes        Simulated installed packages (source: name@version per line)
aiGeneratedCode       yes        The AI-generated code to validate
expectedViolations    yes        Exact number of warnings expected
apiSurfaceOverrides   no         Mock API surface for method checks (bypasses node_modules)
_note                 no         Internal documentation, ignored by the runner
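With an injected API surface, the layer-2 method check reduces to the same kind of membership test as layer 1. A hedged sketch of how apiSurfaceOverrides short-circuits the node_modules lookup (the real validator first resolves each method call to its package):

```typescript
// Sketch of the layer-2 method check against an injected API surface,
// as the apiSurfaceOverrides field does in benchmark prompts. Illustrative only.

function hallucinatedMethods(
  calledMethods: string[],
  apiSurface: string[],
): string[] {
  const surface = new Set(apiSurface);
  return calledMethods.filter((m) => !surface.has(m)); // HALLUCINATED_METHOD candidates
}

// With the mocked Prisma v3 surface, findFirstOrThrow is flagged:
const prismaV3 = ["findFirst", "findMany", "create", "update", "delete"];
const flagged = hallucinatedMethods(["findMany", "findFirstOrThrow"], prismaV3);
// flagged → ["findFirstOrThrow"]
```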

Warning and issue types

validate_suggestion warnings

interface ValidationWarning {
  type: "MISSING_PACKAGE" | "HALLUCINATED_METHOD" | "UNKNOWN_PACKAGE";
  severity: "error" | "warning" | "info";
  message: string;           // Human-readable description
  suggestion: string;        // Correct install command or method name
  offender: string;          // The exact identifier that triggered the warning
  packageName?: string;      // Package context (HALLUCINATED_METHOD only)
  installedVersion?: string; // Installed version (HALLUCINATED_METHOD only)
}
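The readable output shown under Tool 2 is produced by rendering these structures. A minimal sketch of how one warning might be formatted — illustrative, not the actual formatters.ts logic:

```typescript
// Sketch of rendering a ValidationWarning into readable output. Illustrative only.

interface ValidationWarning {
  type: "MISSING_PACKAGE" | "HALLUCINATED_METHOD" | "UNKNOWN_PACKAGE";
  severity: "error" | "warning" | "info";
  message: string;
  suggestion: string;
  offender: string;
}

function formatWarning(w: ValidationWarning): string {
  const icon = w.severity === "error" ? "🔴" : w.severity === "warning" ? "🟡" : "ℹ️";
  const label =
    w.type === "MISSING_PACKAGE" ? "Missing package"
    : w.type === "HALLUCINATED_METHOD" ? "Hallucinated method"
    : "Unknown package";
  return `${icon} [${label}] ${w.message}\n   → ${w.suggestion}`;
}
```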

check_package_safety issues

interface SafetyIssue {
  type: "PHANTOM_PACKAGE" | "LIKELY_TYPOSQUAT" | "LIKELY_CONFLATION"
      | "LOW_TRUST_PACKAGE" | "SECURITY_HOLD";
  severity: "critical" | "warning" | "info";
  packageName: string;
  ecosystem: "node" | "python";
  message: string;
  suggestion: string;
  meta?: {
    similarTo?: string;      // The popular package it resembles
    editDistance?: number;   // Levenshtein distance to the popular package
    ageInDays?: number;      // How old the package is
    versionCount?: number;   // How many versions it has
    hasRepository?: boolean; // Whether it has a repo link
  }
}

Supported languages and ecosystems

Language                  Package file                       Registry   Method validation
JavaScript / TypeScript   package.json                       npm        Via .d.ts type definitions
Python                    requirements.txt, pyproject.toml   PyPI       Via dir() introspection

Python import → pip name mapping

ctxai knows that Python import names often differ from pip package names and generates correct install commands:

Import                            pip install
from rest_framework import ...    pip install djangorestframework
from PIL import Image             pip install Pillow
import cv2                        pip install opencv-python
from sklearn import ...           pip install scikit-learn
import jwt                        pip install PyJWT
import yaml                       pip install PyYAML
from bs4 import ...               pip install beautifulsoup4

80+ mappings are built in. See PYTHON_IMPORT_TO_PIP in src/tools/validateSuggestion.ts for the full list.
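The lookup itself is a table hit with a normalisation fallback. A sketch of the idea with a heavily abbreviated map (the hyphen/underscore fallback is an assumption of this sketch, hedged accordingly):

```typescript
// Sketch of the import-name → pip-name lookup. Abbreviated map — the real
// PYTHON_IMPORT_TO_PIP table has 80+ entries. The underscore→hyphen fallback
// is illustrative, not a guarantee about the shipped behaviour.

const IMPORT_TO_PIP: Record<string, string> = {
  rest_framework: "djangorestframework",
  PIL: "Pillow",
  cv2: "opencv-python",
  sklearn: "scikit-learn",
  jwt: "PyJWT",
  yaml: "PyYAML",
  bs4: "beautifulsoup4",
};

function pipInstallCommand(importName: string): string {
  // Fall back to the import name itself; pip package names conventionally use
  // hyphens where Python import names use underscores.
  const pkg = IMPORT_TO_PIP[importName] ?? importName.replace(/_/g, "-");
  return `pip install ${pkg}`;
}
```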


Design decisions

Why four tools instead of one? Each tool has a distinct trigger condition. get_project_context runs once per session. validate_suggestion runs on every code response. check_package_safety runs only when new packages are suggested. get_package_docs runs on-demand for self-correction. Splitting them lets the LLM call only what's needed.

Why MCP? MCP is the emerging standard for giving LLMs structured access to local tools. Any MCP-compatible client gets ctxai for free without custom integrations.

Why a fingerprint string instead of JSON? The fingerprint format (node: [email protected]) is compact, human-readable, and token-efficient. It fits in the LLM context without wasting tokens on JSON syntax.

Why not just use the LLM's training data? Training data is frozen at a cutoff date and doesn't know what's installed in your project. ctxai reads your actual node_modules and requirements.txt at runtime.

Why prefer false negatives over false positives? If ctxai can't determine whether a method exists (no type definitions, no stubs), it stays silent rather than warning. A missed hallucination is less disruptive than a false alarm on valid code.


Contributing

The benchmark is the best place to start. If you find a case where ctxai produces a false positive or misses a hallucination:

  1. Add a prompt JSON to benchmark/prompts/ that reproduces the issue
  2. Set expectedViolations to what the correct behaviour should be
  3. Run npm run benchmark — if it fails, the bug is confirmed
  4. Fix the validator and verify the benchmark goes green

License

MIT
