Intercept

Give your AI the ability to read the web. Fetches URLs as clean markdown with 9 fallback strategies. Handles tweets, YouTube, arXiv, PDFs, and regular pages.

intercept-mcp

Give your AI the ability to read the web. One command, no API keys required.

Without it, your AI hits a URL and gets a 403, a wall, or a wall of raw HTML. With intercept, it almost always gets the content — clean markdown, ready to use.

Handles tweets, YouTube videos, arXiv papers, PDFs, and regular web pages. If the first strategy fails, it tries up to 8 more before giving up.

Works with any MCP client: Claude Code, Claude Desktop, Codex, Cursor, Windsurf, Cline, and more.

Install

Claude Code

claude mcp add intercept -s user -- npx -y intercept-mcp

Codex

codex mcp add intercept -- npx -y intercept-mcp

Cursor

Settings → MCP → Add Server:

{
  "mcpServers": {
    "intercept": {
      "command": "npx",
      "args": ["-y", "intercept-mcp"]
    }
  }
}

Windsurf

Settings → MCP → Add Server → same JSON config as above.

Claude Desktop

Add to your claude_desktop_config.json:

{
  "mcpServers": {
    "intercept": {
      "command": "npx",
      "args": ["-y", "intercept-mcp"]
    }
  }
}

Other MCP clients

Any client that supports stdio MCP servers can run npx -y intercept-mcp.

No API keys needed for the fetch tool.

How it works

URLs are processed in three stages:

1. Site-specific handlers

Known URL patterns are routed to dedicated handlers before the fallback pipeline:

PatternHandlerWhat you get
twitter.com/*/status/*, x.com/*/status/*Twitter/XTweet text, author, media, engagement stats
youtube.com/watch?v=*, youtu.be/*YouTubeTitle, channel, duration, views, description
arxiv.org/abs/*, arxiv.org/pdf/*arXivPaper metadata, authors, abstract, categories
*.pdfPDFExtracted text (text-layer PDFs only)

2. Fallback pipeline

If no handler matches (or the handler returns nothing), the URL enters the multi-tier pipeline:

TierFetcherStrategy
1Jina ReaderClean text extraction service
2Wayback + CodetabsArchived version + CORS proxy (run in parallel)
3Raw fetchDirect GET with browser headers
4RSS, CrossRef, Semantic Scholar, HN, RedditMetadata / discussion fallbacks
5OG MetaOpen Graph tags (guaranteed fallback)

Tier 2 fetchers run in parallel. When both succeed, the higher quality result wins. All other tiers run sequentially.

3. Caching

Results are cached in-memory for the session (max 100 entries, LRU eviction). Failed URLs are also cached to prevent re-attempting known-dead URLs.

Tools

fetch

Fetch a URL and return its content as clean markdown.

  • url (string, required) — URL to fetch
  • maxTier (number, optional, 1-5) — Stop at this tier for speed-sensitive cases

search

Search the web and return results.

  • query (string, required) — Search query
  • count (number, optional, 1-20, default 5) — Number of results

Uses Brave Search API if BRAVE_API_KEY is set, otherwise falls back to SearXNG.

Environment variables

VariableRequiredDescription
BRAVE_API_KEYNoBrave Search API key (free tier: 2,000 queries/month)
SEARXNG_URLNoSelf-hosted SearXNG instance URL

The search tool needs at least one backend configured. Public SearXNG instances are rate-limited and unreliable in practice. A free Brave Search API key (2,000 queries/month) is the realistic zero-cost option. Set SEARXNG_URL only if you run your own instance.

The fetch tool works without any keys.

URL normalization

Incoming URLs are automatically cleaned:

  • Strips 60+ tracking params (UTM, click IDs, analytics, A/B testing, etc.)
  • Removes hash fragments
  • Upgrades to HTTPS
  • Cleans AMP artifacts
  • Preserves functional params (ref, format, page, offset, limit)

Content quality detection

Each fetcher result is scored for quality. Automatic fail on:

  • CAPTCHA / Cloudflare challenges
  • Login walls
  • HTTP error pages in body
  • Content under 200 characters

Requirements

  • Node.js >= 18
  • No API keys required for basic use (fetch only)

Related Servers