Intercept
Give your AI the ability to read the web. Fetches URLs as clean markdown with 9 fallback strategies. Handles tweets, YouTube, arXiv, PDFs, and regular pages.
intercept-mcp
Give your AI the ability to read the web. One command, no API keys required.
Without it, your AI hits a URL and gets a 403, a wall, or a wall of raw HTML. With intercept, it almost always gets the content — clean markdown, ready to use.
Handles tweets, YouTube videos, arXiv papers, PDFs, and regular web pages. If the first strategy fails, it tries up to 8 more before giving up.
Works with any MCP client: Claude Code, Claude Desktop, Codex, Cursor, Windsurf, Cline, and more.
Install
Claude Code
claude mcp add intercept -s user -- npx -y intercept-mcp
Codex
codex mcp add intercept -- npx -y intercept-mcp
Cursor
Settings → MCP → Add Server:
{
"mcpServers": {
"intercept": {
"command": "npx",
"args": ["-y", "intercept-mcp"]
}
}
}
Windsurf
Settings → MCP → Add Server → same JSON config as above.
Claude Desktop
Add to your claude_desktop_config.json:
{
"mcpServers": {
"intercept": {
"command": "npx",
"args": ["-y", "intercept-mcp"]
}
}
}
Other MCP clients
Any client that supports stdio MCP servers can run npx -y intercept-mcp.
No API keys needed for the fetch tool.
How it works
URLs are processed in three stages:
1. Site-specific handlers
Known URL patterns are routed to dedicated handlers before the fallback pipeline:
| Pattern | Handler | What you get |
|---|---|---|
twitter.com/*/status/*, x.com/*/status/* | Twitter/X | Tweet text, author, media, engagement stats |
youtube.com/watch?v=*, youtu.be/* | YouTube | Title, channel, duration, views, description |
arxiv.org/abs/*, arxiv.org/pdf/* | arXiv | Paper metadata, authors, abstract, categories |
*.pdf | Extracted text (text-layer PDFs only) |
2. Fallback pipeline
If no handler matches (or the handler returns nothing), the URL enters the multi-tier pipeline:
| Tier | Fetcher | Strategy |
|---|---|---|
| 1 | Jina Reader | Clean text extraction service |
| 2 | Wayback + Codetabs | Archived version + CORS proxy (run in parallel) |
| 3 | Raw fetch | Direct GET with browser headers |
| 4 | RSS, CrossRef, Semantic Scholar, HN, Reddit | Metadata / discussion fallbacks |
| 5 | OG Meta | Open Graph tags (guaranteed fallback) |
Tier 2 fetchers run in parallel. When both succeed, the higher quality result wins. All other tiers run sequentially.
3. Caching
Results are cached in-memory for the session (max 100 entries, LRU eviction). Failed URLs are also cached to prevent re-attempting known-dead URLs.
Tools
fetch
Fetch a URL and return its content as clean markdown.
url(string, required) — URL to fetchmaxTier(number, optional, 1-5) — Stop at this tier for speed-sensitive cases
search
Search the web and return results.
query(string, required) — Search querycount(number, optional, 1-20, default 5) — Number of results
Uses Brave Search API if BRAVE_API_KEY is set, otherwise falls back to SearXNG.
Environment variables
| Variable | Required | Description |
|---|---|---|
BRAVE_API_KEY | No | Brave Search API key (free tier: 2,000 queries/month) |
SEARXNG_URL | No | Self-hosted SearXNG instance URL |
The search tool needs at least one backend configured. Public SearXNG instances are rate-limited and unreliable in practice. A free Brave Search API key (2,000 queries/month) is the realistic zero-cost option. Set SEARXNG_URL only if you run your own instance.
The fetch tool works without any keys.
URL normalization
Incoming URLs are automatically cleaned:
- Strips 60+ tracking params (UTM, click IDs, analytics, A/B testing, etc.)
- Removes hash fragments
- Upgrades to HTTPS
- Cleans AMP artifacts
- Preserves functional params (
ref,format,page,offset,limit)
Content quality detection
Each fetcher result is scored for quality. Automatic fail on:
- CAPTCHA / Cloudflare challenges
- Login walls
- HTTP error pages in body
- Content under 200 characters
Requirements
- Node.js >= 18
- No API keys required for basic use (fetch only)
เซิร์ฟเวอร์ที่เกี่ยวข้อง
Bright Data
ผู้สนับสนุนDiscover, extract, and interact with the web - one interface powering automated access across the public internet.
MCP Query Table
Query financial web tables from sources like iwencai, tdx, and eastmoney using Playwright.
Web Fetch
Fetches and transforms web content, including JavaScript-rendered pages and media files, into various formats.
MCP Server Collector
Discovers and collects MCP servers from the internet.
yt-dlp
Download video and audio from YouTube and other platforms using the yt-dlp tool.
Crawl4AI MCP Server
An MCP server for advanced web crawling, content extraction, and AI-powered analysis using the crawl4ai library.
HTTP Requests
An MCP server for making HTTP requests, enabling LLMs to fetch and process web content.
Real Estate MCP Server
Property search and market analysis from Redfin with neighborhood insights
Claimify
Extracts factual claims from text using the Claimify methodology. Requires an OpenAI API key.
Shufersal MCP Server
Automates shopping on the Shufersal website, enabling LLMs to search for products, create shopping lists, and manage the cart.
Open Crawler MCP Server
A web crawler and content extractor that supports multiple output formats like text, markdown, and JSON.