Read Website Fast
Fast, token-efficient web content extraction that converts websites to clean Markdown. Features Mozilla Readability, smart caching, polite crawling with robots.txt support, and concurrent fetching with minimal dependencies.
@just-every/mcp-read-website-fast
Fast, token-efficient web content extraction for AI agents - converts websites to clean Markdown.
Overview
Existing MCP web crawlers are slow and consume large quantities of tokens. This slows down development and produces incomplete results, because the LLM has to parse entire web pages.
This MCP package fetches web pages locally, strips noise, and converts content to clean Markdown while preserving links. It is designed for Claude Code, IDEs, and LLM pipelines, with a minimal token footprint and minimal dependencies.
Note: This package now uses @just-every/crawl for its core crawling and markdown conversion functionality.
Features
- Fast startup using official MCP SDK with lazy loading for optimal performance
- Content extraction using Mozilla Readability (same as Firefox Reader View)
- HTML to Markdown conversion with Turndown + GFM support
- Smart caching with SHA-256 hashed URLs
- Polite crawling with robots.txt support and rate limiting
- Concurrent fetching with configurable depth crawling
- Stream-first design for low memory usage
- Link preservation for knowledge graphs
- Optional chunking for downstream processing
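To illustrate the "SHA-256 hashed URLs" idea from the list above, here is a minimal sketch of how a URL could be mapped to a cache file path. The helper name `cachePathFor`, the `.json` extension, and the normalization step are assumptions made for this example; the package's actual cache layout may differ.

```typescript
import { createHash } from "node:crypto";
import { join } from "node:path";

// Hypothetical helper (not the package's API): derive a cache file path
// from the SHA-256 digest of a URL.
function cachePathFor(url: string, cacheDir = ".cache"): string {
  // Parse with the WHATWG URL class so equivalent spellings (for example an
  // upper-case host) share one key. The fragment is dropped because it is
  // never sent to the server.
  const normalized = new URL(url);
  normalized.hash = "";
  const digest = createHash("sha256")
    .update(normalized.toString())
    .digest("hex"); // 64 hex characters
  return join(cacheDir, `${digest}.json`); // ".json" is an assumed extension
}

console.log(cachePathFor("https://example.com/article"));
```

Hashing keeps file names at a fixed length and free of characters that are unsafe in paths, whatever the URL looks like.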
Installation
Claude Code
claude mcp add read-website-fast -s user -- npx -y @just-every/mcp-read-website-fast
VS Code
code --add-mcp '{"name":"read-website-fast","command":"npx","args":["-y","@just-every/mcp-read-website-fast"]}'
Cursor
cursor://anysphere.cursor-deeplink/mcp/install?name=read-website-fast&config=eyJyZWFkLXdlYnNpdGUtZmFzdCI6eyJjb21tYW5kIjoibnB4IiwiYXJncyI6WyIteSIsIkBqdXN0LWV2ZXJ5L21jcC1yZWFkLXdlYnNpdGUtZmFzdCJdfX0=
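The `config` query parameter in the Cursor link above appears to be Base64-encoded JSON. The sketch below decodes it so the server definition can be inspected before installing; `decodeConfig` and `encodeConfig` are hypothetical helper names, and the script assumes a Node.js runtime (for the global `Buffer`).

```typescript
// Value of the `config` parameter, copied from the Cursor link.
const encoded =
  "eyJyZWFkLXdlYnNpdGUtZmFzdCI6eyJjb21tYW5kIjoibnB4IiwiYXJncyI6WyIteSIsIkBqdXN0LWV2ZXJ5L21jcC1yZWFkLXdlYnNpdGUtZmFzdCJdfX0=";

// Base64 -> UTF-8 JSON text -> object.
function decodeConfig(b64: string): unknown {
  return JSON.parse(Buffer.from(b64, "base64").toString("utf8"));
}

// Inverse direction: object -> JSON text -> Base64.
function encodeConfig(config: unknown): string {
  return Buffer.from(JSON.stringify(config), "utf8").toString("base64");
}

// Pretty-print the decoded server definition.
console.log(JSON.stringify(decodeConfig(encoded), null, 2));
```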
JetBrains IDEs
Settings → Tools → AI Assistant → Model Context Protocol (MCP) → Add
Choose “As JSON” and paste:
{"command":"npx","args":["-y","@just-every/mcp-read-website-fast"]}
Or, in the chat window, type /add and fill in the same JSON—both paths land the server in a single step. 
Raw JSON (works in any MCP client)
{
"mcpServers": {
"read-website-fast": {
"command": "npx",
"args": ["-y", "@just-every/mcp-read-website-fast"]
}
}
}
Drop this into your client’s mcp.json (e.g. .vscode/mcp.json, ~/.cursor/mcp.json, or .mcp.json for Claude).
Available Tools
read_website - Fetches a webpage and converts it to clean Markdown
- Parameters:
  - url (required): The HTTP/HTTPS URL to fetch
  - pages (optional): Maximum number of pages to crawl (default: 1, max: 100)
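For clients that speak MCP directly, a call to read_website travels as a JSON-RPC `tools/call` request. The sketch below builds such a request and checks the arguments against the limits documented above. The helper names `validateArgs` and `buildToolCall` are hypothetical, and treating 1 as the lower bound for `pages` is an assumption.

```typescript
interface ReadWebsiteArgs {
  url: string;    // required: HTTP/HTTPS URL to fetch
  pages?: number; // optional: default 1, max 100
}

// Hypothetical client-side check mirroring the documented limits.
function validateArgs(args: ReadWebsiteArgs): Required<ReadWebsiteArgs> {
  const protocol = new URL(args.url).protocol; // throws on a malformed URL
  if (protocol !== "http:" && protocol !== "https:") {
    throw new Error(`Unsupported protocol: ${protocol}`);
  }
  const pages = args.pages ?? 1;
  // The lower bound of 1 is an assumption; only the maximum is documented.
  if (!Number.isInteger(pages) || pages < 1 || pages > 100) {
    throw new Error("pages must be an integer between 1 and 100");
  }
  return { url: args.url, pages };
}

// Build the JSON-RPC 2.0 envelope that MCP uses for tool invocations.
function buildToolCall(id: number, args: ReadWebsiteArgs) {
  return {
    jsonrpc: "2.0" as const,
    id,
    method: "tools/call" as const,
    params: { name: "read_website", arguments: validateArgs(args) },
  };
}

console.log(JSON.stringify(buildToolCall(1, { url: "https://example.com/article" }), null, 2));
```

In practice an MCP client library sends this envelope for you; the sketch only shows the shape of the payload.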
Available Resources
- read-website-fast://status - Get cache statistics
- read-website-fast://clear-cache - Clear the cache directory
Development Usage
Install
npm install
npm run build
Single page fetch
npm run dev fetch https://example.com/article
Crawl with depth
npm run dev fetch https://example.com --depth 2 --concurrency 5
Output formats
# Markdown only (default)
npm run dev fetch https://example.com
# JSON output with metadata
npm run dev fetch https://example.com --output json
# Both URL and markdown
npm run dev fetch https://example.com --output both
CLI Options
- -p, --pages <number> - Maximum number of pages to crawl (default: 1)
- -c, --concurrency <number> - Max concurrent requests (default: 3)
- --no-robots - Ignore robots.txt
- --all-origins - Allow cross-origin crawling
- -u, --user-agent <string> - Custom user agent
- --cache-dir <path> - Cache directory (default: .cache)
- -t, --timeout <ms> - Request timeout in milliseconds (default: 30000)
- -o, --output <format> - Output format: json, markdown, or both (default: markdown)
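When driving the CLI from a script, it can help to build the argument list from a typed object. The sketch below maps the options listed above to their long flags. `FetchOptions` and `buildFetchArgs` are hypothetical names for this example and are not exported by the package.

```typescript
type OutputFormat = "json" | "markdown" | "both";

// Hypothetical options object mirroring the documented CLI flags.
interface FetchOptions {
  pages?: number;       // --pages
  concurrency?: number; // --concurrency
  robots?: boolean;     // false adds --no-robots
  allOrigins?: boolean; // true adds --all-origins
  userAgent?: string;   // --user-agent
  cacheDir?: string;    // --cache-dir
  timeout?: number;     // --timeout (milliseconds)
  output?: OutputFormat; // --output
}

// Translate the options into an argv array for the `fetch` command.
// Options left undefined are omitted so the CLI defaults apply.
function buildFetchArgs(url: string, opts: FetchOptions = {}): string[] {
  const args = ["fetch", url];
  if (opts.pages !== undefined) args.push("--pages", String(opts.pages));
  if (opts.concurrency !== undefined) args.push("--concurrency", String(opts.concurrency));
  if (opts.robots === false) args.push("--no-robots");
  if (opts.allOrigins) args.push("--all-origins");
  if (opts.userAgent !== undefined) args.push("--user-agent", opts.userAgent);
  if (opts.cacheDir !== undefined) args.push("--cache-dir", opts.cacheDir);
  if (opts.timeout !== undefined) args.push("--timeout", String(opts.timeout));
  if (opts.output !== undefined) args.push("--output", opts.output);
  return args;
}

console.log(buildFetchArgs("https://example.com", { pages: 5, output: "json" }).join(" "));
```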
Clear cache
npm run dev clear-cache
Auto-Restart Feature
The MCP server includes automatic restart capability by default for improved reliability:
- Automatically restarts the server if it crashes
- Handles unhandled exceptions and promise rejections
- Implements exponential backoff (max 10 attempts in 1 minute)
- Logs all restart attempts for monitoring
- Gracefully handles shutdown signals (SIGINT, SIGTERM)
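The backoff behaviour in the list above can be sketched as a small policy object. This illustrates the general technique and is not the code in serve-restart.ts: the 1-second base delay, the 30-second cap, and the name `createRestartPolicy` are assumptions, while the limit of 10 attempts per minute comes from the list.

```typescript
interface RestartPolicyOptions {
  maxAttempts?: number; // restarts allowed inside the window
  windowMs?: number;    // sliding window length
  baseDelayMs?: number; // assumed; not stated in this README
  maxDelayMs?: number;  // assumed cap on the delay
}

// Hypothetical policy: exponential backoff with a sliding-window limit.
function createRestartPolicy(options: RestartPolicyOptions = {}) {
  const {
    maxAttempts = 10,
    windowMs = 60_000,
    baseDelayMs = 1_000,
    maxDelayMs = 30_000,
  } = options;
  let attempts: number[] = []; // timestamps (ms) of recent restarts

  return {
    // Returns the delay before the next restart, or null to give up.
    nextDelay(now: number): number | null {
      // Forget restarts that have left the window.
      attempts = attempts.filter((t) => now - t < windowMs);
      if (attempts.length >= maxAttempts) return null;
      // Double the delay for every restart still inside the window.
      const delay = Math.min(baseDelayMs * 2 ** attempts.length, maxDelayMs);
      attempts.push(now);
      return delay;
    },
  };
}

const policy = createRestartPolicy();
console.log(policy.nextDelay(Date.now())); // first restart uses the base delay
```

Passing the current time in as an argument keeps the policy deterministic and easy to test.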
For development/debugging without auto-restart:
# Run directly without restart wrapper
npm run serve:dev
Architecture
mcp/
├── src/
│ ├── crawler/ # URL fetching, queue management, robots.txt
│ ├── parser/ # DOM parsing, Readability, Turndown conversion
│ ├── cache/ # Disk-based caching with SHA-256 keys
│ ├── utils/ # Logger, chunker utilities
│ ├── index.ts # CLI entry point
│ ├── serve.ts # MCP server entry point
│ └── serve-restart.ts # Auto-restart wrapper
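As an illustration of the robots.txt handling that the crawler/ directory is described as covering, here is a deliberately simplified sketch of a Disallow check. It is not the package's implementation: it merges the wildcard group with any matching named group, and it ignores Allow rules and wildcard patterns, which a full parser would handle. The function names `parseDisallows` and `isAllowed` are hypothetical.

```typescript
// Collect the Disallow path prefixes that apply to `userAgent` (or to "*").
function parseDisallows(robotsTxt: string, userAgent: string): string[] {
  const rules: string[] = [];
  let applies = false;      // does the current group apply to us?
  let inAgentLines = false; // still reading consecutive User-agent lines?
  for (const raw of robotsTxt.split(/\r?\n/)) {
    const line = raw.split("#")[0].trim(); // drop comments
    const sep = line.indexOf(":");
    if (sep === -1) continue; // blank or malformed line
    const field = line.slice(0, sep).trim().toLowerCase();
    const value = line.slice(sep + 1).trim();
    if (field === "user-agent") {
      if (!inAgentLines) applies = false; // a new group starts here
      inAgentLines = true;
      const agent = value.toLowerCase();
      if (agent === "*" || (agent !== "" && userAgent.toLowerCase().includes(agent))) {
        applies = true;
      }
    } else {
      inAgentLines = false;
      // An empty Disallow value means "nothing is disallowed".
      if (field === "disallow" && applies && value !== "") rules.push(value);
    }
  }
  return rules;
}

// A path is allowed unless it starts with one of the disallowed prefixes.
function isAllowed(path: string, disallows: string[]): boolean {
  return !disallows.some((prefix) => path.startsWith(prefix));
}

const sample = "User-agent: *\nDisallow: /private/\n";
console.log(isAllowed("/blog/post", parseDisallows(sample, "read-website-fast")));
```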
Development
# Run in development mode
npm run dev fetch https://example.com
# Build for production
npm run build
# Run tests
npm test
# Type checking
npm run typecheck
# Linting
npm run lint
Contributing
Contributions are welcome! Please:
- Fork the repository
- Create a feature branch
- Add tests for new functionality
- Submit a pull request
Troubleshooting
Cache Issues
npm run dev clear-cache
Timeout Errors
- Increase timeout with -t flag
- Check network connectivity
- Verify URL is accessible
Content Not Extracted
- Some sites block automated access
- Try custom user agent with -u flag
- Check if site requires JavaScript (not supported)
License
MIT