Scrapling Fetch MCP
Uses Scrapling to fetch HTML and Markdown from websites that have anti-automation measures.
scrapling-fetch-mcp
An MCP server that helps AI assistants access text content from websites that implement bot detection, bridging the gap between what you can see in your browser and what the AI can access.
Intended Use
This tool is optimized for low-volume retrieval of documentation and reference materials (text/HTML only) from websites that implement bot detection. It has not been designed or tested for general-purpose site scraping or data harvesting.
Note: This project was developed in collaboration with Claude Sonnet 3.7 and 4.5, using LLM Context.
Installation
Requirements
- Python 3.10+
- uv package manager
Install
# Install scrapling-fetch-mcp
uv tool install scrapling-fetch-mcp
# Install browser binaries (REQUIRED - large downloads)
uvx --from scrapling-fetch-mcp scrapling install
Important: The browser installation downloads hundreds of MB of data and must complete before first use. If the MCP server times out on first use, the browsers may still be installing in the background. Wait a few minutes and try again.
Setup with Claude Desktop
Add this configuration to your Claude Desktop MCP settings:
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json
{
  "mcpServers": {
    "scrapling-fetch": {
      "command": "uvx",
      "args": ["scrapling-fetch-mcp"]
    }
  }
}
After updating the config, restart Claude Desktop.
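If the config file already lists other MCP servers, the new entry has to be merged in rather than pasted over them. The following is a minimal sketch of that merge, not part of this project: the helper names `add_server` and `update_config_file` are hypothetical, and the script assumes the file is either absent or contains valid JSON.

```python
import json
from pathlib import Path


def add_server(config: dict, name: str = "scrapling-fetch") -> dict:
    """Return a copy of config with the scrapling-fetch entry added.

    Existing entries under "mcpServers" are preserved.
    """
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = {"command": "uvx", "args": ["scrapling-fetch-mcp"]}
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path) -> None:
    """Read the config file (if it exists), merge the entry, and write it back.

    Assumption: the file, when present, holds valid JSON.
    """
    existing = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    path.write_text(json.dumps(add_server(existing), indent=2), encoding="utf-8")
```

Calling `update_config_file` with the path for your platform (listed above) leaves any other configured servers untouched. Restart Claude Desktop afterwards, as with a manual edit.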
What It Does
This MCP server provides two tools that Claude can use automatically when you ask it to fetch web content:
- Page fetching: Retrieves complete web pages with support for pagination
- Pattern extraction: Finds and extracts specific content using regex patterns
The AI decides which tool to use based on your request. You just ask naturally:
"Can you fetch the docs at https://example.com/api"
"Find all mentions of 'authentication' on that page"
"Get me the installation instructions from their homepage"
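To give a feel for what pattern extraction means in practice, here is a rough sketch of regex matching with surrounding context. It is an illustration only and does not reproduce the server's implementation: the function name `extract_matches` and the `context_chars` parameter are assumptions made for this example.

```python
import re


def extract_matches(text: str, pattern: str, context_chars: int = 40) -> list[str]:
    """Return each regex match together with some surrounding characters.

    Hypothetical helper: matching is case-insensitive, and the context
    window is clipped at the start and end of the text.
    """
    results = []
    for match in re.finditer(pattern, text, flags=re.IGNORECASE):
        start = max(0, match.start() - context_chars)
        end = min(len(text), match.end() + context_chars)
        results.append(text[start:end])
    return results
```

A request such as "Find all mentions of 'authentication' on that page" corresponds roughly to calling this with the pattern `authentication` on the fetched page text.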
Protection Modes
The tools support three levels of bot detection bypass:
- basic: Fast (1-2s), works for most sites
- stealth: Moderate (3-8s), handles more protection
- max-stealth: Maximum (10+s), for heavily protected sites
Claude automatically starts with basic mode and escalates if needed.
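The escalation idea can be sketched as a simple loop over the three modes. This is an illustrative sketch under stated assumptions, not the server's code: `fetch_with_escalation` is a hypothetical name, and the `fetch` callable stands in for whatever actually retrieves the page, returning `None` (or an empty string) when a mode is blocked.

```python
from typing import Callable, Optional

# Mode names as listed above, ordered from fastest to most thorough.
MODES = ["basic", "stealth", "max-stealth"]


def fetch_with_escalation(
    url: str,
    fetch: Callable[[str, str], Optional[str]],
) -> tuple[str, str]:
    """Try each mode in order and return (mode, content) for the first success.

    `fetch` is a hypothetical callable taking (url, mode).
    """
    for mode in MODES:
        content = fetch(url, mode)
        if content:
            return mode, content
    raise RuntimeError(f"All protection modes failed for {url}")
```

Starting with the fastest mode keeps most requests quick and reserves the slower modes for sites that need them.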
Tips for Best Results
- Just ask naturally - Claude handles the technical details
- For large pages, Claude can page through content automatically
- For specific searches, mention what you're looking for and Claude will use pattern matching
- The metadata returned helps Claude decide whether to page or search
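As a rough illustration of how paging with metadata can work, the sketch below slices content into fixed-size pages and reports where the next page starts. The field names (`total_length`, `next_index`) and function names are assumptions for this example; the metadata the server actually returns may differ.

```python
from typing import Optional


def get_page(content: str, start_index: int = 0, max_length: int = 5000) -> dict:
    """Return one page of content plus metadata about what remains.

    Hypothetical shape: `next_index` is None once the end is reached.
    """
    if max_length <= 0:
        raise ValueError("max_length must be positive")
    end = min(len(content), start_index + max_length)
    return {
        "text": content[start_index:end],
        "total_length": len(content),
        "next_index": end if end < len(content) else None,
    }


def read_all(content: str, max_length: int = 5000) -> list[str]:
    """Follow `next_index` until the whole content has been read."""
    pages = []
    index: Optional[int] = 0
    while index is not None:
        page = get_page(content, index, max_length)
        pages.append(page["text"])
        index = page["next_index"]
    return pages
```

With metadata like this, a caller can compare `total_length` to the page size and choose between reading page by page and searching for a specific pattern.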
Limitations
- Designed for text content only (documentation, articles, references)
- Not for high-volume scraping or data harvesting
- May not work with sites requiring authentication
- Performance varies by site complexity and protection level
Built with Scrapling for web scraping with bot detection bypass.
License
Apache 2.0
Related Servers
Bright Data
Sponsor. Discover, extract, and interact with the web - one interface powering automated access across the public internet.
Crawl4AI MCP Server
An MCP server for advanced web crawling, content extraction, and AI-powered analysis using the crawl4ai library.
ScreenshotOne
Render website screenshots with ScreenshotOne
Bilibili Comments
Fetch Bilibili video comments in bulk, including nested replies. Requires a Bilibili cookie for authentication.
Browser Use
An AI-driven browser automation server for natural language control and web research, with CLI access.
YouTube Transcript
A zero-setup server to extract transcripts from YouTube videos on any platform.
yt-dlp
Download video and audio content from various websites like YouTube, Facebook, and TikTok using yt-dlp.
medical-mcp
An MCP server that provides comprehensive medical information by querying multiple authoritative medical APIs including FDA, WHO, PubMed, Google Scholar, and RxNorm.
Intelligent Crawl4AI Agent
An AI-powered web scraping system for high-volume automation and advanced data extraction strategies.
CarDeals-MCP
A Model Context Protocol (MCP) service that indexes and queries car-deal contexts - fast, flexible search for vehicle listings and marketplace data.
Website Snapshot
An MCP server that provides comprehensive website snapshot capabilities using Playwright. This server enables LLMs to capture and analyze web pages through structured accessibility snapshots, network monitoring, and console message collection.