scrape-do-mcp
MCP Server for Scrape.do - Web Scraping & Google Search with anti-bot bypass
An MCP server that wraps Scrape.do's documented APIs in one package: the main scraping API, Google Search API, Amazon Scraper API, Async API, and a Proxy Mode configuration helper.
Official docs: https://scrape.do/documentation/
Coverage
- `scrape_url`: Main Scrape.do API with JS rendering, geo-targeting, session persistence, screenshots, ReturnJSON, browser interactions, cookies, and header forwarding.
- `google_search`: Structured Google SERP API with `google_domain`, `location`, `uule`, `lr`, `cr`, `safe`, `nfpr`, `filter`, pagination, and optional raw HTML.
- `amazon_product`: Amazon PDP endpoint.
- `amazon_offer_listing`: Amazon offer listing endpoint.
- `amazon_search`: Amazon search/category endpoint.
- `amazon_raw_html`: Raw HTML Amazon endpoint with geo-targeting.
- `async_create_job`, `async_get_job`, `async_get_task`, `async_list_jobs`, `async_cancel_job`, `async_get_account`: Async API coverage with both MCP-friendly aliases and official field names.
- `proxy_mode_config`: Builds official Proxy Mode connection details, default parameter strings, and CA certificate references.
Compatibility Notes
- `scrape_url` supports both MCP-friendly aliases and official parameter names:
  - `render_js` or `render`
  - `super_proxy` or `super`
  - `screenshot` or `screenShot`
- `google_search` supports:
  - `query` or `q`
  - `country` or `gl`
  - `language` or `hl`
  - `domain` or `google_domain`
  - `includeHtml` or `include_html`
- `async_create_job` accepts both alias fields like `targets`, `render`, `webhookUrl` and official Async API fields like `Targets`, `Render`, `WebhookURL`.
- `async_get_job`, `async_get_task`, and `async_cancel_job` accept both `jobId`/`taskId` and official `jobID`/`taskID`.
- `async_list_jobs` accepts both `pageSize` and official `page_size`.
- For header forwarding in `scrape_url`, pass `headers` plus `header_mode` (`custom`, `extra`, or `forward`).
- Screenshot responses preserve the official Scrape.do JSON body and also attach MCP image content when screenshots are present.
- `scrape_url` now defaults to `output="raw"` to match the official API more closely.
- `scrape_url` includes response metadata in `structuredContent`, which helps surface `pureCookies`, `transparentResponse`, and binary responses inside MCP.
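The alias pairs above suggest a simple normalization step before a request is sent. The sketch below shows one way such a step could look; it is not taken from the server's source. `SCRAPE_URL_ALIASES`, `normalizeArgs`, and `hasOwn` are hypothetical names, the `url` key is an assumed argument name, and rejecting conflicting values is an assumed behavior.

```typescript
type Args = Record<string, unknown>;

// Hypothetical alias table: keys are the MCP-friendly aliases from the notes
// above, values are the official parameter names they correspond to.
const SCRAPE_URL_ALIASES: Record<string, string> = {
  render_js: "render",
  super_proxy: "super",
  screenshot: "screenShot",
};

// Own-property check that ignores keys inherited from Object.prototype.
function hasOwn(obj: object, key: string): boolean {
  return Object.prototype.hasOwnProperty.call(obj, key);
}

// Rewrites alias keys to their official names. Throws if an alias and its
// official name are both supplied with different values (assumed behavior).
function normalizeArgs(args: Args, aliases: Record<string, string>): Args {
  const out: Args = {};
  for (const key of Object.keys(args)) {
    const official = hasOwn(aliases, key) ? aliases[key] : key;
    if (hasOwn(out, official) && out[official] !== args[key]) {
      throw new Error(`Conflicting values for "${official}"`);
    }
    out[official] = args[key];
  }
  return out;
}

const normalized = normalizeArgs(
  { url: "https://example.com", render_js: true, screenshot: true },
  SCRAPE_URL_ALIASES,
);
console.log(normalized);
```

Keeping the table as data rather than branching per parameter means the same function could serve `google_search` or the Async tools with a different table.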
Installation
Quick Install
claude mcp add-json scrape-do --scope user '{
  "type": "stdio",
  "command": "npx",
  "args": ["-y", "scrape-do-mcp"],
  "env": {
    "SCRAPE_DO_TOKEN": "YOUR_TOKEN_HERE"
  }
}'
Claude Desktop
Add this to `~/.claude.json` (the file Claude Code reads; the Claude Desktop app itself uses `claude_desktop_config.json`):
{
  "mcpServers": {
    "scrape-do": {
      "command": "npx",
      "args": ["-y", "scrape-do-mcp"],
      "env": {
        "SCRAPE_DO_TOKEN": "YOUR_TOKEN_HERE"
      }
    }
  }
}
Get your token at https://app.scrape.do
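If you generate the client config from a script rather than editing it by hand, a small helper can catch the most common mistake, which is leaving the placeholder token in place. This is a minimal sketch: `McpServerEntry` and `buildScrapeDoEntry` are hypothetical names, and `example-token-123` is a stand-in value, not a real token.

```typescript
// Shape of one entry under "mcpServers", mirroring the JSON shown above.
interface McpServerEntry {
  command: string;
  args: string[];
  env: Record<string, string>;
}

// Builds the scrape-do entry and refuses empty or placeholder tokens.
function buildScrapeDoEntry(token: string): McpServerEntry {
  if (token.trim() === "" || token === "YOUR_TOKEN_HERE") {
    throw new Error("Set a real Scrape.do token before writing the config");
  }
  return {
    command: "npx",
    args: ["-y", "scrape-do-mcp"],
    env: { SCRAPE_DO_TOKEN: token },
  };
}

// Stand-in token for illustration only.
const config = {
  mcpServers: { "scrape-do": buildScrapeDoEntry("example-token-123") },
};
console.log(JSON.stringify(config, null, 2));
```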
Available Tools
| Tool | Purpose |
|---|---|
| `scrape_url` | Main Scrape.do scraping API wrapper |
| `google_search` | Structured Google search results |
| `amazon_product` | Amazon PDP structured data |
| `amazon_offer_listing` | Amazon seller offers |
| `amazon_search` | Amazon keyword/category results |
| `amazon_raw_html` | Raw Amazon HTML with geo-targeting |
| `async_create_job` | Create Async API jobs |
| `async_get_job` | Fetch Async job details |
| `async_get_task` | Fetch Async task details |
| `async_list_jobs` | List Async jobs |
| `async_cancel_job` | Cancel Async jobs |
| `async_get_account` | Fetch Async account/concurrency info |
| `proxy_mode_config` | Generate Proxy Mode configuration |
Example Prompts
Scrape https://example.com with render=true and wait for #app.
Search Google for "open source MCP servers" with google_domain=google.co.uk and lr=lang_en.
Get the Amazon PDP for ASIN B0C7BKZ883 in the US with zipcode 10001.
Create an async job for these 20 URLs and give me the job ID.
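Behind a prompt like the Google search example above, an MCP client sends a JSON-RPC `tools/call` request to the server. The sketch below builds such a request for illustration. The `tools/call` envelope follows the MCP specification; the argument names `query`, `google_domain`, and `lr` come from this README, but which arguments a model actually chooses, and the exact input schema the server advertises, are assumptions here. `ToolCallRequest` and `buildToolCall` are hypothetical names.

```typescript
// Minimal shape of an MCP tools/call request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Wraps a tool name and its arguments in the JSON-RPC envelope.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>,
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// One plausible translation of the second example prompt.
const request = buildToolCall(1, "google_search", {
  query: "open source MCP servers",
  google_domain: "google.co.uk",
  lr: "lang_en",
});
console.log(JSON.stringify(request, null, 2));
```

In normal use the client constructs this message for you; seeing it spelled out mainly helps when debugging why a prompt did or did not set a given parameter.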
Development
npm install
npm run build
npm run dev
License
MIT