scrape-do-mcp
MCP Server for Scrape.do - Web Scraping & Google Search with anti-bot bypass
scrape-do-mcp
中文文档 | English
An MCP server that wraps Scrape.do's documented APIs in one package: the main scraping API, Google Search API, Amazon Scraper API, Async API, and a Proxy Mode configuration helper.
Official docs: https://scrape.do/documentation/
Coverage
scrape_url: Main Scrape.do API with JS rendering, geo-targeting, session persistence, screenshots, ReturnJSON, browser interactions, cookies, and header forwarding.google_search: Structured Google SERP API withgoogle_domain,location,uule,lr,cr,safe,nfpr,filter, pagination, and optional raw HTML.amazon_product: Amazon PDP endpoint.amazon_offer_listing: Amazon offer listing endpoint.amazon_search: Amazon search/category endpoint.amazon_raw_html: Raw HTML Amazon endpoint with geo-targeting.async_create_job,async_get_job,async_get_task,async_list_jobs,async_cancel_job,async_get_account: Async API coverage with both MCP-friendly aliases and official field names.proxy_mode_config: Builds official Proxy Mode connection details, default parameter strings, and CA certificate references.
Compatibility Notes
scrape_urlsupports both MCP-friendly aliases and official parameter names:render_jsorrendersuper_proxyorsuperscreenshotorscreenShot
google_searchsupports:queryorqcountryorgllanguageorhldomainorgoogle_domainincludeHtmlorinclude_html
async_create_jobaccepts both alias fields liketargets,render,webhookUrland official Async API fields likeTargets,Render,WebhookURL.async_get_job,async_get_task, andasync_cancel_jobaccept bothjobId/taskIdand officialjobID/taskID.async_list_jobsaccepts bothpageSizeand officialpage_size.- For header forwarding in
scrape_url, passheadersplusheader_mode(custom,extra, orforward). - Screenshot responses preserve the official Scrape.do JSON body and also attach MCP image content when screenshots are present.
scrape_urlnow defaults tooutput="raw"to match the official API more closely.scrape_urlincludes response metadata instructuredContent, which helps surfacepureCookies,transparentResponse, and binary responses inside MCP.
Installation
Quick Install
claude mcp add-json scrape-do --scope user '{
"type": "stdio",
"command": "npx",
"args": ["-y", "scrape-do-mcp"],
"env": {
"SCRAPE_DO_TOKEN": "YOUR_TOKEN_HERE"
}
}'
Claude Desktop
Add this to ~/.claude.json:
{
"mcpServers": {
"scrape-do": {
"command": "npx",
"args": ["-y", "scrape-do-mcp"],
"env": {
"SCRAPE_DO_TOKEN": "YOUR_TOKEN_HERE"
}
}
}
}
Get your token at https://app.scrape.do
Available Tools
| Tool | Purpose |
|---|---|
scrape_url | Main Scrape.do scraping API wrapper |
google_search | Structured Google search results |
amazon_product | Amazon PDP structured data |
amazon_offer_listing | Amazon seller offers |
amazon_search | Amazon keyword/category results |
amazon_raw_html | Raw Amazon HTML with geo-targeting |
async_create_job | Create Async API jobs |
async_get_job | Fetch Async job details |
async_get_task | Fetch Async task details |
async_list_jobs | List Async jobs |
async_cancel_job | Cancel Async jobs |
async_get_account | Fetch Async account/concurrency info |
proxy_mode_config | Generate Proxy Mode configuration |
Example Prompts
Scrape https://example.com with render=true and wait for #app.
Search Google for "open source MCP servers" with google_domain=google.co.uk and lr=lang_en.
Get the Amazon PDP for ASIN B0C7BKZ883 in the US with zipcode 10001.
Create an async job for these 20 URLs and give me the job ID.
Development
npm install
npm run build
npm run dev
License
MIT
関連サーバー
Bright Data
スポンサーDiscover, extract, and interact with the web - one interface powering automated access across the public internet.
Plasmate MCP
Agent-native headless browser that converts web pages to structured Semantic Object Model (SOM) JSON -- 4x fewer tokens than raw HTML with lower latency on Claude and GPT-4o.
Horse Racing News
Fetches horse racing news from the thoroughbreddailynews.com RSS feed.
Tech Collector MCP
Collects and summarizes technical articles from sources like Qiita, Dev.to, NewsAPI, and Hacker News using the Gemini API.
Trends MCP
Real-time trend data from Google (Search, Images, News, Shopping), YouTube, TikTok, Reddit, Amazon, Wikipedia, X (Twitter), LinkedIn, Spotify, GitHub, Steam, npm, App Store, news sentiment and web traffic via one MCP connection. Free API key, 20 requests/day, no credit card required.
WebSearch
An advanced web search and content extraction tool powered by the Firecrawl API for web scraping and analysis.
Hacker News
Fetches and parses stories from Hacker News, providing structured data for top, new, ask, show, and job posts.
YouTube Data
Access YouTube video data and transcripts using the YouTube Data API.
anybrowse
Convert any URL to LLM-ready Markdown via real Chrome browsers. 3 tools: scrape, crawl, search. Free via MCP, pay-per-use via x402.
Playwright SSE MCP Server
An MCP server that provides Playwright features for web scraping and browser automation.
MCP Browser Use Secure
A secure MCP server for browser automation with enhanced security features like multi-layered protection and session isolation.