scrape-do-mcp
MCP Server for Scrape.do - Web Scraping & Google Search with anti-bot bypass
An MCP server that wraps Scrape.do's documented APIs in one package: the main scraping API, Google Search API, Amazon Scraper API, Async API, and a Proxy Mode configuration helper.
Official docs: https://scrape.do/documentation/
Coverage
- `scrape_url`: Main Scrape.do API with JS rendering, geo-targeting, session persistence, screenshots, ReturnJSON, browser interactions, cookies, and header forwarding.
- `google_search`: Structured Google SERP API with `google_domain`, `location`, `uule`, `lr`, `cr`, `safe`, `nfpr`, `filter`, pagination, and optional raw HTML.
- `amazon_product`: Amazon PDP endpoint.
- `amazon_offer_listing`: Amazon offer listing endpoint.
- `amazon_search`: Amazon search/category endpoint.
- `amazon_raw_html`: Raw HTML Amazon endpoint with geo-targeting.
- `async_create_job`, `async_get_job`, `async_get_task`, `async_list_jobs`, `async_cancel_job`, `async_get_account`: Async API coverage with both MCP-friendly aliases and official field names.
- `proxy_mode_config`: Builds official Proxy Mode connection details, default parameter strings, and CA certificate references.
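For orientation, here is a rough sketch of the kind of HTTP request that `scrape_url` wraps. It assumes the main API is reached at `https://api.scrape.do/` with `token` and `url` query parameters, as described in Scrape.do's public documentation; check the official docs before relying on it. The helper name `buildScrapeRequestUrl` is hypothetical and is not part of this package.

```typescript
// Hypothetical helper for illustration only; not part of scrape-do-mcp.
type ScrapeParams = Record<string, string | number | boolean>;

// Assumption: the main Scrape.do API lives at https://api.scrape.do/
// and takes `token` and `url` as query parameters.
function buildScrapeRequestUrl(
  token: string,
  targetUrl: string,
  params: ScrapeParams = {}
): string {
  const pairs: string[] = [
    `token=${encodeURIComponent(token)}`,
    // The target URL must be percent-encoded so its own query string
    // is not confused with the API's parameters.
    `url=${encodeURIComponent(targetUrl)}`,
  ];
  for (const key of Object.keys(params)) {
    pairs.push(
      `${encodeURIComponent(key)}=${encodeURIComponent(String(params[key]))}`
    );
  }
  return `https://api.scrape.do/?${pairs.join("&")}`;
}

// Example: request JS rendering for a page.
const exampleUrl = buildScrapeRequestUrl(
  "YOUR_TOKEN_HERE",
  "https://example.com",
  { render: true }
);
```

When the server is used through an MCP client, this request is built for you; the sketch only illustrates what the tool arguments roughly correspond to.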
Compatibility Notes
- `scrape_url` supports both MCP-friendly aliases and official parameter names:
  - `render_js` or `render`
  - `super_proxy` or `super`
  - `screenshot` or `screenShot`
- `google_search` supports:
  - `query` or `q`
  - `country` or `gl`
  - `language` or `hl`
  - `domain` or `google_domain`
  - `includeHtml` or `include_html`
- `async_create_job` accepts both alias fields like `targets`, `render`, `webhookUrl` and official Async API fields like `Targets`, `Render`, `WebhookURL`.
- `async_get_job`, `async_get_task`, and `async_cancel_job` accept both `jobId`/`taskId` and official `jobID`/`taskID`.
- `async_list_jobs` accepts both `pageSize` and official `page_size`.
- For header forwarding in `scrape_url`, pass `headers` plus `header_mode` (`custom`, `extra`, or `forward`).
- Screenshot responses preserve the official Scrape.do JSON body and also attach MCP image content when screenshots are present.
- `scrape_url` now defaults to `output="raw"` to match the official API more closely.
- `scrape_url` includes response metadata in `structuredContent`, which helps surface `pureCookies`, `transparentResponse`, and binary responses inside MCP.
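The alias handling described in these notes can be pictured as a small normalization step. The sketch below is illustrative and is not taken from the server's source: the alias pairs come from the notes above, while the function name `normalizeArgs` and the rule that an explicitly supplied official name wins over its alias are assumptions.

```typescript
// Alias -> official name pairs for scrape_url, as listed in the notes.
const SCRAPE_URL_ALIASES: Record<string, string> = {
  render_js: "render",
  super_proxy: "super",
  screenshot: "screenShot",
};

const hasOwn = (obj: object, key: string): boolean =>
  Object.prototype.hasOwnProperty.call(obj, key);

// Hypothetical helper: rewrites alias keys to their official names.
function normalizeArgs(
  args: Record<string, unknown>,
  aliases: Record<string, string>
): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const key of Object.keys(args)) {
    const official = hasOwn(aliases, key) ? aliases[key] : key;
    // Assumption: if both the alias and the official name are given,
    // the official name takes precedence and the alias is dropped.
    if (official !== key && hasOwn(args, official)) continue;
    out[official] = args[key];
  }
  return out;
}

const normalized = normalizeArgs(
  { url: "https://example.com", render_js: true },
  SCRAPE_URL_ALIASES
);
// normalized is { url: "https://example.com", render: true }
```

The same shape would work for `google_search` or the Async tools by swapping in a different alias table.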
Installation
Quick Install
```bash
claude mcp add-json scrape-do --scope user '{
  "type": "stdio",
  "command": "npx",
  "args": ["-y", "scrape-do-mcp"],
  "env": {
    "SCRAPE_DO_TOKEN": "YOUR_TOKEN_HERE"
  }
}'
```
Claude Desktop
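Add this to your Claude configuration file (`~/.claude.json` for Claude Code; Claude Desktop reads `claude_desktop_config.json` instead):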
```json
{
  "mcpServers": {
    "scrape-do": {
      "command": "npx",
      "args": ["-y", "scrape-do-mcp"],
      "env": {
        "SCRAPE_DO_TOKEN": "YOUR_TOKEN_HERE"
      }
    }
  }
}
```
Get your token at https://app.scrape.do
Available Tools
| Tool | Purpose |
|---|---|
| `scrape_url` | Main Scrape.do scraping API wrapper |
| `google_search` | Structured Google search results |
| `amazon_product` | Amazon PDP structured data |
| `amazon_offer_listing` | Amazon seller offers |
| `amazon_search` | Amazon keyword/category results |
| `amazon_raw_html` | Raw Amazon HTML with geo-targeting |
| `async_create_job` | Create Async API jobs |
| `async_get_job` | Fetch Async job details |
| `async_get_task` | Fetch Async task details |
| `async_list_jobs` | List Async jobs |
| `async_cancel_job` | Cancel Async jobs |
| `async_get_account` | Fetch Async account/concurrency info |
| `proxy_mode_config` | Generate Proxy Mode configuration |
Example Prompts
Scrape https://example.com with render=true and wait for #app.
Search Google for "open source MCP servers" with google_domain=google.co.uk and lr=lang_en.
Get the Amazon PDP for ASIN B0C7BKZ883 in the US with zipcode 10001.
Create an async job for these 20 URLs and give me the job ID.
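Behind a prompt like the Google search example, the MCP client sends the server a JSON-RPC `tools/call` request. The sketch below shows roughly what that message could look like. The envelope follows the MCP specification and the argument names come from this README; the helper `buildToolCall` is hypothetical, and the exact arguments a model chooses may differ.

```typescript
// Shape of an MCP tools/call request (JSON-RPC 2.0 envelope).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper for illustration; MCP clients build this for you.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>
): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// The Google search prompt, expressed as tool arguments.
const searchCall = buildToolCall(1, "google_search", {
  query: "open source MCP servers",
  google_domain: "google.co.uk",
  lr: "lang_en",
});
```

Serializing `searchCall` with `JSON.stringify` gives the payload a client would write to the server's stdin when using the stdio transport configured above.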
Development
```bash
npm install
npm run build
npm run dev
```
License
MIT