scrape-do-mcp

MCP Server for Scrape.do - Web Scraping & Google Search with anti-bot bypass

An MCP server that wraps Scrape.do's documented APIs in one package: the main scraping API, Google Search API, Amazon Scraper API, Async API, and a Proxy Mode configuration helper.

Official docs: https://scrape.do/documentation/

Coverage

  • scrape_url: Main Scrape.do API with JS rendering, geo-targeting, session persistence, screenshots, ReturnJSON, browser interactions, cookies, and header forwarding.
  • google_search: Structured Google SERP API with google_domain, location, uule, lr, cr, safe, nfpr, filter, pagination, and optional raw HTML.
  • amazon_product: Amazon PDP endpoint.
  • amazon_offer_listing: Amazon offer listing endpoint.
  • amazon_search: Amazon search/category endpoint.
  • amazon_raw_html: Raw HTML Amazon endpoint with geo-targeting.
  • async_create_job, async_get_job, async_get_task, async_list_jobs, async_cancel_job, async_get_account: Async API coverage with both MCP-friendly aliases and official field names.
  • proxy_mode_config: Builds official Proxy Mode connection details, default parameter strings, and CA certificate references.
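As an illustrative sketch (not copied from the official docs), a google_search tool call combining the structured parameters above might look like this; the outer shape follows the generic MCP tools/call convention, and every value is a placeholder:

```json
{
  "name": "google_search",
  "arguments": {
    "q": "open source MCP servers",
    "google_domain": "google.co.uk",
    "gl": "uk",
    "hl": "en",
    "lr": "lang_en",
    "include_html": false
  }
}
```

Per the compatibility notes, query/country/language/domain/includeHtml aliases would work in place of the official names shown here.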

Compatibility Notes

  • scrape_url supports both MCP-friendly aliases and official parameter names:
    • render_js or render
    • super_proxy or super
    • screenshot or screenShot
  • google_search supports:
    • query or q
    • country or gl
    • language or hl
    • domain or google_domain
    • includeHtml or include_html
  • async_create_job accepts both alias fields like targets, render, webhookUrl and official Async API fields like Targets, Render, WebhookURL.
  • async_get_job, async_get_task, and async_cancel_job accept both jobId/taskId and official jobID/taskID.
  • async_list_jobs accepts both pageSize and official page_size.
  • For header forwarding in scrape_url, pass headers plus header_mode (custom, extra, or forward).
  • Screenshot responses preserve the official Scrape.do JSON body and also attach MCP image content when screenshots are present.
  • scrape_url defaults to output="raw", matching the official API's raw response behavior.
  • scrape_url includes response metadata in structuredContent, which helps surface pureCookies, transparentResponse, and binary responses inside MCP.
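Putting the notes above together, a scrape_url call using the MCP-friendly alias names might look like this sketch; values are illustrative, and the envelope follows the generic MCP tools/call shape rather than anything specific to this server:

```json
{
  "name": "scrape_url",
  "arguments": {
    "url": "https://example.com",
    "render_js": true,
    "headers": { "X-Example": "1" },
    "header_mode": "custom",
    "output": "raw"
  }
}
```

The official spellings (render instead of render_js, super instead of super_proxy, and so on) are interchangeable per the notes above.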

Installation

Quick Install

claude mcp add-json scrape-do --scope user '{
  "type": "stdio",
  "command": "npx",
  "args": ["-y", "scrape-do-mcp"],
  "env": {
    "SCRAPE_DO_TOKEN": "YOUR_TOKEN_HERE"
  }
}'

Claude Desktop

Add this to your Claude Desktop configuration file (claude_desktop_config.json):

{
  "mcpServers": {
    "scrape-do": {
      "command": "npx",
      "args": ["-y", "scrape-do-mcp"],
      "env": {
        "SCRAPE_DO_TOKEN": "YOUR_TOKEN_HERE"
      }
    }
  }
}

Get your token at https://app.scrape.do

Available Tools

Tool                   Purpose
scrape_url             Main Scrape.do scraping API wrapper
google_search          Structured Google search results
amazon_product         Amazon PDP structured data
amazon_offer_listing   Amazon seller offers
amazon_search          Amazon keyword/category results
amazon_raw_html        Raw Amazon HTML with geo-targeting
async_create_job       Create Async API jobs
async_get_job          Fetch Async job details
async_get_task         Fetch Async task details
async_list_jobs        List Async jobs
async_cancel_job       Cancel Async jobs
async_get_account      Fetch Async account/concurrency info
proxy_mode_config      Generate Proxy Mode configuration

Example Prompts

Scrape https://example.com with render=true and wait for #app.
Search Google for "open source MCP servers" with google_domain=google.co.uk and lr=lang_en.
Get the Amazon PDP for ASIN B0C7BKZ883 in the US with zipcode 10001.
Create an async job for these 20 URLs and give me the job ID.
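For the async prompt above, a minimal async_create_job payload could look like the following sketch; the target URLs and webhook address are placeholders, and both these alias fields and the official Targets/Render/WebhookURL spellings are accepted:

```json
{
  "name": "async_create_job",
  "arguments": {
    "targets": [
      "https://example.com/page-1",
      "https://example.com/page-2"
    ],
    "render": true,
    "webhookUrl": "https://example.org/scrape-webhook"
  }
}
```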

Development

npm install
npm run build
npm run dev

License

MIT

Related Servers