Web Scout MCP Server

An MCP server for DuckDuckGo web search and web page content extraction, with support for multiple URLs and memory optimizations.

✨ Features

🔍 DuckDuckGo Search: Fast and privacy-focused web search capability
📄 Content Extraction: Clean, readable text extraction from web pages
🚀 Parallel Processing: Support for extracting content from multiple URLs simultaneously
💾 Memory Optimization: Smart memory management to prevent application crashes
⏱️ Rate Limiting: Intelligent request throttling to avoid API blocks
🛡️ Error Handling: Robust error handling for reliable operation

📦 Installation

Installing via Smithery

To install Web Scout for Claude Desktop automatically via Smithery:

npx -y @smithery/cli install @pinkpixel-dev/web-scout-mcp --client claude

Global Installation

npm install -g @pinkpixel/web-scout-mcp

Local Installation

npm install @pinkpixel/web-scout-mcp

🚀 Usage

Command Line

After installing globally, run:

web-scout-mcp

With MCP Clients

Add this to your MCP client's configuration file (for example, claude_desktop_config.json for Claude Desktop or mcp.json for Cursor):

{
  "mcpServers": {
    "web-scout": {
      "command": "npx",
      "args": [
        "-y",
        "@pinkpixel/web-scout-mcp@latest"
      ]
    }
  }
}
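
If you maintain the client configuration from a script, the same entry can be merged into an existing config object without disturbing servers that are already registered. The following is a minimal TypeScript sketch under that assumption; `McpConfig`, `McpServerEntry`, and `addWebScout` are hypothetical names for illustration and are not part of the package.

```typescript
// Shape of the relevant part of an MCP client config (inferred from the JSON above).
interface McpServerEntry {
  command: string;
  args: string[];
}

interface McpConfig {
  mcpServers?: Record<string, McpServerEntry>;
}

// Hypothetical helper: returns a new config with the web-scout entry added,
// leaving any servers that are already configured untouched.
function addWebScout(config: McpConfig): McpConfig {
  return {
    ...config,
    mcpServers: {
      ...(config.mcpServers ?? {}),
      "web-scout": {
        command: "npx",
        args: ["-y", "@pinkpixel/web-scout-mcp@latest"],
      },
    },
  };
}

// Example: an existing config that already lists another (made-up) server.
const updated = addWebScout({
  mcpServers: { other: { command: "node", args: ["other.js"] } },
});
console.log(JSON.stringify(updated, null, 2));
```

Reading and writing the file itself is left out on purpose, since the file location differs between clients.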

Environment Variables

Set WEB_SCOUT_DISABLE_AUTOSTART=1 when you embed the package and call createServer() yourself. By default, running the published entrypoint (for example, node dist/index.js or npx @pinkpixel/web-scout-mcp) automatically bootstraps the stdio transport.

🧰 Tools

The server provides the following MCP tools:

🔍 DuckDuckGoWebSearch

Runs a web search on DuckDuckGo and returns a structured list of results.

Input:

query (string): The search query string
maxResults (number, optional): Maximum number of results to return (default: 10)

Example:

{
  "query": "latest advancements in AI",
  "maxResults": 5
}

Output: A formatted list of search results with titles, URLs, and snippets.
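
MCP clients normally build the request for you. For reference, here is a sketch of how the input above might be wrapped in the JSON-RPC `tools/call` request defined by the MCP specification. `SearchInput` and `buildSearchCall` are hypothetical names, the `id` value is arbitrary, and the client-side checks are assumptions for illustration rather than documented server behavior.

```typescript
interface SearchInput {
  query: string;
  maxResults?: number; // the server falls back to 10 when this is omitted
}

// Hypothetical helper: wraps the tool input in an MCP tools/call request.
function buildSearchCall(id: number, input: SearchInput) {
  // Assumed client-side sanity checks; the server may validate differently.
  if (input.query.trim() === "") {
    throw new Error("query must be a non-empty string");
  }
  if (
    input.maxResults !== undefined &&
    (!Number.isInteger(input.maxResults) || input.maxResults < 1)
  ) {
    throw new Error("maxResults must be a positive integer");
  }
  return {
    jsonrpc: "2.0" as const,
    id,
    method: "tools/call" as const,
    params: { name: "DuckDuckGoWebSearch", arguments: input },
  };
}

const request = buildSearchCall(1, {
  query: "latest advancements in AI",
  maxResults: 5,
});
console.log(JSON.stringify(request, null, 2));
```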

📄 UrlContentExtractor

Fetches and extracts clean, readable content from web pages by removing unnecessary elements like scripts, styles, and navigation.

Input:

url: Either a single URL string or an array of URL strings

Example (single URL):

{
  "url": "https://example.com/article"
}

Example (multiple URLs):

{
  "url": [
    "https://example.com/article1",
    "https://example.com/article2"
  ]
}

Output: Extracted text content from the specified URL(s).
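
Because `url` accepts either a string or an array, client code that assembles this input may want to normalize it first. The sketch below shows one way to do that in TypeScript. `toUrlList` and `buildExtractorInput` are hypothetical helpers, and rejecting anything that is not an http(s) URL is an assumption made for illustration, not documented server behavior.

```typescript
type UrlInput = string | string[];

// Hypothetical helper: always returns an array, whichever form was passed in.
function toUrlList(url: UrlInput): string[] {
  const list = Array.isArray(url) ? url : [url];
  for (const u of list) {
    // Assumed check: only http:// and https:// URLs are accepted here.
    if (!/^https?:\/\//i.test(u)) {
      throw new Error(`not an http(s) URL: ${u}`);
    }
  }
  return list;
}

// Hypothetical helper: emits a bare string for one URL and an array otherwise,
// matching the two example inputs above.
function buildExtractorInput(urls: string[]): { url: UrlInput } {
  const [first, ...rest] = urls;
  if (first === undefined) {
    throw new Error("at least one URL is required");
  }
  return { url: rest.length === 0 ? first : urls };
}

const input = buildExtractorInput(
  toUrlList(["https://example.com/article1", "https://example.com/article2"]),
);
console.log(JSON.stringify(input));
```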

🛠️ Development

# Clone the repository
git clone https://github.com/pinkpixel-dev/web-scout-mcp.git
cd web-scout-mcp

# Install dependencies
npm install

# Build
npm run build

# Run
npm start

📚 Documentation

For more detailed information about the project, check out these resources:

OVERVIEW.md - Technical overview and architecture
CONTRIBUTING.md - Guidelines for contributors
CHANGELOG.md - Version history and changes

📋 Requirements

Node.js >= 18.0.0
npm or yarn

📄 License

This project is licensed under the Apache 2.0 License.

Made with ❤️ by Pink Pixel
✨ Dream it, Pixel it ✨
