Web Scout

Um servidor MCP para busca na web e extração de conteúdo usando DuckDuckGo.

Documentação

Web Scout MCP Server

An MCP server for web search using DuckDuckGo and content extraction, with support for multiple URLs and memory optimizations.

✨ Features

🔍 DuckDuckGo Search: Fast and privacy-focused web search capability
📄 Content Extraction: Clean, readable text extraction from web pages
🚀 Parallel Processing: Support for extracting content from multiple URLs simultaneously
💾 Memory Optimization: Smart memory management to prevent application crashes
⏱️ Rate Limiting: Intelligent request throttling to avoid API blocks
🛡️ Error Handling: Robust error handling for reliable operation

📦 Installation

Installing via Smithery

To install Web Scout for Claude Desktop automatically via Smithery:

npx -y @smithery/cli install @pinkpixel-dev/web-scout-mcp --client claude

Global Installation

npm install -g @pinkpixel/web-scout-mcp

Local Installation

npm install @pinkpixel/web-scout-mcp

🚀 Usage

Command Line

After installing globally, run:

web-scout-mcp

With MCP Clients

Add this to your MCP client's config.json (Claude Desktop, Cursor, etc.):

{
  "mcpServers": {
    "web-scout": {
      "command": "npx",
      "args": [
        "-y",
        "@pinkpixel/web-scout-mcp@latest"
      ]
    }
  }
}

Environment Variables

Set the WEB_SCOUT_DISABLE_AUTOSTART=1 environment variable when embedding the package and calling createServer() yourself. By default running the published entrypoint (for example node dist/index.js or npx @pinkpixel/web-scout-mcp) automatically bootstraps the stdio transport.

🧰 Tools

The server provides the following MCP tools:

🔍 DuckDuckGoWebSearch

Initiates a web search query using the DuckDuckGo search engine and returns a well-structured list of findings.

Input:

query (string): The search query string
maxResults (number, optional): Maximum number of results to return (default: 10)

Example:

{
  "query": "latest advancements in AI",
  "maxResults": 5
}

Output: A formatted list of search results with titles, URLs, and snippets.

📄 UrlContentExtractor

Fetches and extracts clean, readable content from web pages by removing unnecessary elements like scripts, styles, and navigation.

Input:

url: Either a single URL string or an array of URL strings

Example (single URL):

{
  "url": "https://example.com/article"
}

Example (multiple URLs):

{
  "url": [
    "https://example.com/article1",
    "https://example.com/article2"
  ]
}

Output: Extracted text content from the specified URL(s).

🛠️ Development

# Clone the repository
git clone https://github.com/pinkpixel-dev/web-scout-mcp.git
cd web-scout-mcp

# Install dependencies
npm install

# Build
npm run build

# Run
npm start

📚 Documentation

For more detailed information about the project, check out these resources:

OVERVIEW.md - Technical overview and architecture
CONTRIBUTING.md - Guidelines for contributors
CHANGELOG.md - Version history and changes

📋 Requirements

Node.js >= 18.0.0
npm or yarn

📄 License

This project is licensed under the Apache 2.0 License.

_{Made with ❤️ by Pink Pixel}
_{✨ Dream it, Pixel it ✨}