An enhanced Model Context Protocol (MCP) server for SearXNG, providing category-aware web search, website scraping, and a date/time retrieval tool. Designed for seamless integration with SearXNG and modern MCP clients.
Build the Docker image:

```shell
docker build -t overtlids/mcp-searxng-enhanced:latest .
```
Run with your SearXNG instance (manual Docker run):

```shell
docker run -i --rm --network=host \
  -e SEARXNG_ENGINE_API_BASE_URL="http://127.0.0.1:8080/search" \
  -e DESIRED_TIMEZONE="America/New_York" \
  overtlids/mcp-searxng-enhanced:latest
```
In this example, `SEARXNG_ENGINE_API_BASE_URL` is explicitly set. `DESIRED_TIMEZONE` is also explicitly set to `America/New_York`, which matches its default value. If an environment variable is not provided with an `-e` flag on the `docker run` command, the server automatically uses the default value defined in its `Dockerfile` (see the Environment Variables table below). Thus, if you intend to use the default for `DESIRED_TIMEZONE`, you can omit the `-e DESIRED_TIMEZONE="America/New_York"` flag. However, `SEARXNG_ENGINE_API_BASE_URL` is critical and usually needs to be set to match your specific SearXNG instance's address if the `Dockerfile` default (`http://host.docker.internal:8080/search`) is not appropriate.
Note on Manual Docker Run: This command runs the Docker container independently. If you are using an MCP client (like Cline in VS Code) to manage this server, the client will start its own instance of the container using the settings defined in its own configuration. For the MCP client to use specific environment variables, they must be configured within the client's settings for this server (see below).
Configure your MCP client (e.g., Cline in VS Code):
For your MCP client to correctly manage and run this server, you must define all necessary environment variables within the client's settings for the `overtlids/mcp-searxng-enhanced` server. The MCP client uses these settings to construct the `docker run` command.
The following is the recommended default configuration for this server within your MCP client's JSON settings (e.g., `cline_mcp_settings.json`). This example explicitly lists all environment variables set to their default values as defined in the `Dockerfile`. You can copy and paste it directly, then customize any values as needed.
```json
{
  "mcpServers": {
    "overtlids/mcp-searxng-enhanced": {
      "command": "docker",
      "args": [
        "run", "-i", "--rm", "--network=host",
        "-e", "SEARXNG_ENGINE_API_BASE_URL=http://host.docker.internal:8080/search",
        "-e", "DESIRED_TIMEZONE=America/New_York",
        "-e", "ODS_CONFIG_PATH=/config/ods_config.json",
        "-e", "RETURNED_SCRAPPED_PAGES_NO=3",
        "-e", "SCRAPPED_PAGES_NO=5",
        "-e", "PAGE_CONTENT_WORDS_LIMIT=5000",
        "-e", "CITATION_LINKS=True",
        "-e", "MAX_IMAGE_RESULTS=10",
        "-e", "MAX_VIDEO_RESULTS=10",
        "-e", "MAX_FILE_RESULTS=5",
        "-e", "MAX_MAP_RESULTS=5",
        "-e", "MAX_SOCIAL_RESULTS=5",
        "-e", "TRAFILATURA_TIMEOUT=15",
        "-e", "SCRAPING_TIMEOUT=20",
        "-e", "CACHE_MAXSIZE=100",
        "-e", "CACHE_TTL_MINUTES=5",
        "-e", "CACHE_MAX_AGE_MINUTES=30",
        "-e", "RATE_LIMIT_REQUESTS_PER_MINUTE=10",
        "-e", "RATE_LIMIT_TIMEOUT_SECONDS=60",
        "-e", "IGNORED_WEBSITES=",
        "overtlids/mcp-searxng-enhanced:latest"
      ],
      "timeout": 60
    }
  }
}
```
Key Points for MCP Client Configuration:

- To change a setting, edit its `-e "VARIABLE_NAME=value"` line within the `args` array in your MCP client's configuration. For instance, to change `SEARXNG_ENGINE_API_BASE_URL` and `DESIRED_TIMEZONE`, adjust their respective lines.
- Although an `ods_config.json` file can also influence settings (see Configuration Management), environment variables passed by the MCP client take precedence.

If you prefer to run the server directly using Python without Docker, follow these steps:
1. Python Installation: Ensure a working Python 3 installation is available on your system.
2. Clone the Repository:

```shell
git clone https://github.com/OvertliDS/mcp-searxng-enhanced.git
cd mcp-searxng-enhanced
```
3. Create and Activate a Virtual Environment (Recommended):

```shell
# For Linux/macOS
python3 -m venv .venv
source .venv/bin/activate

# For Windows (Command Prompt)
python -m venv .venv
.\.venv\Scripts\activate.bat

# For Windows (PowerShell)
python -m venv .venv
.\.venv\Scripts\Activate.ps1
```
4. Install Dependencies:

```shell
pip install -r requirements.txt
```

Key dependencies include `httpx`, `BeautifulSoup4`, `pydantic`, `trafilatura`, `python-dateutil`, `cachetools`, and `zoneinfo`.

5. Ensure SearXNG is Accessible: Your SearXNG instance must be running and reachable (e.g., at `http://127.0.0.1:8080/search`).

6. Set Environment Variables: At a minimum, set `SEARXNG_ENGINE_API_BASE_URL`.

```shell
# For Linux/macOS
export SEARXNG_ENGINE_API_BASE_URL="http://your-searxng-instance:port/search"
export DESIRED_TIMEZONE="America/Los_Angeles"

# For Windows (Command Prompt)
set SEARXNG_ENGINE_API_BASE_URL="http://your-searxng-instance:port/search"
set DESIRED_TIMEZONE="America/Los_Angeles"

# For Windows (PowerShell)
$env:SEARXNG_ENGINE_API_BASE_URL="http://your-searxng-instance:port/search"
$env:DESIRED_TIMEZONE="America/Los_Angeles"
```

If environment variables are not set, values from an `ods_config.json` file (if present in the root directory or at `ODS_CONFIG_PATH`) will be used.

7. Run the Server:

```shell
python mcp_server.py
```
8. Configuration File (`ods_config.json`): The server reads settings from an `ods_config.json` file in the project's root directory (or the path specified by the `ODS_CONFIG_PATH` environment variable). Environment variables always take precedence over values in this file. Example:

```json
{
  "searxng_engine_api_base_url": "http://127.0.0.1:8080/search",
  "desired_timezone": "America/New_York"
}
```
The following environment variables control the server's behavior. You can set them in your MCP client's configuration (recommended for client-managed servers) or when running Docker manually.
| Variable | Description | Default (from Dockerfile) | Notes |
|---|---|---|---|
| `SEARXNG_ENGINE_API_BASE_URL` | SearXNG search endpoint | `http://host.docker.internal:8080/search` | Crucial for server operation |
| `DESIRED_TIMEZONE` | Timezone for date/time tool | `America/New_York` | E.g., `America/Los_Angeles`. List of tz database time zones: https://en.wikipedia.org/wiki/List_of_tz_database_time_zones |
| `ODS_CONFIG_PATH` | Path to persistent configuration file | `/config/ods_config.json` | Typically left as default within the container |
| `RETURNED_SCRAPPED_PAGES_NO` | Max pages to return per search | `3` | |
| `SCRAPPED_PAGES_NO` | Max pages to attempt scraping | `5` | |
| `PAGE_CONTENT_WORDS_LIMIT` | Max words per scraped page | `5000` | |
| `CITATION_LINKS` | Enable/disable citation events | `True` | `True` or `False` |
| `MAX_IMAGE_RESULTS` | Maximum image results to return | `10` | |
| `MAX_VIDEO_RESULTS` | Maximum video results to return | `10` | |
| `MAX_FILE_RESULTS` | Maximum file results to return | `5` | |
| `MAX_MAP_RESULTS` | Maximum map results to return | `5` | |
| `MAX_SOCIAL_RESULTS` | Maximum social media results to return | `5` | |
| `TRAFILATURA_TIMEOUT` | Content extraction timeout (seconds) | `15` | |
| `SCRAPING_TIMEOUT` | HTTP request timeout (seconds) | `20` | |
| `CACHE_MAXSIZE` | Maximum number of cached websites | `100` | |
| `CACHE_TTL_MINUTES` | Cache time-to-live (minutes) | `5` | |
| `CACHE_MAX_AGE_MINUTES` | Maximum age for cached content (minutes) | `30` | |
| `RATE_LIMIT_REQUESTS_PER_MINUTE` | Max requests per domain per minute | `10` | |
| `RATE_LIMIT_TIMEOUT_SECONDS` | Rate limit tracking window (seconds) | `60` | |
| `IGNORED_WEBSITES` | Comma-separated list of sites to ignore | `""` (empty) | E.g., `"example.com,another.org"` |
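As a small illustration of the comma-separated format, a value like `IGNORED_WEBSITES` might be parsed roughly as follows (the helper name `parse_ignored_websites` is an assumption for this sketch, not the server's actual code):

```python
def parse_ignored_websites(raw: str) -> set[str]:
    """Split a comma-separated domain list, dropping blanks and stray whitespace."""
    return {part.strip().lower() for part in raw.split(",") if part.strip()}

# Trailing commas and mixed case are tolerated:
print(sorted(parse_ignored_websites("example.com, Another.org,")))
# ['another.org', 'example.com']
```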
The server uses a three-tier configuration approach:

1. Environment variables (highest precedence)
2. The persistent config file (at `ODS_CONFIG_PATH`, defaults to `/config/ods_config.json`)
3. Built-in defaults

The config file is only updated when environment variables are explicitly provided. This ensures that user configurations are preserved between container restarts when no new environment variables are set.
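The precedence order above can be sketched as follows (a minimal illustration; the helper name `resolve_setting` and the simplified dict shapes are assumptions, not the server's actual implementation):

```python
import os

def resolve_setting(name: str, file_config: dict, defaults: dict) -> str:
    """Resolve one setting: environment variable > config-file value > built-in default."""
    env_value = os.environ.get(name.upper())
    if env_value is not None:
        return env_value
    file_value = file_config.get(name.lower())
    if file_value is not None:
        return file_value
    return defaults[name.lower()]

os.environ.pop("DESIRED_TIMEZONE", None)  # make the demo deterministic
defaults = {"desired_timezone": "America/New_York"}
file_config = {"desired_timezone": "Europe/London"}

# With no environment variable set, the config file wins over the default:
print(resolve_setting("DESIRED_TIMEZONE", file_config, defaults))  # Europe/London
```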
| Tool Name | Purpose | Aliases |
|---|---|---|
| `search_web` | Web search via SearXNG | `search`, `web_search`, `find`, `lookup_web`, `search_online`, `access_internet`, `lookup`* |
| `get_website` | Scrape website content | `fetch_url`, `scrape_page`, `get`, `load_website`, `lookup`* |
| `get_current_datetime` | Current date/time | `current_time`, `get_time`, `current_date` |

\* `lookup` is context-sensitive: with a `url` argument it maps to `get_website`; otherwise it maps to `search_web`.
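Context-sensitive alias dispatch of this kind might look roughly like the sketch below (the function name `resolve_tool` and the partial alias table are illustrative assumptions, not the server's actual code):

```python
# Partial alias table; the full set of aliases is listed in the table above.
ALIASES = {
    "search": "search_web",
    "web_search": "search_web",
    "fetch_url": "get_website",
    "scrape_page": "get_website",
    "current_time": "get_current_datetime",
}

def resolve_tool(name: str, arguments: dict) -> str:
    """Map an alias to its canonical tool; 'lookup' depends on the arguments."""
    if name == "lookup":
        # With a 'url' argument, 'lookup' means scraping; otherwise, searching.
        return "get_website" if "url" in arguments else "search_web"
    return ALIASES.get(name, name)

print(resolve_tool("lookup", {"url": "example.com"}))        # get_website
print(resolve_tool("lookup", {"query": "open source ai"}))   # search_web
```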
Web Search:

```json
{ "name": "search_web", "arguments": { "query": "open source ai" } }
```

or using an alias:

```json
{ "name": "search", "arguments": { "query": "open source ai" } }
```

Category-Specific Search:

```json
{ "name": "search_web", "arguments": { "query": "landscapes", "category": "images" } }
```

Website Scraping:

```json
{ "name": "get_website", "arguments": { "url": "example.com" } }
```

or using an alias:

```json
{ "name": "lookup", "arguments": { "url": "example.com" } }
```

Current Date/Time:

```json
{ "name": "get_current_datetime", "arguments": {} }
```

or:

```json
{ "name": "current_time", "arguments": {} }
```
The `search_web` tool supports different categories with tailored outputs, such as images, videos, files, maps, and social media.
When scraping Reddit content, URLs are automatically converted to use the old.reddit.com domain for better content extraction.
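A minimal sketch of this kind of URL rewrite, using only the standard library (the helper name `to_old_reddit` is illustrative, not the server's actual function):

```python
from urllib.parse import urlsplit, urlunsplit

def to_old_reddit(url: str) -> str:
    """Rewrite reddit.com URLs to old.reddit.com, whose simpler HTML is easier to extract."""
    parts = urlsplit(url)
    if parts.netloc.lower() in ("reddit.com", "www.reddit.com"):
        parts = parts._replace(netloc="old.reddit.com")
    return urlunsplit(parts)

print(to_old_reddit("https://www.reddit.com/r/python/"))
# https://old.reddit.com/r/python/
```

Non-Reddit URLs pass through unchanged, so the rewrite can be applied unconditionally before scraping.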
Domain-based rate limiting prevents excessive requests to the same domain within a time window. This prevents overwhelming target websites and potential IP blocking.
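One common way to implement such per-domain limiting is a sliding window of recent request timestamps; the sketch below illustrates the idea (class and method names are assumptions, not the server's actual code):

```python
import time
from collections import defaultdict, deque

class DomainRateLimiter:
    """Allow at most `max_requests` per domain within a sliding `window` of seconds."""

    def __init__(self, max_requests: int = 10, window: float = 60.0, clock=time.monotonic):
        self.max_requests = max_requests
        self.window = window
        self.clock = clock  # injectable for deterministic testing
        self.history = defaultdict(deque)  # domain -> timestamps of recent requests

    def allow(self, domain: str) -> bool:
        now = self.clock()
        q = self.history[domain]
        # Drop timestamps that have fallen out of the window.
        while q and now - q[0] >= self.window:
            q.popleft()
        if len(q) >= self.max_requests:
            return False
        q.append(now)
        return True

limiter = DomainRateLimiter(max_requests=2, window=60.0)
print(limiter.allow("example.com"))  # True
print(limiter.allow("example.com"))  # True
print(limiter.allow("example.com"))  # False (third request within the window)
print(limiter.allow("another.org"))  # True (limits are tracked per domain)
```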
Cached website content is automatically validated for freshness based on age. Stale content is refreshed automatically while valid cached content is served quickly.
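The server lists `cachetools` among its dependencies; the standard-library-only sketch below illustrates the freshness check conceptually (class and method names are assumptions, not the server's actual implementation):

```python
import time

class FreshnessCache:
    """Serve entries younger than `ttl` seconds; treat older entries as stale."""

    def __init__(self, ttl: float = 300.0, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock  # injectable for deterministic testing
        self.store = {}     # url -> (fetched_at, content)

    def get(self, url: str):
        entry = self.store.get(url)
        if entry is None:
            return None
        fetched_at, content = entry
        if self.clock() - fetched_at >= self.ttl:
            del self.store[url]  # stale: evict so the caller re-fetches
            return None
        return content

    def put(self, url: str, content: str):
        self.store[url] = (self.clock(), content)

fake_now = [0.0]
cache = FreshnessCache(ttl=300.0, clock=lambda: fake_now[0])
cache.put("https://example.com", "<html>...</html>")
print(cache.get("https://example.com") is not None)  # True: still fresh
fake_now[0] = 301.0
print(cache.get("https://example.com"))  # None: stale, must be re-fetched
```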
The server implements a robust error-handling system with these exception types:

- `MCPServerError`: Base exception class for all server errors
- `ConfigurationError`: Raised when configuration values are invalid
- `SearXNGConnectionError`: Raised when the connection to SearXNG fails
- `WebScrapingError`: Raised when web scraping fails
- `RateLimitExceededError`: Raised when the rate limit for a domain is exceeded

Errors are properly propagated to the client with informative messages.
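The hierarchy above can be sketched as plain exception subclasses (a minimal illustration; the server's actual constructors and messages may differ):

```python
class MCPServerError(Exception):
    """Base exception for all server errors."""

class ConfigurationError(MCPServerError):
    """Invalid configuration value."""

class SearXNGConnectionError(MCPServerError):
    """Connection to the SearXNG instance failed."""

class WebScrapingError(MCPServerError):
    """Scraping a website failed."""

class RateLimitExceededError(MCPServerError):
    """Per-domain rate limit was exceeded."""

# Because every error derives from MCPServerError, one handler can catch them all:
try:
    raise RateLimitExceededError("example.com: 10 requests/minute exceeded")
except MCPServerError as exc:
    print(f"{type(exc).__name__}: {exc}")
```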
- Ensure the `SEARXNG_ENGINE_API_BASE_URL` environment variable points to the correct endpoint.
- Increase `RATE_LIMIT_REQUESTS_PER_MINUTE` if you're experiencing too many rate limit errors.
- Increase `TRAFILATURA_TIMEOUT` to allow more time for content processing on complex pages.
- In Docker, `host.docker.internal` should resolve to the host machine; on Linux, you may need to use the host's IP address instead.

Inspired by:
MIT License © 2025 OvertliDS