Read URL MCP
Extracts web content from a URL and converts it to clean Markdown format.
A Model Context Protocol (MCP) server that provides URL reading capabilities, extracting web content and converting it to clean Markdown format.
Features
- Fetch web content from HTTP/HTTPS URLs
- Extract main content using readability algorithm
- Convert HTML to clean Markdown format
- Configurable timeout and content size limits
Installation
uv sync
MCP Settings
{
  "mcpServers": {
    "read_url_mcp": {
      "command": "uv",
      "args": [
        "run",
        "--directory",
        "<directory>",
        "read_url_mcp/mcp_server.py"
      ],
      "env": {
        "PYTHONPATH": "<directory>"
      }
    }
  }
}
Usage
Run the MCP server:
uv run python read_url_mcp/mcp_server.py
Available Tools
- readURLMarkdown(url: str) - Fetches URL content and returns it as Markdown
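MCP clients invoke tools through JSON-RPC 2.0 `tools/call` requests. As a rough sketch, a call to `readURLMarkdown` could be serialized as shown here. The helper name `build_tool_call`, the request id, and the example URL are placeholders chosen for illustration; in practice an MCP client library builds and sends this message for you.

```python
import json


def build_tool_call(url: str, request_id: int = 1) -> str:
    """Serialize a JSON-RPC 2.0 `tools/call` request for readURLMarkdown."""
    request = {
        "jsonrpc": "2.0",
        "id": request_id,  # placeholder id; clients pick their own
        "method": "tools/call",
        "params": {
            "name": "readURLMarkdown",
            "arguments": {"url": url},
        },
    }
    return json.dumps(request)


print(build_tool_call("https://example.com"))
```

A real session also starts with an `initialize` handshake before any tool call, which client libraries normally handle.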
Development
Code Quality
uv run ruff check # Lint code
uv run ruff format # Format code
uv run ruff check --fix # Auto-fix issues
Dependencies
- mcp[cli] - MCP framework
- requests - HTTP client
- readability-lxml - Content extraction
- html2text - HTML to Markdown conversion
Configuration
Server settings in mcp_server.py:
- Timeout: 30 seconds
- Max content length: 1MB
- User agent: my-mcp-tools/1.0
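One way to group these settings and enforce the size limit is sketched here. The names `ServerSettings`, `ContentTooLargeError`, and `read_limited` are hypothetical, and treating "1MB" as 1024 * 1024 bytes is an assumption; the actual constants and checks in mcp_server.py may look different.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class ServerSettings:
    """Illustrative container for the three documented settings."""

    timeout_seconds: int = 30
    max_content_bytes: int = 1024 * 1024  # assumption: "1MB" means 1 MiB
    user_agent: str = "my-mcp-tools/1.0"


class ContentTooLargeError(ValueError):
    """Raised when a response body exceeds the configured limit."""


def read_limited(chunks: Iterable[bytes], max_bytes: int) -> bytes:
    """Join body chunks, stopping with an error once the total exceeds max_bytes."""
    body = bytearray()
    for chunk in chunks:
        body.extend(chunk)
        if len(body) > max_bytes:
            raise ContentTooLargeError(f"response larger than {max_bytes} bytes")
    return bytes(body)


settings = ServerSettings()
print(read_limited([b"<html>", b"</html>"], settings.max_content_bytes))
```

With `requests`, the chunks could come from `response.iter_content(chunk_size=8192)` on a request made with `stream=True`, so an oversized page is rejected before it is fully downloaded.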