HTML to Markdown MCP
Fetch web pages and convert HTML to clean, formatted Markdown. Handles large pages with automatic file saving to bypass token limits.
HTML to Markdown MCP Server
An MCP (Model Context Protocol) server that converts HTML content to Markdown format using Turndown.js.
Features
- 🌐 Fetch and convert web pages - Automatically fetch HTML from a given URL
- 🔄 Convert HTML to clean, formatted Markdown
- 📝 Preserves formatting (headers, links, code blocks, lists, tables)
- 🗑️ Automatically removes unwanted elements (scripts, styles, etc.)
- 📊 Auto-extracts page titles and metadata
- ⚡ Fast conversion using Turndown.js
- 🔒 SSRF protection - Blocks requests to private/internal networks by default
Installation
npm install -g html-to-markdown-mcp
Or use with npx (no installation required):
npx html-to-markdown-mcp
Usage
With Claude Code
Add the server using the Claude CLI:
claude mcp add --transport stdio html-to-markdown -- npx html-to-markdown-mcp
Or if installed globally:
claude mcp add --transport stdio html-to-markdown -- html-to-markdown-mcp
With Claude Code (Plugin)
This project can also be installed as a Claude Code plugin, which bundles the MCP server and makes it easy to share with teams.
Install directly from GitHub:
/plugin marketplace add levz0r/html-to-markdown-mcp
/plugin install html-to-markdown@levz0r/html-to-markdown-mcp
Or enable for your team by adding to your project's .claude/settings.json:
{
"extraKnownMarketplaces": {
"levz0r/html-to-markdown-mcp": {
"source": {
"source": "github",
"repo": "levz0r/html-to-markdown-mcp"
}
}
},
"enabledPlugins": {
"html-to-markdown@levz0r/html-to-markdown-mcp": true
}
}
With Claude Desktop
Add this server to your Claude Desktop configuration file:
Using npx (recommended):
{
"mcpServers": {
"html-to-markdown": {
"command": "npx",
"args": ["html-to-markdown-mcp"]
}
}
}
Or if installed globally:
{
"mcpServers": {
"html-to-markdown": {
"command": "html-to-markdown-mcp"
}
}
}
With Cursor
Add this server to your Cursor MCP settings file:
Using npx (recommended):
{
"mcpServers": {
"html-to-markdown": {
"command": "npx",
"args": ["html-to-markdown-mcp"]
}
}
}
Or if installed globally:
{
"mcpServers": {
"html-to-markdown": {
"command": "html-to-markdown-mcp"
}
}
}
Configuration methods:
- Via Cursor Settings (Recommended):
  - Open Cursor Settings:
    - ⌘ + , (macOS) or Ctrl + , (Windows/Linux)
    - Or navigate to File → Preferences → Cursor Settings
  - Select the MCP option
  - Add a new global MCP server with the configuration above
- Manual file editing:
  - Global: ~/.cursor/mcp.json (available across all projects)
  - Local: .cursor/mcp.json in your project directory (project-specific)
After adding the configuration, restart Cursor for the changes to take effect.
With Codex
Add this server to your Codex configuration using the CLI or by editing the config file:
Option 1: Using Codex CLI (Recommended):
codex mcp add html-to-markdown -- npx -y html-to-markdown-mcp
Or if installed globally:
codex mcp add html-to-markdown -- html-to-markdown-mcp
Option 2: Manual Configuration:
Edit ~/.codex/config.toml and add:
[mcp_servers.html-to-markdown]
command = "npx"
args = ["-y", "html-to-markdown-mcp"]
Or if installed globally:
[mcp_servers.html-to-markdown]
command = "html-to-markdown-mcp"
The configuration file is located at ~/.codex/config.toml on all platforms (macOS, Linux, and Windows).
After updating the configuration, restart Codex or your Codex session for the changes to take effect.
Using Local Development Version
If you're developing or testing locally, you can add the MCP server directly from your local code:
With Claude Code:
claude mcp add --transport stdio html-to-markdown -- node /absolute/path/to/html-to-markdown-mcp/index.js
With Claude Desktop:
{
"mcpServers": {
"html-to-markdown": {
"command": "node",
"args": ["/absolute/path/to/html-to-markdown-mcp/index.js"]
}
}
}
Replace /absolute/path/to/html-to-markdown-mcp with the actual path to your cloned repository.
Available Tools
html_to_markdown
Fetch HTML from a URL or convert provided HTML content to Markdown format. Claude can select this tool on its own when a request involves fetching or converting HTML.
Parameters:
- url (string): URL to fetch and convert (either url or html is required)
- html (string): Raw HTML content to convert (either url or html is required)
- includeMetadata (boolean, optional): Include metadata header (default: true)
- maxLength (number, optional): Maximum length of returned content in characters. Content exceeding this will be truncated with a message. Useful for large pages to avoid token limits.
- saveToFile (string, optional): File path to save the full content. When specified, saves the complete markdown and returns only a summary. Recommended for very large pages.
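The "either url or html" rule and the maxLength behaviour can be pictured with a minimal sketch. The function names, the precedence given to url when both inputs are present, and the wording of the truncation notice are all assumptions made for illustration; they are not taken from the server's source.

```javascript
// Hypothetical helpers illustrating the parameter rules described above.

// Decide which input to use. Assumption: `url` wins if both are supplied.
function resolveInput({ url, html } = {}) {
  if (!url && !html) {
    throw new Error("Either 'url' or 'html' is required");
  }
  return url ? { kind: "url", value: url } : { kind: "html", value: html };
}

// Cut the markdown at maxLength characters and append a notice.
// Assumption: the notice text is illustrative, the real message may differ.
function truncateMarkdown(markdown, maxLength) {
  if (maxLength === undefined || markdown.length <= maxLength) {
    return markdown;
  }
  const omitted = markdown.length - maxLength;
  return `${markdown.slice(0, maxLength)}\n\n[Content truncated: ${omitted} more characters omitted]`;
}
```

When the page is too large for truncation to be useful, saveToFile is the better fit, since the full content goes to disk and only a summary is returned.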
Example 1: Fetch from URL (Recommended)
{
"url": "https://example.com"
}
Example 2: Convert raw HTML
{
"html": "<h1>Hello World</h1><p>This is a <strong>test</strong>.</p>"
}
Example 3: Fetch large page and save directly to file
{
"url": "https://www.docuseal.com/docs/api",
"saveToFile": "./docuseal-api.md"
}
Example 4: Limit returned content length
{
"url": "https://example.com",
"maxLength": 5000
}
Output:
# Example Domain
**Source:** https://example.com
**Saved:** 2025-10-09T12:00:00.000Z
---
# Example Domain
This domain is for use in illustrative examples...
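The metadata header in the output above (title, source, timestamp, separator) could be assembled along these lines. This is a sketch only: withMetadataHeader is a hypothetical name, and the exact blank-line layout is an assumption rather than the server's actual formatting.

```javascript
// Hypothetical helper: prepend a metadata header like the one shown above.
// Assumption: blank-line placement is a guess at the layout.
function withMetadataHeader(markdown, { title, url, savedAt = new Date() }) {
  const header = [
    `# ${title}`,
    "",
    `**Source:** ${url}`,
    `**Saved:** ${savedAt.toISOString()}`,
    "",
    "---",
    "",
  ].join("\n");
  return `${header}\n${markdown}`;
}
```

Setting includeMetadata to false would correspond to skipping this step and returning the converted body on its own.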
save_markdown
Save markdown content to a file on disk. Use this to persist converted HTML or any markdown content.
Parameters:
- content (string, required): The markdown content to save
- filePath (string, required): The file path where the markdown should be saved (can be relative or absolute)
Example:
{
"content": "# My Document\n\nThis is some markdown content.",
"filePath": "./output/document.md"
}
Usage: You can chain both tools together - first convert HTML to markdown, then save the result to a file.
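At the protocol level, chaining means two MCP tools/call requests, with the text returned by the first passed as content to the second. An MCP client normally builds these messages for you; the sketch below only illustrates their shape. The helper names are hypothetical and the conversion result is hard-coded for illustration, not real server output.

```javascript
let nextId = 1;

// Build a JSON-RPC 2.0 "tools/call" request in the shape MCP defines.
function toolCall(name, args) {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Concatenate the text parts of a tools/call result.
function textOf(result) {
  return result.content
    .filter((part) => part.type === "text")
    .map((part) => part.text)
    .join("");
}

// Step 1: ask the server to convert some HTML.
const convertRequest = toolCall("html_to_markdown", {
  html: "<h1>My Document</h1>",
  includeMetadata: false,
});

// Hypothetical result, hard-coded here instead of coming from the server.
const convertResult = { content: [{ type: "text", text: "# My Document" }] };

// Step 2: feed the converted text into save_markdown.
const saveRequest = toolCall("save_markdown", {
  content: textOf(convertResult),
  filePath: "./output/document.md",
});
```

For a page fetched by URL, passing saveToFile to html_to_markdown achieves the same result in a single call.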
When does it activate?
The MCP server will automatically be used by Claude when you:
- Ask to fetch information from a webpage
- Request to convert HTML to Markdown
- Need to extract content from a URL
- Ask to summarize or analyze a webpage
- Request to save markdown content to a file
Example prompts that trigger it:
- "What's on https://example.com?"
- "Fetch and summarize this article: https://..."
- "Convert this webpage to Markdown"
- "Extract the main content from this URL"
- "Save this webpage as a markdown file"
- "Fetch https://example.com and save it to article.md"
Local Development
If you want to contribute or modify the server:
# Clone the repository
git clone https://github.com/levz0r/html-to-markdown-mcp.git
cd html-to-markdown-mcp
# Install dependencies
npm install
# Run the server
npm start
Testing
Run the test suite using Node's built-in test runner:
# Run all tests
npm test
# Run tests in watch mode (re-runs on file changes)
npm run test:watch
The test suite includes:
- Tool discovery tests
- HTML to markdown conversion tests
- URL fetching tests
- File saving tests
- Truncation and large page handling tests
- SSRF protection tests
- Integration workflow tests
Publishing a New Version
The project uses automated CI/CD for publishing to npm:
- Update version using npm version scripts:
npm run version:patch # 1.0.0 -> 1.0.1
npm run version:minor # 1.0.0 -> 1.1.0
npm run version:major # 1.0.0 -> 2.0.0
- Push the tag to trigger automated publishing:
git push && git push --tags
- GitHub Actions will automatically:
  - Run all tests
  - Publish to npm if tests pass
  - Add provenance information to the package
Manual publishing (if needed):
npm run release:patch --otp=<code>
npm run release:minor --otp=<code>
npm run release:major --otp=<code>
Security
SSRF Protection
By default, the server blocks URL requests to private and internal network addresses to prevent Server-Side Request Forgery (SSRF) attacks. This includes:
- Loopback addresses (127.0.0.0/8, ::1)
- Private networks (10.0.0.0/8, 172.16.0.0/12, 192.168.0.0/16)
- Link-local / cloud metadata endpoints (169.254.0.0/16)
- Non-HTTP(S) schemes (file://, ftp://, etc.)
DNS resolution is checked to prevent bypass via hostnames that resolve to private IPs.
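The checks listed above can be sketched as a scheme test plus a CIDR membership test. This is an illustration under stated assumptions, not the server's implementation: all names are hypothetical, IPv6 handling is reduced to the ::1 literal, and any address the sketch cannot parse is treated as blocked. A real check would also resolve the hostname first and test every resolved address.

```javascript
// Blocked IPv4 ranges from the list above, as [base address, prefix length].
const BLOCKED_IPV4_CIDRS = [
  ["127.0.0.0", 8], // loopback
  ["10.0.0.0", 8], // private
  ["172.16.0.0", 12], // private
  ["192.168.0.0", 16], // private
  ["169.254.0.0", 16], // link-local / cloud metadata
];

// Convert dotted-quad IPv4 text to a number, or null if it is not valid IPv4.
function ipv4ToInt(ip) {
  const parts = ip.split(".");
  if (parts.length !== 4) return null;
  let value = 0;
  for (const part of parts) {
    if (!/^\d{1,3}$/.test(part)) return null;
    const octet = Number(part);
    if (octet > 255) return null;
    value = value * 256 + octet;
  }
  return value;
}

// True if ipInt falls inside base/prefix (base is assumed to be aligned).
function inCidr(ipInt, base, prefix) {
  const baseInt = ipv4ToInt(base);
  const size = 2 ** (32 - prefix);
  return ipInt >= baseInt && ipInt < baseInt + size;
}

// True if the address must not be fetched.
function isBlockedAddress(ip) {
  if (ip === "::1") return true;
  const ipInt = ipv4ToInt(ip);
  // Assumption: fail closed on anything this sketch cannot parse (e.g. other IPv6).
  if (ipInt === null) return true;
  return BLOCKED_IPV4_CIDRS.some(([base, prefix]) => inCidr(ipInt, base, prefix));
}

// True only for http: and https: URLs.
function isAllowedScheme(rawUrl) {
  const { protocol } = new URL(rawUrl);
  return protocol === "http:" || protocol === "https:";
}
```

Checking the resolved address rather than the hostname string is what closes the bypass mentioned above, where a public-looking name points at a private IP.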
Allowing Local Network Access
If you need to convert HTML from local or internal servers (e.g., a local dev server), you can opt in with the --allow-local flag or the ALLOW_LOCAL_NETWORK environment variable:
# Via CLI flag
npx html-to-markdown-mcp --allow-local
# Via environment variable
ALLOW_LOCAL_NETWORK=true npx html-to-markdown-mcp
Claude Desktop / Cursor configuration with local access:
{
"mcpServers": {
"html-to-markdown": {
"command": "npx",
"args": ["html-to-markdown-mcp", "--allow-local"]
}
}
}
Warning: Only enable local network access if you trust the AI agent's URL inputs. With this flag enabled, the server can reach internal services, localhost ports, and cloud metadata endpoints.
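The opt-in described above amounts to a single boolean derived from the command line or the environment. A minimal sketch, assuming the flag and variable names from this section; the function name is hypothetical, and treating only the exact string "true" as enabling the variable is an assumption.

```javascript
// Hypothetical helper: decide whether local network access is enabled.
// Assumption: only ALLOW_LOCAL_NETWORK=true (exact string) counts as opt-in.
function isLocalAccessAllowed(argv = process.argv.slice(2), env = process.env) {
  return argv.includes("--allow-local") || env.ALLOW_LOCAL_NETWORK === "true";
}
```

Keeping the default at false means a missing or misspelled setting leaves SSRF protection on.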
Technical Details
- Protocol: Model Context Protocol (MCP)
- Conversion Library: Turndown.js
- Transport: stdio
- Node.js: ES modules
Related Projects
This server uses the same conversion approach as markdown-printer, a browser extension for saving web pages as Markdown files.
License
MIT