Mozilla Readability Parser
Extracts and transforms webpage content into clean, LLM-optimized Markdown using Mozilla's Readability algorithm.
Mozilla Readability Parser MCP Server
An model context protocol (MCP) server that extracts and transforms webpage content into clean, LLM-optimized Markdown. Returns article title, main content, excerpt, byline and site name. Uses Mozilla's Readability algorithm to remove ads, navigation, footers and non-essential elements while preserving the core content structure. More about MCP.
Features
- Removes ads, navigation, footers and other non-essential content
- Converts clean HTML into well-formatted Markdown (also uses Turndown)
- Returns article metadata (title, excerpt, byline, site name)
- Handles errors gracefully
Why Not Just Fetch?
Unlike simple fetch requests, this server:
- Extracts only relevant content using Mozilla's Readability algorithm
- Eliminates noise like ads, popups, and navigation menus
- Reduces token usage by removing unnecessary HTML/CSS
- Provides consistent Markdown formatting for better LLM processing
- Includes useful metadata about the content
Installation
Installing via Smithery
To install Mozilla Readability Parser for Claude Desktop automatically via Smithery:
npx -y @smithery/cli install server-moz-readability --client claude
Manual Installation
npm install server-moz-readability
Tool Reference
parse
Fetches and transforms webpage content into clean Markdown.
Arguments:
{ "url": { "type": "string", "description": "The website URL to parse", "required": true } }
Returns:
{ "title": "Article title", "content": "Markdown content...", "metadata": { "excerpt": "Brief summary", "byline": "Author information", "siteName": "Source website name" } }
Usage with Claude Desktop
Add to your claude_desktop_config.json:
{ "mcpServers": { "readability": { "command": "npx", "args": ["-y", "server-moz-readability"] } } }
Dependencies
- @mozilla/readability - Content extraction
- turndown - HTML to Markdown conversion
- jsdom - DOM parsing
- axios - HTTP requests
License
MIT
相關伺服器
Bright Data
贊助Discover, extract, and interact with the web - one interface powering automated access across the public internet.
Scrapeless
Integrate real-time Scrapeless Google SERP(Google Search, Google Flight, Google Map, Google Jobs....) results into your LLM applications. This server enables dynamic context retrieval for AI workflows, chatbots, and research tools.
yt-dlp-mcp
Download video and audio from various platforms like YouTube, Facebook, and TikTok using yt-dlp.
Fetch MCP Server
Fetches web content from a URL and converts it from HTML to markdown for easier consumption by LLMs.
Mention MCP Server
Monitor web and social media using the Mention API.
Crawl4AI RAG
Integrates web crawling and Retrieval-Augmented Generation (RAG) into AI agents and coding assistants.
brosh
A browser screenshot tool to capture scrolling screenshots of webpages using Playwright, with support for intelligent section identification and multiple output formats.
just-every/mcp-screenshot-website-fast
High-quality screenshot capture optimized for Claude Vision API. Automatically tiles full pages into 1072x1072 chunks (1.15 megapixels) with configurable viewports and wait strategies for dynamic content.
MCP Query Table
Query financial web tables from sources like iwencai, tdx, and eastmoney using Playwright.
Cloudflare Browser Rendering
Provides web context to LLMs using Cloudflare's Browser Rendering API.