Mozilla Readability Parser
Extracts and transforms webpage content into clean, LLM-optimized Markdown using Mozilla's Readability algorithm.
Mozilla Readability Parser MCP Server
A Model Context Protocol (MCP) server that extracts webpage content and transforms it into clean, LLM-optimized Markdown. It returns the article title, main content, excerpt, byline, and site name. It uses Mozilla's Readability algorithm to remove ads, navigation, footers, and other non-essential elements while preserving the structure of the core content.
Features
- Removes ads, navigation, footers and other non-essential content
- Converts the cleaned HTML into well-formatted Markdown using Turndown
- Returns article metadata (title, excerpt, byline, site name)
- Handles errors gracefully
Why Not Just Fetch?
Unlike simple fetch requests, this server:
- Extracts only relevant content using Mozilla's Readability algorithm
- Eliminates noise like ads, popups, and navigation menus
- Reduces token usage by removing unnecessary HTML/CSS
- Provides consistent Markdown formatting for better LLM processing
- Includes useful metadata about the content
Installation
Installing via Smithery
To install Mozilla Readability Parser for Claude Desktop automatically via Smithery:
npx -y @smithery/cli install server-moz-readability --client claude
Manual Installation
npm install server-moz-readability
Tool Reference
parse
Fetches and transforms webpage content into clean Markdown.
Arguments:
{
  "url": {
    "type": "string",
    "description": "The website URL to parse",
    "required": true
  }
}
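The schema above only states that `url` is a required string. As a minimal sketch of how a caller might check arguments before invoking `parse`, the helper below rejects missing values and non-HTTP(S) URLs. The name `validateParseArgs` and the restriction to `http:`/`https:` are assumptions for illustration, not behavior documented by this server.

```typescript
// Hypothetical helper: checks arguments intended for the `parse` tool.
interface ParseArgs {
  url: string;
}

function validateParseArgs(args: unknown): ParseArgs {
  if (typeof args !== "object" || args === null) {
    throw new Error("arguments must be an object");
  }
  const url = (args as Record<string, unknown>).url;
  if (typeof url !== "string" || url.length === 0) {
    throw new Error("`url` is required and must be a non-empty string");
  }
  let parsed: URL;
  try {
    parsed = new URL(url); // throws on malformed input
  } catch {
    throw new Error(`not a valid URL: ${url}`);
  }
  // Assumption: only web pages served over HTTP(S) make sense here.
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
    throw new Error(`unsupported protocol: ${parsed.protocol}`);
  }
  return { url: parsed.toString() };
}
```

Calling `validateParseArgs({ url: "ftp://example.com/file" })` throws under these assumptions, while an `https://` URL passes through in normalized form.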
Returns:
{
  "title": "Article title",
  "content": "Markdown content...",
  "metadata": {
    "excerpt": "Brief summary",
    "byline": "Author information",
    "siteName": "Source website name"
  }
}
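To illustrate how a result of this shape could be assembled from already-fetched HTML, here is a sketch under stated assumptions. The type names and `buildParseResult` are hypothetical and are not taken from the server's source. The extraction and Markdown steps are passed in as functions so the block has no third-party imports; the comment shows roughly how the listed dependencies might be plugged in.

```typescript
// Hypothetical shape of what the extraction step yields.
interface ExtractedArticle {
  title: string;
  content: string; // cleaned HTML
  excerpt: string;
  byline: string | null;
  siteName: string | null;
}

// Mirrors the "Returns" structure documented above.
interface ParseResult {
  title: string;
  content: string; // Markdown
  metadata: {
    excerpt: string;
    byline: string | null;
    siteName: string | null;
  };
}

// Injected steps. With the listed dependencies they might be wired roughly as:
//   extract    = (html, url) =>
//     new Readability(new JSDOM(html, { url }).window.document).parse()
//   toMarkdown = (html) => new TurndownService().turndown(html)
type Extract = (html: string, url: string) => ExtractedArticle | null;
type ToMarkdown = (html: string) => string;

function buildParseResult(
  html: string,
  url: string,
  extract: Extract,
  toMarkdown: ToMarkdown
): ParseResult {
  const article = extract(html, url);
  if (article === null) {
    // Readability's parse() returns null when it finds no article.
    throw new Error(`Could not extract readable content from ${url}`);
  }
  return {
    title: article.title,
    content: toMarkdown(article.content),
    metadata: {
      excerpt: article.excerpt,
      byline: article.byline,
      siteName: article.siteName,
    },
  };
}
```

Keeping the two steps injectable also makes the assembly logic easy to test with stubs, independent of network access or a DOM implementation.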
Usage with Claude Desktop
Add to your claude_desktop_config.json:
{
  "mcpServers": {
    "readability": {
      "command": "npx",
      "args": ["-y", "server-moz-readability"]
    }
  }
}
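If claude_desktop_config.json already lists other servers, the entry has to be merged in rather than pasted over the file. The following sketch shows one way to do that on the parsed JSON object; `addReadabilityServer` and the two interfaces are hypothetical names, and the entry type is deliberately minimal (only the `command` and `args` fields shown above).

```typescript
// Minimal, hypothetical typing of the config file's relevant parts.
interface McpServerEntry {
  command: string;
  args: string[];
}

interface ClaudeDesktopConfig {
  mcpServers?: Record<string, McpServerEntry>;
  [key: string]: unknown; // leave any other settings untouched
}

// Returns a new config with the "readability" entry added,
// keeping existing servers and top-level settings intact.
function addReadabilityServer(config: ClaudeDesktopConfig): ClaudeDesktopConfig {
  return {
    ...config,
    mcpServers: {
      ...(config.mcpServers ?? {}),
      readability: {
        command: "npx",
        args: ["-y", "server-moz-readability"],
      },
    },
  };
}
```

The result can be written back with `JSON.stringify(updated, null, 2)`. Note that an existing entry named `readability` would be replaced.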
Dependencies
- @mozilla/readability - Content extraction
- turndown - HTML to Markdown conversion
- jsdom - DOM parsing
- axios - HTTP requests
License
MIT