Mozilla Readability Parser
Extracts and transforms webpage content into clean, LLM-optimized Markdown using Mozilla's Readability algorithm.
Mozilla Readability Parser MCP Server
A Model Context Protocol (MCP) server that extracts and transforms webpage content into clean, LLM-optimized Markdown. It returns the article title, main content, excerpt, byline, and site name. It uses Mozilla's Readability algorithm to remove ads, navigation, footers, and other non-essential elements while preserving the core content structure.
Features
- Removes ads, navigation, footers and other non-essential content
- Converts the cleaned HTML into well-formatted Markdown using Turndown
- Returns article metadata (title, excerpt, byline, site name)
- Handles errors gracefully
Why Not Just Fetch?
Unlike simple fetch requests, this server:
- Extracts only relevant content using Mozilla's Readability algorithm
- Eliminates noise like ads, popups, and navigation menus
- Reduces token usage by removing unnecessary HTML/CSS
- Provides consistent Markdown formatting for better LLM processing
- Includes useful metadata about the content
Installation
Installing via Smithery
To install Mozilla Readability Parser for Claude Desktop automatically via Smithery:
npx -y @smithery/cli install server-moz-readability --client claude
Manual Installation
npm install server-moz-readability
Tool Reference
parse
Fetches and transforms webpage content into clean Markdown.
Arguments:
{
  "url": {
    "type": "string",
    "description": "The website URL to parse",
    "required": true
  }
}
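Since `url` is the only argument and it is required, a client may want to check it before calling `parse`. The sketch below is an assumption about reasonable client-side validation, not the server's own code: `ParseArgs` and `validateParseArgs` are hypothetical names, and restricting input to http(s) URLs is a choice made here for illustration. It relies only on the standard `URL` class.

```typescript
// Hypothetical client-side shape of the `parse` tool arguments.
interface ParseArgs {
  url: string;
}

// Validate untrusted input before sending it to the tool.
// Assumption: only absolute http(s) URLs make sense for a web page parser.
function validateParseArgs(input: unknown): ParseArgs {
  if (typeof input !== "object" || input === null) {
    throw new TypeError("arguments must be an object");
  }
  const url = (input as Record<string, unknown>).url;
  if (typeof url !== "string" || url.trim() === "") {
    throw new TypeError('"url" is required and must be a non-empty string');
  }
  let parsed: URL;
  try {
    parsed = new URL(url); // throws on relative or malformed URLs
  } catch {
    throw new TypeError(`"url" is not a valid absolute URL: ${url}`);
  }
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
    throw new TypeError(`unsupported protocol: ${parsed.protocol}`);
  }
  // Return the normalized form so callers always send a canonical URL.
  return { url: parsed.href };
}
```

Rejecting bad input on the client side keeps an obviously invalid request from costing a round trip to the server.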
Returns:
{
  "title": "Article title",
  "content": "Markdown content...",
  "metadata": {
    "excerpt": "Brief summary",
    "byline": "Author information",
    "siteName": "Source website name"
  }
}
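The documented return shape maps naturally onto a type. The sketch below is illustrative only: `ParseResult`, `isParseResult`, and `toPromptContext` are hypothetical names. It assumes the tool's response has already been decoded into the JSON object shown above, and that individual metadata fields may be absent or null for some pages.

```typescript
// Hypothetical type mirroring the documented return value of `parse`.
interface ParseResult {
  title: string;
  content: string; // Markdown body
  metadata: {
    // Assumption: any metadata field may be missing for a given page.
    excerpt?: string | null;
    byline?: string | null;
    siteName?: string | null;
  };
}

// Runtime check for data that arrives as untyped JSON.
function isParseResult(value: unknown): value is ParseResult {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.title === "string" &&
    typeof v.content === "string" &&
    typeof v.metadata === "object" &&
    v.metadata !== null
  );
}

// Combine title, attribution, and Markdown body into one block of text.
function toPromptContext(result: ParseResult): string {
  const { byline, siteName } = result.metadata;
  const source = [byline, siteName]
    .filter((s): s is string => typeof s === "string" && s.length > 0)
    .join(", ");
  const header = source
    ? `# ${result.title}\n_${source}_`
    : `# ${result.title}`;
  return `${header}\n\n${result.content}`;
}
```

Keeping the attribution line next to the body lets a downstream model cite the source without a second lookup.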
Usage with Claude Desktop
Add to your claude_desktop_config.json:
{
  "mcpServers": {
    "readability": {
      "command": "npx",
      "args": ["-y", "server-moz-readability"]
    }
  }
}
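If `claude_desktop_config.json` already lists other servers, the new entry has to be merged into the existing `mcpServers` object rather than pasted over the file. A minimal sketch of that merge as a pure function follows. `McpServerEntry`, `McpConfig`, and `addReadabilityServer` are hypothetical helpers, not part of this package, and reading or writing the file itself is left to the caller.

```typescript
// Hypothetical types for the parts of the config this sketch touches.
interface McpServerEntry {
  command: string;
  args?: string[];
}

interface McpConfig {
  mcpServers?: Record<string, McpServerEntry>;
  [key: string]: unknown; // other settings are preserved untouched
}

// Return a new config with the "readability" entry added or replaced.
// The input object is not mutated.
function addReadabilityServer(config: McpConfig): McpConfig {
  return {
    ...config,
    mcpServers: {
      ...(config.mcpServers ?? {}),
      readability: {
        command: "npx",
        args: ["-y", "server-moz-readability"],
      },
    },
  };
}
```

Returning a copy rather than editing in place makes it easy to diff the old and new configuration before saving.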
Dependencies
- @mozilla/readability - Content extraction
- turndown - HTML to Markdown conversion
- jsdom - DOM parsing
- axios - HTTP requests
License
MIT