Mozilla Readability Parser
Extracts and transforms webpage content into clean, LLM-optimized Markdown using Mozilla's Readability algorithm.
Mozilla Readability Parser MCP Server
A Model Context Protocol (MCP) server that extracts and transforms webpage content into clean, LLM-optimized Markdown. It returns the article title, main content, excerpt, byline, and site name. It uses Mozilla's Readability algorithm to remove ads, navigation, footers, and other non-essential elements while preserving the core content structure.
Features
- Removes ads, navigation, footers and other non-essential content
- Converts the cleaned HTML into well-formatted Markdown using Turndown
- Returns article metadata (title, excerpt, byline, site name)
- Handles errors gracefully
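The features above can be read as a small pipeline: extract the article from fetched HTML, convert it to Markdown, attach metadata, and turn any failure into an error value instead of a crash. The following is only a sketch of that flow, not the server's actual source. The names `toParseResult`, `Pipeline`, `ExtractedArticle`, and `Outcome` are hypothetical, and the extraction and conversion steps are injected as plain functions so the block has no dependencies.

```typescript
// Result shape as documented in the Tool Reference.
interface ParseResult {
  title: string;
  content: string; // Markdown
  metadata: { excerpt: string; byline: string; siteName: string };
}

// Hypothetical shape of what the extraction step hands back.
interface ExtractedArticle {
  title: string;
  html: string;
  excerpt?: string;
  byline?: string;
  siteName?: string;
}

// Hypothetical dependency bundle: the two steps are passed in as functions.
interface Pipeline {
  extract(html: string, url: string): ExtractedArticle | null;
  toMarkdown(html: string): string;
}

// Assumed error convention: a tagged union rather than a thrown exception.
type Outcome =
  | { ok: true; result: ParseResult }
  | { ok: false; error: string };

function toParseResult(html: string, url: string, deps: Pipeline): Outcome {
  try {
    const article = deps.extract(html, url);
    if (article === null) {
      return { ok: false, error: `No readable content found at ${url}` };
    }
    return {
      ok: true,
      result: {
        title: article.title,
        content: deps.toMarkdown(article.html),
        metadata: {
          // Missing metadata falls back to an empty string (an assumption).
          excerpt: article.excerpt ?? "",
          byline: article.byline ?? "",
          siteName: article.siteName ?? "",
        },
      },
    };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { ok: false, error: `Failed to parse ${url}: ${message}` };
  }
}
```

In a real wiring, `extract` would presumably wrap `new Readability(new JSDOM(html, { url }).window.document).parse()` and `toMarkdown` would wrap `new TurndownService().turndown(html)`, with the HTML fetched beforehand via axios.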
Why Not Just Fetch?
Unlike simple fetch requests, this server:
- Extracts only relevant content using Mozilla's Readability algorithm
- Eliminates noise like ads, popups, and navigation menus
- Reduces token usage by removing unnecessary HTML/CSS
- Provides consistent Markdown formatting for better LLM processing
- Includes useful metadata about the content
Installation
Installing via Smithery
To install Mozilla Readability Parser for Claude Desktop automatically via Smithery:
npx -y @smithery/cli install server-moz-readability --client claude
Manual Installation
npm install server-moz-readability
Tool Reference
parse
Fetches and transforms webpage content into clean Markdown.
Arguments:
    {
      "url": {
        "type": "string",
        "description": "The website URL to parse",
        "required": true
      }
    }
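MCP clients invoke tools with a JSON-RPC 2.0 `tools/call` request. As a rough illustration of how the `url` argument travels to this tool, the sketch below builds such a request. The helper name `buildParseRequest` and the http/https check are assumptions made for the example, not part of this server.

```typescript
// JSON-RPC 2.0 envelope for an MCP tool invocation.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: validates the URL, then wraps it in a tools/call request.
function buildParseRequest(id: number, url: string): ToolCallRequest {
  // new URL() throws a TypeError when the input is not a valid absolute URL.
  const parsed = new URL(url);
  // Restricting to web URLs is an assumption of this sketch.
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
    throw new Error(`Unsupported protocol: ${parsed.protocol}`);
  }
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "parse", arguments: { url: parsed.href } },
  };
}

// Example: serialize a request for a placeholder article URL.
const exampleRequest = buildParseRequest(1, "https://example.com/article");
console.log(JSON.stringify(exampleRequest, null, 2));
```

Most users never write this by hand, since a client such as Claude Desktop builds the request itself. The sketch only shows where the argument ends up.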
Returns:
    {
      "title": "Article title",
      "content": "Markdown content...",
      "metadata": {
        "excerpt": "Brief summary",
        "byline": "Author information",
        "siteName": "Source website name"
      }
    }
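Code that consumes this result may want to check the shape before using it. The type guard below is a minimal sketch, assuming the payload arrives as JSON text and that every documented field is a string. A real response might omit some metadata or set it to null, so treat the strictness as an assumption. The name `isParseResult` is hypothetical.

```typescript
// Mirrors the documented return value of the parse tool.
interface ParseResult {
  title: string;
  content: string;
  metadata: { excerpt: string; byline: string; siteName: string };
}

// Hypothetical type guard: true only when every documented field is a string.
function isParseResult(value: unknown): value is ParseResult {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  if (typeof v.title !== "string" || typeof v.content !== "string") return false;
  if (typeof v.metadata !== "object" || v.metadata === null) return false;
  const meta = v.metadata as Record<string, unknown>;
  return ["excerpt", "byline", "siteName"].every(
    (key) => typeof meta[key] === "string",
  );
}

// Example: check a decoded payload before reading its fields.
const decoded: unknown = JSON.parse(
  '{"title":"Article title","content":"Markdown content...","metadata":' +
    '{"excerpt":"Brief summary","byline":"Author information","siteName":"Source website name"}}',
);
if (isParseResult(decoded)) {
  console.log(`${decoded.title} (${decoded.metadata.siteName})`);
}
```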
Usage with Claude Desktop
Add to your claude_desktop_config.json:
    {
      "mcpServers": {
        "readability": {
          "command": "npx",
          "args": ["-y", "server-moz-readability"]
        }
      }
    }
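If claude_desktop_config.json already lists other servers, the new entry should be merged in rather than pasted over the file. The sketch below shows one way to do that merge on an in-memory object. The function name `addReadabilityServer` and the types are illustrative, and reading or writing the file is left out.

```typescript
interface McpServerEntry {
  command: string;
  args: string[];
}

// Only the mcpServers key is modeled; other settings pass through untouched.
interface ClaudeDesktopConfig {
  mcpServers?: Record<string, McpServerEntry>;
  [key: string]: unknown;
}

// Hypothetical helper: returns a new config with the readability entry added,
// keeping existing servers and leaving the input object unmodified.
function addReadabilityServer(config: ClaudeDesktopConfig): ClaudeDesktopConfig {
  return {
    ...config,
    mcpServers: {
      ...(config.mcpServers ?? {}),
      readability: { command: "npx", args: ["-y", "server-moz-readability"] },
    },
  };
}

// Example: a config that already contains a made-up server named "other".
const existingConfig: ClaudeDesktopConfig = {
  mcpServers: { other: { command: "node", args: ["server.js"] } },
};
console.log(JSON.stringify(addReadabilityServer(existingConfig), null, 2));
```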
Dependencies
- @mozilla/readability - Content extraction
- turndown - HTML to Markdown conversion
- jsdom - DOM parsing
- axios - HTTP requests
License
MIT