Extracts and transforms webpage content into clean, LLM-optimized Markdown using the Readability algorithm.
This project is based on the original server-moz-readability implementation by emzimmer. (For the original documentation, refer to the original README.md.)
This Python implementation adapts the original concept to run as a Python-based MCP server using FastMCP.
A Python implementation of the Model Context Protocol (MCP) server that extracts and transforms webpage content into clean, LLM-optimized Markdown.
Unlike simple fetch requests, this server:
git clone https://github.com/jmh108/MCP-server-readability-python.git
cd MCP-server-readability-python
python -m venv venv
source venv/bin/activate # On Windows use: venv\Scripts\activate
pip install -r requirements.txt
fastmcp run server.py
curl -X POST http://localhost:8000/tools/extract_content \
-H "Content-Type: application/json" \
-d '{"url": "https://example.com/article"}'
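The same request can be sent from Python with only the standard library. This is a minimal sketch, assuming the server exposes the HTTP endpoint and port shown in the curl example and replies with the JSON shape documented for the tool; `build_request` and `fetch_markdown` are illustrative helper names, not part of this project.

```python
import json
import urllib.request

# Assumption: endpoint path and port are copied from the curl example.
ENDPOINT = "http://localhost:8000/tools/extract_content"


def build_request(url: str, endpoint: str = ENDPOINT) -> urllib.request.Request:
    """Build the same JSON POST request that the curl command sends."""
    body = json.dumps({"url": url}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_markdown(url: str) -> str:
    """Send the request and return the value of the "content" field.

    Assumes the response body is JSON with a "content" key.
    """
    with urllib.request.urlopen(build_request(url), timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))["content"]
```

With the server running, `fetch_markdown("https://example.com/article")` should return the extracted Markdown as a string.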
extract_content
Fetches and transforms webpage content into clean Markdown.
Arguments:
{
"url": {
"type": "string",
"description": "The website URL to parse",
"required": true
}
}
Returns:
{
"content": "Markdown content..."
}
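To make the argument and return shapes above concrete, here is a small sketch of how a caller or handler might check them. It is not taken from server.py: `validate_arguments` and `make_result` are hypothetical helpers, and the http(s) check is an extra assumption beyond the documented schema.

```python
from urllib.parse import urlparse


def validate_arguments(arguments: dict) -> str:
    """Return the URL if the arguments match the schema, else raise ValueError."""
    # "url" is marked as required in the schema.
    if "url" not in arguments:
        raise ValueError("missing required argument: url")
    url = arguments["url"]
    # "url" is declared with type "string".
    if not isinstance(url, str):
        raise ValueError("url must be a string")
    # Extra check (an assumption, not in the schema): absolute http(s) URLs only.
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"not an absolute http(s) URL: {url!r}")
    return url


def make_result(markdown: str) -> dict:
    """Wrap extracted Markdown in the documented return shape."""
    return {"content": markdown}
```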
To configure the MCP server, add the following to your MCP settings file:
{
"mcpServers": {
"readability": {
"command": "fastmcp",
"args": ["run", "server.py"],
"env": {}
}
}
}
An MCP client can then start the server and access it via the extract_content tool.
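If the settings file already lists other servers, the entry can be merged in rather than pasted by hand. The sketch below assumes the settings file is plain JSON with a top-level "mcpServers" object, as in the configuration above; the file location depends on the MCP client, so the path is left to the caller, and both function names are illustrative.

```python
import json
from pathlib import Path

# The server entry from the configuration above.
READABILITY_ENTRY = {"command": "fastmcp", "args": ["run", "server.py"], "env": {}}


def add_readability_server(settings: dict) -> dict:
    """Return a copy of settings with the readability entry added.

    Existing servers and unrelated keys are preserved; the input is not mutated.
    """
    merged = dict(settings)
    servers = dict(merged.get("mcpServers", {}))
    servers["readability"] = READABILITY_ENTRY
    merged["mcpServers"] = servers
    return merged


def update_settings_file(path: Path) -> None:
    """Read the settings file (if it exists), add the entry, and write it back."""
    settings = json.loads(path.read_text()) if path.exists() else {}
    path.write_text(json.dumps(add_readability_server(settings), indent=2) + "\n")
```

Calling `update_settings_file(Path("path/to/your/mcp_settings.json"))` with the path your client uses would add the entry while leaving any other configured servers in place.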
MIT License - See LICENSE for details.