WebforAI Text Extractor
Extracts plain text from web pages using WebforAI.
WebforAI Text Extractor - MCP Server
A Cloudflare Workers-based Model Context Protocol (MCP) server that extracts plain text from web pages using WebforAI.
🌟 What is WebforAI?
WebforAI is a powerful library designed to make web content accessible to AI models. It provides tools to:
- Convert HTML to clean, structured Markdown
- Extract meaningful content from web pages
- Process tables, links, and images intelligently
- Prepare web content for AI consumption
This MCP server leverages WebforAI's capabilities to extract plain text from any web page URL, making it easy to feed web content into AI models through the Model Context Protocol.
📋 Features
- Simple API: Extract text from any web page with a single API call
- Clean Output: Receive well-formatted Markdown text without HTML noise
- Error Handling: Robust error handling for failed requests
- Cloudflare Workers: Serverless deployment with global distribution
- MCP Compatible: Works with any MCP client like Claude Desktop or Cloudflare AI Playground
🚀 Getting Started
Deploy to Cloudflare Workers
Deploying this project to Cloudflare Workers makes your MCP server available at a URL like: webforai-mcp-server.<your-account>.workers.dev/sse
Local Development
- Clone this repository:
  git clone https://github.com/yutakobayashidev/webforai-mcp-server.git
  cd webforai-mcp-server
- Install dependencies:
  pnpm install
- Start the development server:
  pnpm dev
- Your server will be available at http://localhost:8787
🔧 Using the Text Extraction Tool
The extractWebPageText tool accepts a web page URL and returns the extracted text content in Markdown format. Its input is a JSON object with a single url field:
{
"url": "https://example.com/page"
}
The response will contain the extracted text in Markdown format, with:
- Links converted to plain text
- Tables converted to plain text
- Images hidden
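An MCP client normally builds the tool call for you, but it can help to see what goes over the wire. The following is a minimal sketch of the JSON-RPC 2.0 message that the MCP specification defines for `tools/call`, wrapping the input shown above. The helper name `buildExtractRequest` is hypothetical and not part of this repository, and the sketch assumes the session has already been initialized by the client.

```typescript
// Shape of a JSON-RPC 2.0 request for MCP's "tools/call" method.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: { url: string } };
}

// Hypothetical helper (not part of this repository): validates the URL and
// wraps it in a tools/call request for the extractWebPageText tool.
function buildExtractRequest(pageUrl: string, id = 1): ToolCallRequest {
  const parsed = new URL(pageUrl); // throws TypeError on malformed input
  if (parsed.protocol !== "http:" && parsed.protocol !== "https:") {
    throw new Error(`Unsupported protocol: ${parsed.protocol}`);
  }
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "extractWebPageText", arguments: { url: parsed.href } },
  };
}

const request = buildExtractRequest("https://example.com/page");
console.log(JSON.stringify(request, null, 2));
```

Validating the URL before sending keeps obviously malformed input from reaching the server, which would otherwise have to report the failure through its own error handling.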
🔌 Connecting to MCP Clients
Cloudflare AI Playground
- Go to Cloudflare AI Playground
- Enter your deployed MCP server URL (webforai-mcp-server.<your-account>.workers.dev/sse)
- You can now use your text extraction tool directly from the playground!
Claude Desktop
To connect to your MCP server from Claude Desktop:
- Follow Anthropic's Quickstart
- In Claude Desktop go to Settings > Developer > Edit Config
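- Update with this configuration. It points at the local development server; to use your deployed server instead, replace the URL with https://webforai-mcp-server.<your-account>.workers.dev/sse. The file must be valid JSON, so do not leave comments in it:
{
  "mcpServers": {
    "webforaiExtractor": {
      "command": "npx",
      "args": [
        "mcp-remote",
        "http://localhost:8787/sse"
      ]
    }
  }
}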
- Restart Claude and you should see the text extraction tool become available
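If you switch between the local and deployed servers often, generating the configuration can be less error-prone than editing it by hand, since the file has to be strict JSON. The following is a minimal sketch; `buildClaudeConfig` is a hypothetical helper, not something shipped with this repository, and the `/sse` path check simply mirrors the URLs used in this README.

```typescript
// One entry under "mcpServers" in the Claude Desktop config file.
interface McpServerEntry {
  command: string;
  args: string[];
}

interface ClaudeDesktopConfig {
  mcpServers: Record<string, McpServerEntry>;
}

// Hypothetical helper: builds the config object for a given SSE endpoint.
// The default entry name matches the one used in this README.
function buildClaudeConfig(
  sseUrl: string,
  name = "webforaiExtractor",
): ClaudeDesktopConfig {
  const parsed = new URL(sseUrl); // throws TypeError on malformed input
  if (!parsed.pathname.endsWith("/sse")) {
    throw new Error(`Expected a URL ending in /sse, got: ${parsed.pathname}`);
  }
  return {
    mcpServers: {
      [name]: { command: "npx", args: ["mcp-remote", parsed.href] },
    },
  };
}

// JSON.stringify always emits strict JSON, so no comments can slip in.
console.log(JSON.stringify(buildClaudeConfig("http://localhost:8787/sse"), null, 2));
```

The printed output can be pasted into the file opened by Edit Config, merging it with any servers you have already configured there.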
📄 License
MIT