Fetches and converts website content to Markdown with AI-powered cleanup, OpenAPI support, and stealth browsing.
A Model Context Protocol (MCP) server that fetches website content and converts it to Markdown, making website information easier for AI to understand and process.
Enhanced Processing | OpenAPI Support | Smart Analysis | Advanced Extraction |
---|---|---|---|
AI-powered content cleanup | OpenAPI 3.x/Swagger 2.0 | Reading time calculation | Main content detection |
Auto ad removal | Professional validation | Word count statistics | Language detection |
Content summarization | Structured API parsing | Smart retry mechanism | Multi-format support |
Feature | Status | Description |
---|---|---|
Enhanced Content Processor | ✅ | AI-powered content cleaning and extraction |
Smart Analytics | ✅ | Word count, reading time, content summary |
Language Detection | ✅ | Automatic language identification |
Intelligent Retry | ✅ | Smart retry mechanism with exponential backoff |
Stealth Browser | ✅ | Anti-detection browsing capabilities |
Rate Limiting | ✅ | Built-in rate limiting and concurrency control |
Content Cleanup | ✅ | Remove ads, navigation, and irrelevant content |
Enhanced Markdown | ✅ | Support for strikethrough, underline, highlights |
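The retry row above mentions exponential backoff. As a rough illustration of that idea (not the server's actual code), the sketch below computes the wait before each retry. The `BackoffOptions` fields, the `backoffDelays` name, and the numbers used are all assumptions made for this example.

```typescript
// Hypothetical helper: none of these names or values come from the project.
interface BackoffOptions {
  baseDelayMs: number; // wait before the first retry
  factor: number;      // multiplier applied after each failed attempt
  maxDelayMs: number;  // upper bound on any single wait
}

// Returns the wait time (in ms) before each retry, without jitter.
function backoffDelays(retries: number, opts: BackoffOptions): number[] {
  const delays: number[] = [];
  for (let attempt = 0; attempt < retries; attempt++) {
    const raw = opts.baseDelayMs * Math.pow(opts.factor, attempt);
    delays.push(Math.min(raw, opts.maxDelayMs));
  }
  return delays;
}

const delays = backoffDelays(4, { baseDelayMs: 500, factor: 2, maxDelayMs: 3000 });
// delays is [500, 1000, 2000, 3000]: doubling each time, capped at maxDelayMs
```

A real retry loop would usually add random jitter to these values so that many clients do not retry in lockstep.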
Easiest way: No local installation needed!
Create a my-websites.json file:
{
"websites": [
{
"name": "your_website",
"url": "https://your-website.com",
"description": "Your Project Website"
},
{
"name": "api_docs",
"url": "https://api.example.com/openapi.json",
"description": "Your API Specification"
}
]
}
Add to .cursor/mcp.json:
{
"mcpServers": {
"website-to-markdown": {
"command": "npx",
"args": ["-y", "website-to-markdown-mcp"],
"disabled": false,
"env": {
"WEBSITES_CONFIG_PATH": "./my-websites.json"
}
}
}
}
Please list all configured websites
Done! No installation required!
Best Practice: Use this method for development or customization!
git clone https://github.com/your-username/website-to-markdown-mcp.git
cd website-to-markdown-mcp
npm install
npm run build
Add to .cursor/mcp.json:
{
"mcpServers": {
"website-to-markdown": {
"command": "cmd",
"args": ["/c", "node", "./website-to-markdown-mcp/dist/index.js"],
"disabled": false,
"env": {
"WEBSITES_CONFIG_PATH": "./my-websites.json"
}
}
}
}
Every fetched page now includes:
# Example Website
**Source**: https://example.com
**Website**: example_site - Example Website
**Reading Time**: 5 minutes
**Word Count**: 1,250 words
**Language**: English
**Summary**: This article discusses the latest developments in web technology...
---
[Enhanced Markdown content with better formatting...]
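The word count and reading time in the header above could be derived along these lines. This is a sketch under an assumed reading speed of 250 words per minute, rounded up. That rate happens to reproduce the sample figures in this README (1,250 words in 5 minutes, 650 words in 3 minutes), but the rate the server actually uses is not stated here.

```typescript
// Assumption: 250 words per minute. The project's real setting is not documented here.
const WORDS_PER_MINUTE = 250;

// Counts whitespace-separated tokens. Languages written without spaces
// (for example Chinese or Japanese) would need a different strategy.
function countWords(text: string): number {
  const words = text.trim().split(/\s+/).filter((w) => w.length > 0);
  return words.length;
}

// Rounds up to whole minutes and never reports less than one minute.
function readingTimeMinutes(wordCount: number): number {
  return Math.max(1, Math.ceil(wordCount / WORDS_PER_MINUTE));
}
```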
Feature | OpenAPI 3.x | Swagger 2.0 | Description |
---|---|---|---|
Auto Detection | ✅ | ✅ | Support JSON/YAML formats |
Professional Validation | ✅ | ✅ | Using @readme/openapi-parser |
Structured Parsing | ✅ | ✅ | Endpoints, parameters, responses |
Reference Resolution | ✅ | ✅ | Auto handle $ref references |
Smart Summary | ✅ | ✅ | Generate API overview |
Formatted Output | ✅ | ✅ | Readable Markdown |
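To make the auto-detection row concrete: OpenAPI 3.x documents carry a top-level `openapi` version string, while Swagger 2.0 documents carry `swagger: "2.0"`. The sketch below classifies an already-parsed document on that basis. The `detectSpecKind` name and the `SpecKind` type are made up for this example, and the server's own detection logic may differ.

```typescript
type SpecKind = "openapi-3.x" | "swagger-2.0" | "unknown";

// Classifies an already-parsed JSON or YAML document by its version field.
// Hypothetical helper, not part of the project's API.
function detectSpecKind(doc: unknown): SpecKind {
  if (typeof doc !== "object" || doc === null) return "unknown";
  const fields = doc as { [key: string]: unknown };
  const openapi = fields["openapi"];
  if (typeof openapi === "string" && openapi.indexOf("3.") === 0) {
    return "openapi-3.x";
  }
  if (fields["swagger"] === "2.0") return "swagger-2.0";
  return "unknown";
}
```

Anything that matches neither field would be treated as an ordinary web page in this sketch.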
{
"websites": [
{
"name": "petstore_openapi",
"url": "https://petstore3.swagger.io/api/v3/openapi.json",
"description": "Swagger Petstore OpenAPI 3.0 Spec (Demo)"
},
{
"name": "petstore_swagger",
"url": "https://petstore.swagger.io/v2/swagger.json",
"description": "Swagger Petstore Swagger 2.0 Spec (Demo)"
},
{
"name": "github_api",
"url": "https://raw.githubusercontent.com/github/rest-api-description/main/descriptions/api.github.com/api.github.com.json",
"description": "GitHub REST API OpenAPI Spec"
}
]
}
⚠️ Important: Some dependencies require Node.js v20.18.1 or higher. Please update your Node.js version if you encounter engine compatibility warnings.
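A version requirement like this has to be compared numerically, part by part: as plain strings, "20.9.0" would sort after "20.18.1". The sketch below shows one way to do that comparison. `parseVersion` and `meetsMinimum` are hypothetical names, and pre-release suffixes are ignored for simplicity.

```typescript
interface Version {
  major: number;
  minor: number;
  patch: number;
}

// Accepts "v20.18.1" (the form Node.js reports in process.version) or "20.18.1".
function parseVersion(text: string): Version {
  const m = /^v?(\d+)\.(\d+)\.(\d+)/.exec(text);
  if (m === null) throw new Error("Unrecognized version: " + text);
  return {
    major: parseInt(m[1], 10),
    minor: parseInt(m[2], 10),
    patch: parseInt(m[3], 10),
  };
}

// True when `current` is the same as or newer than `minimum`.
function meetsMinimum(current: string, minimum: string): boolean {
  const a = parseVersion(current);
  const b = parseVersion(minimum);
  if (a.major !== b.major) return a.major > b.major;
  if (a.minor !== b.minor) return a.minor > b.minor;
  return a.patch >= b.patch;
}
```

In a Node.js script, the first argument would typically be `process.version`.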
# Global installation
npm install -g website-to-markdown-mcp
# Or use directly with npx (recommended)
npx website-to-markdown-mcp
# 1. Clone repository
git clone https://github.com/your-username/website-to-markdown-mcp.git
cd website-to-markdown-mcp
# 2. Install dependencies
npm install
# 3. Build project
npm run build
graph TD
A[Check Environment Variable<br/>WEBSITES_CONFIG_PATH] --> B{File exists?}
B -->|Yes| C[✅ Load External Config File]
B -->|No| D[Check Environment Variable<br/>WEBSITES_CONFIG]
D --> E{Valid JSON?}
E -->|Yes| F[✅ Load Embedded Config]
E -->|No| G[Check config.json]
G --> H{File exists?}
H -->|Yes| I[✅ Load Local Config]
H -->|No| J[Use Default Config]
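The lookup order in the diagram above can be sketched as a single function. This is an illustration, not the server's implementation: `resolveConfig`, `tryParse`, `ConfigSource`, and the empty `DEFAULT_CONFIG` are invented for the example. File access is passed in as a callback so the sketch stays self-contained, and treating an unreadable or invalid external file as "fall through to the next source" is an assumption.

```typescript
interface Website { name: string; url: string; description: string; }
interface WebsitesConfig { websites: Website[]; }
type ConfigSource = "external-file" | "embedded-env" | "local-file" | "default";

// Hypothetical stand-in for the server's built-in defaults.
const DEFAULT_CONFIG: WebsitesConfig = { websites: [] };

// Returns the parsed config, or undefined when the text is missing or not valid JSON.
function tryParse(text: string | undefined): WebsitesConfig | undefined {
  if (text === undefined) return undefined;
  try {
    return JSON.parse(text) as WebsitesConfig;
  } catch (_err) {
    return undefined;
  }
}

// `readFile` returns the file's text, or undefined when the file does not exist.
function resolveConfig(
  env: { [key: string]: string | undefined },
  readFile: (path: string) => string | undefined
): { source: ConfigSource; config: WebsitesConfig } {
  // 1. External file named by WEBSITES_CONFIG_PATH
  const externalPath = env["WEBSITES_CONFIG_PATH"];
  if (externalPath !== undefined) {
    const external = tryParse(readFile(externalPath));
    if (external !== undefined) return { source: "external-file", config: external };
  }
  // 2. JSON embedded in WEBSITES_CONFIG
  const embedded = tryParse(env["WEBSITES_CONFIG"]);
  if (embedded !== undefined) return { source: "embedded-env", config: embedded };
  // 3. config.json in the project root
  const local = tryParse(readFile("config.json"));
  if (local !== undefined) return { source: "local-file", config: local };
  // 4. Built-in defaults
  return { source: "default", config: DEFAULT_CONFIG };
}
```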
Advantages: Easy to edit, syntax highlighting, version control friendly
Create Configuration File
# Can be placed anywhere
touch my-api-configs.json
Edit Configuration Content
{
"websites": [
{
"name": "my_docs",
"url": "https://docs.example.com",
"description": "My Documentation Website"
}
]
}
Set Environment Variable
{
"env": {
"WEBSITES_CONFIG_PATH": "./my-api-configs.json"
}
}
{
"mcpServers": {
"website-to-markdown": {
"command": "cmd",
"args": ["/c", "node", "./website-to-markdown-mcp/dist/index.js"],
"disabled": false,
"env": {
"WEBSITES_CONFIG": "{\"websites\":[{\"name\":\"example\",\"url\":\"https://example.com\",\"description\":\"Example Website\"}]}"
}
}
}
}
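Writing the escaped `WEBSITES_CONFIG` string by hand is error-prone. One way to produce it, sketched here with the same example site, is to serialize the config once to get the variable's value and then serialize the surrounding object, which adds the backslash escapes. The variable names are illustrative only.

```typescript
const websitesConfig = {
  websites: [
    { name: "example", url: "https://example.com", description: "Example Website" },
  ],
};

// First pass: config object -> compact JSON text, used as the variable's value.
const embeddedValue = JSON.stringify(websitesConfig);

// Second pass: placing that text inside another JSON document escapes its quotes.
const envBlock = JSON.stringify({ env: { WEBSITES_CONFIG: embeddedValue } }, null, 2);
```

Here `envBlock` holds an `env` object whose `WEBSITES_CONFIG` value is escaped in the same way as the entry shown above, ready to paste into the server definition.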
Directly edit config.json in the project root directory:
{
"websites": [
{
"name": "local_site",
"url": "https://local.example.com",
"description": "Local Test Website"
}
]
}
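Whichever method supplies it, the configuration has the same shape: a `websites` array whose entries carry `name`, `url`, and `description`. A quick check before starting the server might look like the sketch below. `validateConfig` is a hypothetical helper, and treating all three fields as required non-empty strings is an assumption based on the examples in this README.

```typescript
// Returns a list of human-readable problems; an empty list means the config looks usable.
function validateConfig(text: string): string[] {
  let parsed: unknown;
  try {
    parsed = JSON.parse(text);
  } catch (_err) {
    return ["not valid JSON"];
  }
  if (typeof parsed !== "object" || parsed === null) {
    return ["top level must be an object"];
  }
  const websites = (parsed as { websites?: unknown }).websites;
  if (!Array.isArray(websites)) return ['"websites" must be an array'];

  const problems: string[] = [];
  websites.forEach((entry: unknown, index: number) => {
    // Non-object entries are treated as empty so every field is reported missing.
    const site = (typeof entry === "object" && entry !== null ? entry : {}) as {
      [key: string]: unknown;
    };
    ["name", "url", "description"].forEach((field) => {
      if (typeof site[field] !== "string" || site[field] === "") {
        problems.push("websites[" + index + "]." + field + " must be a non-empty string");
      }
    });
  });
  return problems;
}
```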
Tool Name | Function | Parameters | Example |
---|---|---|---|
fetch_website | Fetch any website | url: Website URL | Fetch OpenAPI spec files |
list_configured_websites | List configured websites | None | View all available websites |
Each configured website automatically generates corresponding dedicated tools:
- fetch_petstore_openapi - Fetch Petstore OpenAPI 3.0 spec
- fetch_petstore_swagger - Fetch Petstore Swagger 2.0 spec
- fetch_github_api - Fetch GitHub API spec
- fetch_tailwind_css - Fetch Tailwind CSS documentation
# Website Title
**Source**: https://example.com
**Website**: example_site - Example Website
**Reading Time**: 3 minutes
**Word Count**: 650 words
**Language**: English
**Summary**: This article provides a comprehensive overview of modern web development practices, covering frontend frameworks, backend technologies, and deployment strategies.
---
[Enhanced cleaned Markdown content with ads removed and main content extracted...]
# Example API (v2.1.0)
**Source**: https://api.example.com/openapi.json
**OpenAPI Version**: 3.0.3
**Validation Status**: ✅ Valid
**Processing Time**: 1.2 seconds
**Endpoints**: 25 endpoints
**Server Locations**: 3 servers
---
## API Basic Information
- **API Name**: Example API
- **Version**: 2.1.0
- **OpenAPI Version**: 3.0.3
- **Description**: A powerful example API for modern applications
## Servers
1. **https://api.example.com**
- Production server
2. **https://staging-api.example.com**
- Testing server
## API Endpoints
Total of **25** endpoints:
### `/users`
- **GET**: Get user list
- **POST**: Create new user
### `/users/{id}`
- **GET**: Get specific user
- **PUT**: Update user information
- **DELETE**: Delete user
## Components
- **Schemas**: 12 data models
- **Parameters**: 8 reusable parameters
- **Responses**: 15 reusable responses
- **Security Schemes**: 3 security mechanisms
Please fetch the content from https://docs.example.com and convert to markdown
Please use the fetch_petstore_openapi tool to fetch Petstore OpenAPI specification
Please fetch React official documentation content
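The dedicated tool named in the second prompt follows the pattern visible in this README: each configured website `name` becomes a tool called `fetch_` plus that name. A minimal sketch of the mapping, using website names from the OpenAPI example configuration (the `dedicatedToolName` function itself is hypothetical):

```typescript
// Derives the dedicated tool name for a configured website.
// Hypothetical helper; the pattern is inferred from the tool names listed in this README.
function dedicatedToolName(websiteName: string): string {
  return "fetch_" + websiteName;
}

const configuredNames = ["petstore_openapi", "petstore_swagger", "github_api"];
const toolNames = configuredNames.map((n) => dedicatedToolName(n));
// toolNames is ["fetch_petstore_openapi", "fetch_petstore_swagger", "fetch_github_api"]
```

Because the tool name is built from `name`, keeping configured names short and free of spaces makes the resulting tools easier to refer to in prompts.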
Complete Troubleshooting Guide: See TROUBLESHOOTING.md for detailed solutions to common issues.
Error: npm WARN EBADENGINE Unsupported engine
node --version
Error: Cannot find module './db.json'
npm cache clean --force
Q: Configuration changes not taking effect?
Q: JSON format errors?
Detailed logs are output to stderr at startup:
# View debug messages
npm run dev 2> debug.log
Package | Version | Purpose |
---|---|---|
@modelcontextprotocol/sdk | ^1.0.0 | MCP Core Framework |
@readme/openapi-parser | ^4.1.0 | Professional OpenAPI Parsing |
axios | ^1.6.0 | HTTP Request Handling |
cheerio | ^1.0.0 | HTML Parsing Engine |
turndown | ^7.1.2 | HTML to Markdown |
yaml | ^2.8.0 | YAML Format Support |
zod | ^3.22.0 | Data Validation Framework |
playwright | ^1.40.0 | Browser automation |
Major Feature Updates
git checkout -b feature/AmazingFeature
git commit -m 'Add some AmazingFeature'
git push origin feature/AmazingFeature
Report issues on the Issues page. Please include:
This project is licensed under the MIT License - see the LICENSE file for details.
Have questions or suggestions? Feel free to open an Issue!
Made by Sun ❤️ for the Developer Community