MCP Server Convert

Document conversion MCP server — PDF, DOCX, HTML, EPUB to Markdown with 6 tools and Docker support

mcp-server-convert

A lightweight Model Context Protocol (MCP) server that converts documents to Markdown. It supports PDF, DOCX, HTML, EPUB, CSV, JSON, and plain text files, plus the additional formats listed under Supported Formats.

Perfect for AI agents that need to ingest and understand document content.

Features

  • 📄 Multi-format support: PDF, DOCX, HTML, EPUB, CSV, JSON, images (via OCR), and plain text
  • 🔧 6 MCP tools: convert_file, convert_url, list_supported_formats, batch_convert, extract_metadata, convert_directory
  • 🐍 Minimal core dependencies: Python standard library plus markdownify for HTML; other formats rely on the libraries listed under Supported Formats
  • ⚡ Fast: In-memory processing, no temp files
  • 🐳 Docker-ready: Single Dockerfile, one-command deploy

Quick Start

Install & Run

# Clone
git clone https://github.com/demo112/mcp-server-convert.git
cd mcp-server-convert

# Install dependencies
pip install -r requirements.txt

# Run
python -m mcp_server_convert

Configure in Claude Code

Add to your MCP settings (~/.claude/settings.json):

{
  "mcpServers": {
    "convert": {
      "command": "python",
      "args": ["-m", "mcp_server_convert"],
      "cwd": "/path/to/mcp-server-convert"
    }
  }
}

Docker

docker build -t mcp-server-convert .
docker run -i --rm mcp-server-convert

Configure with Docker

{
  "mcpServers": {
    "convert": {
      "command": "docker",
      "args": ["run", "-i", "--rm", "-v", "/path/to/files:/data", "mcp-server-convert"]
    }
  }
}
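With the volume mount above, files under /path/to/files on the host appear under /data inside the container, so a path passed to a tool has to be the container-side path. The sketch below shows that translation; the helper name to_container_path and the sample file are hypothetical, and the two roots simply mirror the -v argument in the config.

```python
from pathlib import PurePosixPath

# Assumed mount, taken from the "-v /path/to/files:/data" argument in the config.
HOST_ROOT = PurePosixPath("/path/to/files")
CONTAINER_ROOT = PurePosixPath("/data")


def to_container_path(host_path: str) -> str:
    """Translate a host path under HOST_ROOT to its path inside the container."""
    # relative_to raises ValueError when host_path is outside the mounted directory.
    relative = PurePosixPath(host_path).relative_to(HOST_ROOT)
    return str(CONTAINER_ROOT / relative)


# Hypothetical file used only for illustration.
print(to_container_path("/path/to/files/reports/q1.pdf"))  # /data/reports/q1.pdf
```

Files outside the mounted directory are not visible to the container, which is why the helper raises instead of guessing a path.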

Tools

convert_file

Convert a local file to Markdown.

Parameters:

  • file_path (string, required): Absolute path to the file
  • max_length (int, optional): Maximum output length in chars (default: 50000)
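MCP clients invoke tools by sending a JSON-RPC 2.0 request with the method tools/call. As a rough illustration of what a client might send for convert_file, the sketch below builds such a message by hand; the helper build_tool_call and the file path are hypothetical, and in practice an MCP client library handles this (including the initialization handshake that precedes any tool call).

```python
import json


def build_tool_call(request_id: int, tool: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    return json.dumps(message)


# The path is a placeholder; max_length overrides the documented default of 50000.
request = build_tool_call(
    1,
    "convert_file",
    {"file_path": "/data/report.pdf", "max_length": 20000},
)
print(request)
```

The other tools take the same envelope; only the name and arguments fields change.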

convert_url

Fetch a URL and convert its content to Markdown.

Parameters:

  • url (string, required): URL to fetch and convert
  • max_length (int, optional): Maximum output length in chars (default: 50000)

batch_convert

Convert multiple files at once.

Parameters:

  • file_paths (array of strings, required): List of file paths
  • max_length_per_file (int, optional): Max length per file (default: 50000)
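Because the limit applies per file, the combined output of a batch can reach roughly the number of files times max_length_per_file. One way a caller might keep a batch within a fixed context budget is sketched below; the helper per_file_limit, the budget figure, and the file paths are all assumptions for illustration, not part of the server.

```python
def per_file_limit(total_budget: int, file_count: int, ceiling: int = 50_000) -> int:
    """Split a total character budget evenly across files, capped at the tool default."""
    if file_count <= 0:
        raise ValueError("file_count must be positive")
    return min(ceiling, total_budget // file_count)


# Hypothetical container-side paths.
file_paths = ["/data/a.pdf", "/data/b.docx", "/data/c.html", "/data/d.epub"]
arguments = {
    "file_paths": file_paths,
    "max_length_per_file": per_file_limit(120_000, len(file_paths)),
}
print(arguments["max_length_per_file"])  # 30000
```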

convert_directory

Convert all supported files in a directory.

Parameters:

  • dir_path (string, required): Path to directory
  • recursive (bool, optional): Include subdirectories (default: true)
  • max_files (int, optional): Maximum files to convert (default: 20)
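To make the interaction of recursive and max_files concrete, here is a minimal sketch of how file selection for such a tool could work. It is not the server's actual implementation: the function select_files, the sorting order, and the extension set (copied from the Supported Formats section) are assumptions.

```python
from pathlib import Path

# Extensions copied from the Supported Formats section; the real server may differ.
SUPPORTED = {
    ".pdf", ".docx", ".html", ".htm", ".epub", ".csv", ".json", ".xml",
    ".xlsx", ".pptx", ".txt", ".md", ".rst", ".log", ".png", ".jpg",
}


def select_files(dir_path: str, recursive: bool = True, max_files: int = 20) -> list:
    """Return up to max_files supported files, optionally descending into subdirectories."""
    root = Path(dir_path)
    candidates = root.rglob("*") if recursive else root.glob("*")
    matches = sorted(
        p for p in candidates if p.is_file() and p.suffix.lower() in SUPPORTED
    )
    # Files beyond the cap are skipped, not converted partially.
    return matches[:max_files]


for path in select_files(".", recursive=False, max_files=5):
    print(path)
```

With the documented defaults, a large tree is cut off after 20 files, so a caller that needs everything would raise max_files explicitly.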

extract_metadata

Extract metadata from a file without full conversion.

Parameters:

  • file_path (string, required): Path to the file

list_supported_formats

List all supported file extensions and their conversion methods.

Supported Formats

| Format | Extension | Method |
|---|---|---|
| PDF | .pdf | PyMuPDF (fitz) |
| Word | .docx | python-docx |
| HTML | .html, .htm | markdownify |
| EPUB | .epub | ebooklib |
| CSV | .csv | pandas → markdown table |
| JSON | .json | Formatted markdown code block |
| XML | .xml | xmltodict → markdown |
| Excel | .xlsx | openpyxl → markdown table |
| PowerPoint | .pptx | python-pptx → markdown slides |
| Text | .txt, .md, .rst, .log | Direct passthrough |
| Images | .png, .jpg | pytesseract OCR (if available) |
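Image conversion is marked "if available" because pytesseract is a Python wrapper around the separate Tesseract binary, and both need to be present. A quick way to check an environment before relying on OCR is sketched below; the function name ocr_available is hypothetical, and the server's own detection logic may differ.

```python
import importlib.util
import shutil


def ocr_available() -> bool:
    """Report whether both the pytesseract package and the tesseract binary are present."""
    has_wrapper = importlib.util.find_spec("pytesseract") is not None
    has_binary = shutil.which("tesseract") is not None
    return has_wrapper and has_binary


print("OCR available:", ocr_available())
```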

License

MIT
