oxidize-pdf MCP Server

Rust-powered PDF toolkit over MCP: create, read, and analyze PDFs; extract text and entities for RAG; convert to Markdown; split/merge/rotate/reorder pages; manage form fields and annotations; encrypt documents. Runs locally via uvx oxidize-mcp.

Documentation

oxidize-pdf

PyPI version CI License: MIT Python Typed MCP

Rust-powered PDF library for Python. Generate, parse, split, merge, and manipulate PDFs with native performance. Ships with a built-in MCP server so AI agents can work with PDFs out of the box.

No C dependencies. No Java. No subprocess calls.

Installation

pip install oxidize-pdf            # Core library
pip install "oxidize-pdf[mcp]"     # + MCP server for AI agents

Platforms: Linux (x86_64, aarch64) | macOS (x86_64, Apple Silicon) | Windows (x86_64) Requires: Python 3.10+

Why oxidize-pdf?

oxidize-pdfPure-Python libsC/Java wrappers
PerformanceNative (compiled Rust)InterpretedNative but heavy
DependenciesZeroVariesPoppler, Java, Ghostscript
Memory safetyRust ownership modelGC-dependentManual / GC
Type stubsFull (mypy/pyright)PartialRare
AI-ready (MCP)Built-inNoNo

MCP Server

Give your AI agent full PDF capabilities in one line:

oxidize-mcp

The built-in Model Context Protocol server exposes 12 tools, 6 resources, and 5 prompts — compatible with Claude, GPT, and any MCP client.

Claude Desktop integration

Add to your claude_desktop_config.json:

{
  "mcpServers": {
    "oxidize-pdf": {
      "command": "oxidize-mcp",
      "env": {
        "OXIDIZE_WORKSPACE": "/path/to/your/pdfs"
      }
    }
  }
}

Available tools

ToolWhat it does
read_pdfRead metadata — page count, version, encryption status, title, author
extract_textExtract text from all pages or a specific page
convert_pdfConvert to markdown, chunks, or RAG-optimized format
create_pdfCreate a new PDF with optional metadata
save_pdfSave a session to disk, with optional encryption
add_contentAdd pages, text, and graphics to a session
annotate_pdfAdd text annotations and highlights
manipulate_pdfSplit, merge, rotate, extract pages, reverse, overlay
manage_formsCreate, fill, read, and validate form fields
secure_pdfEncrypt, check permissions, verify signatures
extract_entitiesExtract structured entities from pages
analyze_pdfValidate structure, detect corruption, check PDF/A compliance

The server also exposes resources (session data, capabilities, version info) and prompts (guided workflows for summarization, data extraction, form filling, and more).

Configuration

OXIDIZE_WORKSPACE=/path/to/pdfs oxidize-mcp

Or start programmatically:

from oxidize_pdf.mcp.server import run
run()

Python API

Create a PDF

from oxidize_pdf import Document, Page, Font, Color

doc = Document()
doc.set_title("My Document")
doc.set_author("Jane Doe")

page = Page.a4()
page.set_font(Font.HELVETICA, 24.0)
page.set_text_color(Color.black())
page.text_at(72.0, 750.0, "Hello from oxidize-pdf!")

page.set_font(Font.TIMES_ROMAN, 12.0)
page.text_at(72.0, 700.0, "Generated with Python + Rust.")

doc.add_page(page)
doc.save("output.pdf")

Parse an existing PDF

from oxidize_pdf import PdfReader

reader = PdfReader.open("document.pdf")
print(f"Pages: {reader.page_count}, Version: {reader.version}")

for i, text in enumerate(reader.extract_text()):
    print(f"--- Page {i + 1} ---")
    print(text)

Operations

from oxidize_pdf import split_pdf, merge_pdfs, rotate_pdf, extract_pages

split_pdf("input.pdf", "output_dir/")                       # Split into individual pages
merge_pdfs(["part1.pdf", "part2.pdf"], "merged.pdf")         # Merge multiple PDFs
rotate_pdf("input.pdf", "rotated.pdf", 90)                   # Rotate all pages
extract_pages("input.pdf", "subset.pdf", [0, 2, 4])          # Extract specific pages

Graphics

from oxidize_pdf import Document, Page, Color

doc = Document()
page = Page.a4()

page.set_fill_color(Color.hex("#3498db"))
page.draw_rect(72.0, 700.0, 200.0, 100.0)
page.fill()

page.set_stroke_color(Color.red())
page.set_line_width(2.0)
page.draw_circle(300.0, 500.0, 50.0)
page.stroke()

doc.add_page(page)
doc.save("graphics.pdf")

Types

from oxidize_pdf import Color, Point, Rectangle, Margins, Font

# Colors
Color.rgb(1.0, 0.0, 0.0)          # RGB
Color.hex("#ff6600")               # Hex
Color.cmyk(0.0, 1.0, 1.0, 0.0)   # CMYK

# Geometry
Point(72.0, 720.0)
Rectangle.from_xywh(72.0, 72.0, 468.0, 648.0)
Margins.uniform(72.0)

# Fonts — all 14 standard PDF fonts
Font.HELVETICA    # Font.HELVETICA_BOLD
Font.TIMES_ROMAN  # Font.TIMES_BOLD
Font.COURIER      # Font.COURIER_BOLD

Error handling

from oxidize_pdf import PdfReader, PdfError, PdfIoError, PdfParseError

try:
    reader = PdfReader.open("missing.pdf")
except PdfIoError as e:
    print(f"I/O error: {e}")
except PdfParseError as e:
    print(f"Parse error: {e}")
except PdfError as e:
    print(f"PDF error: {e}")

Exception hierarchy: PdfError > PdfIoError, PdfParseError, PdfEncryptionError, PdfPermissionError

MCP Server

oxidize-pdf includes an MCP server that exposes PDF capabilities to AI assistants like Claude. Install with the mcp extra:

pip install oxidize-pdf[mcp]

Claude Desktop

Add this to your claude_desktop_config.json:

{
  "mcpServers": {
    "oxidize-pdf": {
      "command": "uvx",
      "args": ["--from", "oxidize-pdf[mcp]", "oxidize-mcp"]
    }
  }
}

Claude Code

claude mcp add oxidize-pdf -- uvx --from "oxidize-pdf[mcp]" oxidize-mcp

Available tools

ToolDescription
read_pdfOpen a PDF and get metadata (pages, version, encryption)
extract_textExtract text content from PDF pages
convert_pdfConvert between PDF versions
analyze_pdfAnalyze structure, fonts, images, and compliance
extract_entitiesExtract images and digital signatures
manipulate_pdfSplit, merge, rotate, extract, and reorder pages
annotate_pdfAdd text annotations, highlights, and stamps
manage_formsCreate, fill, and read PDF form fields
secure_pdfEncrypt, decrypt, and set document permissions
create_pdfCreate a new PDF document with pages
add_pdf_contentAdd text, shapes, and images to pages
save_pdfSave the document to file or bytes

Resources

  • oxidize://fonts — Available built-in PDF fonts
  • oxidize://page-sizes — Standard page sizes with dimensions
  • oxidize://capabilities — Server capabilities and tool listing
  • oxidize://version — Version information
  • oxidize://workspace — PDF files in the workspace directory
  • oxidize://session/{id} — Session data by ID

Known limitations

  • Encryption write support: Document.encrypt() configures encryption parameters but the underlying Rust library does not yet serialize the encryption dictionary to the PDF output. Reading encrypted PDFs works correctly.
  • CPython only: PyPy and GraalPy are not supported.

License

MIT — see LICENSE for details.