MCP OCR Server
An MCP server for Optical Character Recognition (OCR) using the Tesseract engine.
A production-grade OCR server built using MCP (Model Context Protocol) that provides OCR capabilities through a simple interface.
Features
- Extract text from images using Tesseract OCR
- Support for multiple input types:
  - Local image files
  - Image URLs
  - Raw image bytes
- Automatic Tesseract installation on macOS and Linux
- Support for multiple languages
- Production-ready error handling
Installation
# Using pip
pip install mcp-ocr
# Using uv
uv pip install mcp-ocr
Tesseract is installed automatically where a supported package manager is available. On Windows it must be installed manually:
- macOS (via Homebrew)
- Linux (via apt, dnf, or pacman)
- Windows (manual installation; instructions are provided)
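The platform rules above can be sketched as a small lookup. This is an illustration only, not the package's actual installer: the function name `suggest_install_command`, the `INSTALL_COMMANDS` table, and the exact flags are assumptions, although the package names shown are the usual ones for each package manager.

```python
import platform
import shutil

# Assumed mapping from package manager to install command. The real
# installer in mcp-ocr may use different commands or flags.
INSTALL_COMMANDS = {
    "brew": ["brew", "install", "tesseract"],
    "apt-get": ["sudo", "apt-get", "install", "-y", "tesseract-ocr"],
    "dnf": ["sudo", "dnf", "install", "-y", "tesseract"],
    "pacman": ["sudo", "pacman", "-S", "--noconfirm", "tesseract"],
}


def suggest_install_command(system=None, which=shutil.which):
    """Return an install command for Tesseract, or None if a manual install is needed."""
    system = system or platform.system()
    if system == "Darwin":
        candidates = ["brew"]
    elif system == "Linux":
        candidates = ["apt-get", "dnf", "pacman"]
    else:
        # Windows and anything else: no automatic path in this sketch.
        return None
    for manager in candidates:
        # `which` returns a path when the executable is on PATH, else None.
        if which(manager):
            return INSTALL_COMMANDS[manager]
    return None


if __name__ == "__main__":
    if shutil.which("tesseract"):
        print("tesseract is already on PATH")
    else:
        print(suggest_install_command())
```

Passing `which` as a parameter keeps the lookup easy to test without touching the real PATH.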
Usage
As an MCP Server
- Start the server:
python -m mcp_ocr
- Configure Claude for Desktop:
Add to ~/Library/Application Support/Claude/claude_desktop_config.json:
{
  "mcpServers": {
    "ocr": {
      "command": "python",
      "args": ["-m", "mcp_ocr"]
    }
  }
}
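If the config file already lists other servers, the entry has to be merged in rather than pasted over the whole file. A minimal sketch of that merge follows; the helper name `add_ocr_server` is hypothetical and not part of mcp-ocr, and it assumes the file, if present, contains valid JSON.

```python
import json
from pathlib import Path


def add_ocr_server(config_path):
    """Add the "ocr" entry to a Claude Desktop config file, keeping other entries."""
    path = Path(config_path).expanduser()
    # Start from the existing config when there is one, else from an empty object.
    config = json.loads(path.read_text()) if path.exists() else {}
    servers = config.setdefault("mcpServers", {})
    servers["ocr"] = {"command": "python", "args": ["-m", "mcp_ocr"]}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2))
    return config


if __name__ == "__main__":
    add_ocr_server("~/Library/Application Support/Claude/claude_desktop_config.json")
```

Restart Claude for Desktop after editing the file so the new server entry is picked up.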
Available Tools
perform_ocr
Extract text from images:
# From file
perform_ocr("/path/to/image.jpg")
# From URL
perform_ocr("https://example.com/image.jpg")
# From bytes
perform_ocr(image_bytes)
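Since one tool accepts a path, a URL, or raw bytes, the server has to tell the three apart before loading the image. One way that dispatch might look is sketched below. The function `classify_input` is hypothetical and the actual implementation in mcp-ocr may differ.

```python
from pathlib import Path
from urllib.parse import urlparse


def classify_input(source):
    """Return "bytes", "url", or "file" for a perform_ocr-style argument."""
    if isinstance(source, (bytes, bytearray)):
        return "bytes"
    if isinstance(source, (str, Path)):
        text = str(source)
        # Only http(s) strings are treated as URLs; everything else is a path.
        if urlparse(text).scheme in ("http", "https"):
            return "url"
        return "file"
    raise TypeError(f"unsupported input type: {type(source).__name__}")


if __name__ == "__main__":
    for example in ("/path/to/image.jpg", "https://example.com/image.jpg", b"\x89PNG"):
        print(classify_input(example))
```

Checking the URL scheme explicitly keeps Windows paths such as C:\images\scan.png from being mistaken for URLs.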
get_supported_languages
List available OCR languages:
get_supported_languages()
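The available languages depend on which Tesseract language packs are installed on the machine. As a rough sketch of where such a list could come from, the snippet below parses the output of the `tesseract --list-langs` command. The helper names are hypothetical, and whether mcp-ocr obtains the list this way is an assumption.

```python
import subprocess


def parse_language_list(output):
    """Extract language codes from `tesseract --list-langs` output."""
    lines = [line.strip() for line in output.splitlines()]
    # The first line is a header such as 'List of available languages ...'.
    return [line for line in lines if line and not line.startswith("List of")]


def list_tesseract_languages():
    """Run the tesseract CLI and return the installed language codes."""
    result = subprocess.run(
        ["tesseract", "--list-langs"],
        capture_output=True,
        text=True,
        check=True,
    )
    # Some older builds print the list on stderr, so fall back to it.
    return parse_language_list(result.stdout or result.stderr)


if __name__ == "__main__":
    print(list_tesseract_languages())
```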
Development
- Clone the repository:
git clone https://github.com/rjn32s/mcp-ocr.git
cd mcp-ocr
- Set up development environment:
uv venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
uv pip install -e .
- Run tests:
pytest
Contributing
- Fork the repository
- Create your feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
Security
- Never commit API tokens or sensitive credentials
- Use environment variables or secure credential storage
- Follow GitHub's security best practices
License
This project is licensed under the MIT License - see the LICENSE file for details.