MCP Screenshot
An MCP server that captures screenshots and performs OCR text recognition.
Features
- Screenshot capture (left half, right half, full screen)
- OCR text recognition (supports Japanese and English)
- Multiple output formats (JSON, Markdown, vertical, horizontal)
OCR Engines
This server uses two OCR engines:
- yomitoku
  - Primary OCR engine
  - High-accuracy Japanese text recognition
  - Runs as an API server
- Fallback OCR engine
  - Used when yomitoku is unavailable
  - Supports both Japanese and English recognition
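The primary/fallback arrangement described above can be sketched as follows. This is a minimal illustration, not the server's actual code: the `OcrEngine` type and `recognizeWithFallback` function are hypothetical names, and treating any thrown error as "unavailable" is an assumption about how the switch is triggered.

```typescript
// Hypothetical type: an OCR engine takes image bytes and resolves to recognized text.
type OcrEngine = (image: Uint8Array) => Promise<string>;

// Try the primary engine first. If it throws (assumed here to mean
// "unavailable", e.g. the API server cannot be reached), use the fallback.
async function recognizeWithFallback(
  image: Uint8Array,
  primary: OcrEngine,
  fallback: OcrEngine,
): Promise<{ text: string; engine: "primary" | "fallback" }> {
  try {
    return { text: await primary(image), engine: "primary" };
  } catch {
    return { text: await fallback(image), engine: "fallback" };
  }
}
```

Returning the engine label alongside the text makes it easy to see which engine produced a given result.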
Installation
npx -y @kazuph/mcp-screenshot
Claude Desktop Configuration
Add the following configuration to your claude_desktop_config.json. The OCR_API_URL entry sets the yomitoku API base URL:
{
  "mcpServers": {
    "screenshot": {
      "command": "npx",
      "args": ["-y", "@kazuph/mcp-screenshot"],
      "env": {
        "OCR_API_URL": "http://localhost:8000"
      }
    }
  }
}
Environment Variables
| Variable Name | Description | Default Value |
|---|---|---|
| OCR_API_URL | yomitoku API base URL | http://localhost:8000 |
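As a rough sketch of how a variable with a default like this is typically resolved, the helper below reads OCR_API_URL from an environment map and falls back to the documented default. The name `resolveOcrApiUrl` is hypothetical, and treating an empty or whitespace-only value as unset is an assumption, not documented behavior of the server.

```typescript
// Default value taken from the table above.
const DEFAULT_OCR_API_URL = "http://localhost:8000";

// Hypothetical helper: return OCR_API_URL from the given environment map,
// or the default when the variable is missing or blank (an assumption).
function resolveOcrApiUrl(env: Record<string, string | undefined>): string {
  const value = env["OCR_API_URL"]?.trim();
  return value ? value : DEFAULT_OCR_API_URL;
}
```

In a Node.js process this would usually be called as `resolveOcrApiUrl(process.env)`.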
Usage Example
You can use it by instructing Claude like this:
Please take a screenshot of the left half of the screen and recognize the text in it.
Tool Specification
capture
Takes a screenshot and performs OCR.
Options:
- region: Screenshot area ('left', 'right', or 'full'; default: 'left')
- format: Output format ('json', 'markdown', 'vertical', or 'horizontal'; default: 'markdown')
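The option values and defaults listed above could be modeled in TypeScript roughly as follows. The type names and the `withDefaults` helper are illustrative assumptions and are not taken from the package's source.

```typescript
// Allowed values, as listed in the tool specification.
type Region = "left" | "right" | "full";
type Format = "json" | "markdown" | "vertical" | "horizontal";

// Both options are optional in a call to the capture tool.
interface CaptureOptions {
  region?: Region;
  format?: Format;
}

// Hypothetical helper: fill in the documented defaults for omitted options.
function withDefaults(options: CaptureOptions = {}): Required<CaptureOptions> {
  return {
    region: options.region ?? "left",
    format: options.format ?? "markdown",
  };
}
```

For instance, `withDefaults({ region: "full" })` keeps the requested region and fills in 'markdown' as the format.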
License
MIT
Author
kazuph