MCP Screenshot
Captures screenshots and performs OCR text recognition.
MCP Screenshot
An MCP server that captures screenshots and performs OCR text recognition.
Features
- Screenshot capture (left half, right half, full screen)
- OCR text recognition (supports Japanese and English)
- Multiple output formats (JSON, Markdown, vertical, horizontal)
OCR Engines
This server uses two OCR engines:
-
- Primary OCR engine
- High-accuracy Japanese text recognition
- Runs as an API server
-
- Fallback OCR engine
- Used when yomitoku is unavailable
- Supports both Japanese and English recognition
Installation
npx -y @kazuph/mcp-screenshot
Claude Desktop Configuration
Add the following configuration to your claude_desktop_config.json:
{
"mcpServers": {
"screenshot": {
"command": "npx",
"args": ["-y", "@kazuph/mcp-screenshot"],
"env": {
"OCR_API_URL": "http://localhost:8000" // yomitoku API base URL
}
}
}
}
Environment Variables
| Variable Name | Description | Default Value |
|---|---|---|
| OCR_API_URL | yomitoku API base URL | http://localhost:8000 |
Usage Example
You can use it by instructing Claude like this:
Please take a screenshot of the left half of the screen and recognize the text in it.
Tool Specification
capture
Takes a screenshot and performs OCR.
Options:
region: Screenshot area ('left'/'right'/'full', default: 'left')format: Output format ('json'/'markdown'/'vertical'/'horizontal', default: 'markdown')
License
MIT
Author
kazuph
Servidores relacionados
Things MCP
Integrate with the Things 3 to-do app on macOS.
Shippy MCP
Ship work. Earn royalties.
DalexorMI
Dalexor MI is an advanced project memory system designed to provide AI coding assistants with **Contextual Persistence**. Unlike standard RAG (Retrieval-Augmented Generation) systems that perform surface-level keyword searches, Dalexor MI maps the **logical evolution** of a codebase, tracking how symbols, dependencies, and architectural decisions shift over time.
Adfin
The only platform you need to get paid - all payments in one place, invoicing and accounting reconciliations with Adfin.
Feishu/Lark OpenAPI MCP
Connect AI agents to Feishu/Lark APIs for document processing, conversation management, and calendar scheduling.
MCP Atlassian Server
Integrate Atlassian products like Confluence and Jira with the Model Context Protocol.
PDF Tools
A server for manipulating PDF files, including merging, page extraction, and searching.
2slides
This is the 1st, easiest, and cheapest PPT, slides, presentation AI generation MCP Server in the world.
PM Copilot
Triangulates customer support tickets and feature requests to generate prioritized product plans with convergence scoring and PII scrubbing.
WayStation
A universal remote MCP server that connects to popular productivity tools such as Notion, Monday, AirTable, and many more.