MCP Desktop Automation
Automate desktop actions such as mouse control, keyboard input, and screenshot capture.
A Model Context Protocol server that provides desktop automation through RobotJS, along with screenshot capture. It lets LLMs control mouse movement, send keyboard input, and capture screenshots of the desktop environment.
Configuration to use Desktop Automation Server
Here's how to configure Claude Desktop to use the MCP Desktop Automation server:
NPX
{
  "mcpServers": {
    "desktop-automation": {
      "command": "npx",
      "args": ["-y", "mcp-desktop-automation"]
    }
  }
}
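If your Claude Desktop config already lists other servers, the entry has to be merged in rather than pasted over them. The following is a minimal sketch of that merge; `ServerEntry`, `ClientConfig`, and `addDesktopAutomation` are hypothetical names used for illustration and are not part of this package or of Claude Desktop.

```typescript
// Hypothetical types describing the config shape shown in the NPX snippet.
interface ServerEntry {
  command: string;
  args: string[];
}

interface ClientConfig {
  mcpServers?: Record<string, ServerEntry>;
  [key: string]: unknown;
}

// The entry from the NPX configuration above.
const desktopAutomation: ServerEntry = {
  command: "npx",
  args: ["-y", "mcp-desktop-automation"],
};

// Returns a copy of the config with the desktop-automation entry added,
// leaving any other configured servers and top-level keys untouched.
function addDesktopAutomation(config: ClientConfig): ClientConfig {
  return {
    ...config,
    mcpServers: {
      ...(config.mcpServers ?? {}),
      "desktop-automation": desktopAutomation,
    },
  };
}

// Example: a config that already has one (made-up) server entry.
const existingConfig: ClientConfig = {
  mcpServers: { other: { command: "node", args: ["other-server.js"] } },
};
console.log(JSON.stringify(addDesktopAutomation(existingConfig), null, 2));
```

The helper returns a new object instead of mutating its input, so the original config can still be written back unchanged if something goes wrong.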
Permissions
This server requires system-level permissions to:
- Capture screenshots of your screen
- Control mouse movement and clicks
- Simulate keyboard input
When first running Claude Desktop with this server, you may need to grant these permissions in your operating system's security settings.
Limitations
While this server works with various MCP clients, it has been primarily tested with Claude Desktop.
Important: The current implementation has a 1MB response size limit. For screen captures, this means:
- High-resolution screenshots may exceed this limit and fail
- Testing has shown 800x600 resolution works reliably
- Consider reducing screen resolution or capturing specific screen areas if you encounter issues
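To make the size limit concrete, here is a rough sketch of two checks a client could run before asking for a capture. It assumes that 1MB means 1024 * 1024 bytes and that the image travels base64-encoded inside the response; neither detail is confirmed here, and the function names are hypothetical.

```typescript
// Assumption: the limit is 1 MiB and applies to the base64-encoded payload.
const RESPONSE_LIMIT_BYTES = 1024 * 1024;

// Base64 turns every 3 input bytes into 4 output characters (with padding).
function base64Length(byteCount: number): number {
  return 4 * Math.ceil(byteCount / 3);
}

// True if a PNG of the given byte size would fit once base64-encoded.
function fitsResponseLimit(pngByteCount: number, limit: number = RESPONSE_LIMIT_BYTES): boolean {
  return base64Length(pngByteCount) <= limit;
}

interface Size {
  width: number;
  height: number;
}

// Scales a size down (never up) to fit inside a bounding box while keeping
// the aspect ratio. The default box is the 800x600 size reported as reliable.
function fitWithin(size: Size, box: Size = { width: 800, height: 600 }): Size {
  const scale = Math.min(1, box.width / size.width, box.height / size.height);
  return {
    width: Math.round(size.width * scale),
    height: Math.round(size.height * scale),
  };
}

console.log(fitWithin({ width: 1920, height: 1080 }));
console.log(fitsResponseLimit(700000));
```

PNG size depends on image content as well as resolution, so `fitWithin` only suggests a target size; it does not guarantee the capture stays under the limit.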
Requirements
- Node.js (>=14.x)
Components
Tools
- get_screen_size
  - Gets the screen dimensions
  - No input parameters required
- screen_capture
  - Captures the current screen content
  - No input parameters required
- keyboard_press
  - Presses a keyboard key or key combination
  - Inputs:
    - key (string, required): Key to press (e.g., 'enter', 'a', 'control')
    - modifiers (array of strings, optional): Modifier keys to hold while pressing the key. Possible values: "control", "shift", "alt", "command"
- keyboard_type
  - Types text at the current cursor position
  - Input:
    - text (string, required): Text to type
- mouse_click
  - Performs a mouse click
  - Inputs:
    - button (string, optional, default: "left"): Mouse button to click. Possible values: "left", "right", "middle"
    - double (boolean, optional, default: false): Whether to perform a double click
- mouse_move
  - Moves the mouse to specified coordinates
  - Inputs:
    - x (number, required): X coordinate
    - y (number, required): Y coordinate
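MCP clients invoke tools with JSON-RPC 2.0 `tools/call` requests. The sketch below types the arguments listed above and builds the request payloads for a short sequence of actions. The `ToolArgs` map and the `toolCall` helper are hypothetical illustrations; a real client would normally send these through an MCP SDK instead of building them by hand.

```typescript
type Modifier = "control" | "shift" | "alt" | "command";
type MouseButton = "left" | "right" | "middle";

// Argument shapes transcribed from the tool list above.
interface ToolArgs {
  get_screen_size: Record<string, never>;
  screen_capture: Record<string, never>;
  keyboard_press: { key: string; modifiers?: Modifier[] };
  keyboard_type: { text: string };
  mouse_click: { button?: MouseButton; double?: boolean };
  mouse_move: { x: number; y: number };
}

interface ToolCallRequest<K extends keyof ToolArgs> {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: K; arguments: ToolArgs[K] };
}

let nextId = 1;

// Builds a JSON-RPC request for one tool; the compiler rejects unknown
// tool names and argument shapes that do not match the tool.
function toolCall<K extends keyof ToolArgs>(name: K, args: ToolArgs[K]): ToolCallRequest<K> {
  return {
    jsonrpc: "2.0",
    id: nextId++,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Example sequence: move, double-click, type, then press Ctrl+S.
const steps = [
  toolCall("mouse_move", { x: 400, y: 300 }),
  toolCall("mouse_click", { button: "left", double: true }),
  toolCall("keyboard_type", { text: "hello" }),
  toolCall("keyboard_press", { key: "s", modifiers: ["control"] }),
];
for (const step of steps) {
  console.log(JSON.stringify(step));
}
```

Keying the argument types by tool name means a call such as `toolCall("mouse_move", { text: "x" })` fails at compile time instead of at the server.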
Resources
The server provides access to screenshots:
- Screenshot List (screenshot://list)
  - Lists all available screenshots by name
- Screenshot Content (screenshot://{id})
  - PNG images of captured screenshots
  - Accessible via the screenshot ID (timestamp-based naming)
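Resources are fetched with MCP `resources/read` requests that carry the resource URI. As a small sketch, the code below tells the two URI forms apart and builds a read request. The helper names are hypothetical, and `example-id` is a placeholder, not the server's real ID format.

```typescript
// The two URI forms described above.
type ScreenshotRef = { kind: "list" } | { kind: "content"; id: string };

const SCHEME = "screenshot://";

// Returns null for anything that is not a screenshot:// URI.
function parseScreenshotUri(uri: string): ScreenshotRef | null {
  if (!uri.startsWith(SCHEME)) return null;
  const rest = uri.slice(SCHEME.length);
  if (rest === "") return null;
  return rest === "list" ? { kind: "list" } : { kind: "content", id: rest };
}

// Builds a JSON-RPC 2.0 resources/read request for the given URI.
function readResource(id: number, uri: string) {
  return {
    jsonrpc: "2.0" as const,
    id,
    method: "resources/read" as const,
    params: { uri },
  };
}

console.log(parseScreenshotUri("screenshot://list"));
console.log(parseScreenshotUri("screenshot://example-id"));
console.log(JSON.stringify(readResource(1, "screenshot://list")));
```

A typical flow would read `screenshot://list` first, then read an individual `screenshot://{id}` URI using one of the names it returns.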
Key Features
- Desktop mouse control
- Keyboard input simulation
- Screen size detection
- Screenshot capabilities
- Simple JSON response format
License
This MCP server is licensed under the MIT License. This means you are free to use, modify, and distribute the software, subject to the terms and conditions of the MIT License. For more details, please see the LICENSE file in the project repository.