Computer Control MCP
Control your computer's mouse and keyboard and perform OCR using PyAutoGUI and RapidOCR. Works on Windows, with potential support for other platforms.
An MCP server that provides computer control capabilities such as mouse, keyboard, and OCR, using PyAutoGUI, RapidOCR, and ONNXRuntime. Similar to 'computer-use' by Anthropic, with zero external dependencies.

Quick Usage (MCP Setup Using uvx)
Note: Running `uvx computer-control-mcp@latest` for the first time downloads the Python dependencies (around 70MB), which may take some time. It is recommended to run this command once in a terminal before using it as an MCP server. Subsequent runs start instantly.
{
  "mcpServers": {
    "computer-control-mcp": {
      "command": "uvx",
      "args": ["computer-control-mcp@latest"]
    }
  }
}
OR install globally with pip:
pip install computer-control-mcp
Then run the server with:
computer-control-mcp
# Use this in place of `uvx computer-control-mcp`. To get the latest version through uvx instead, run `uv cache clean` to clear the cache, then run `uvx` again.
Features
- Control mouse movements and clicks
- Type text at the current cursor position
- Take screenshots of the entire screen or specific windows, optionally saving them to the downloads directory
- Extract text from screenshots using OCR (Optical Character Recognition)
- List and activate windows
- Press keyboard keys
- Drag and drop operations
Available Tools
Mouse Control
- click_screen(x: int, y: int): Click at the specified screen coordinates
- move_mouse(x: int, y: int): Move the mouse cursor to the specified coordinates
- drag_mouse(from_x: int, from_y: int, to_x: int, to_y: int, duration: float = 0.5): Drag the mouse from one position to another
- mouse_down(button: str = "left"): Hold down a mouse button ('left', 'right', 'middle')
- mouse_up(button: str = "left"): Release a mouse button ('left', 'right', 'middle')
Keyboard Control
- type_text(text: str): Type the specified text at the current cursor position
- press_key(key: str): Press a specified keyboard key
- key_down(key: str): Hold down a specific keyboard key until released
- key_up(key: str): Release a specific keyboard key
- press_keys(keys: Union[str, List[Union[str, List[str]]]]): Press keyboard keys (supports single keys, sequences, and combinations)
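The nested type accepted by press_keys can be hard to read at a glance. The sketch below shows one plausible way to interpret it, assuming that a plain string is a single key, a flat list is a sequence of keys pressed one after another, and an inner list is a combination pressed together. The helper name `normalize_keys` is hypothetical and is not part of this server; the actual implementation may treat the argument differently.

```python
from typing import List, Union

# Mirrors the annotation on press_keys: a key, or a list of keys / key combinations.
KeysArg = Union[str, List[Union[str, List[str]]]]


def normalize_keys(keys: KeysArg) -> List[List[str]]:
    """Hypothetical helper: turn a press_keys-style argument into a list of
    steps, where each step is the list of keys pressed together."""
    if isinstance(keys, str):
        # A single key is one step containing one key.
        return [[keys]]
    steps: List[List[str]] = []
    for item in keys:
        if isinstance(item, str):
            # A bare string inside the list is its own step in the sequence.
            steps.append([item])
        else:
            # An inner list is assumed to be a combination such as ctrl+c.
            steps.append(list(item))
    return steps


if __name__ == "__main__":
    print(normalize_keys("enter"))                 # [['enter']]
    print(normalize_keys(["a", "b"]))              # [['a'], ['b']]
    print(normalize_keys([["ctrl", "c"], "tab"]))  # [['ctrl', 'c'], ['tab']]
```

Under this reading, `[["ctrl", "c"], "tab"]` would mean "press ctrl+c, then press tab".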
Screen and Window Management
- take_screenshot(title_pattern: str = None, use_regex: bool = False, threshold: int = 60, scale_percent_for_ocr: int = None, save_to_downloads: bool = False): Capture the screen or a window
- take_screenshot_with_ocr(title_pattern: str = None, use_regex: bool = False, threshold: int = 10, scale_percent_for_ocr: int = None, save_to_downloads: bool = False): Extract and return text with coordinates using OCR from the screen or a window
- get_screen_size(): Get the current screen resolution
- list_windows(): List all open windows
- activate_window(title_pattern: str, use_regex: bool = False, threshold: int = 60): Bring the specified window to the foreground
- wait_milliseconds(milliseconds: int): Wait for a specified number of milliseconds
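The window tools take a title_pattern together with use_regex and threshold. The sketch below illustrates how such parameters might interact, assuming that threshold is a minimum similarity score on a 0 to 100 scale used when use_regex is False. This is an assumption: the function names `match_score` and `find_window` are hypothetical, and the standard library's `difflib` stands in for whatever matcher the server actually uses, so real scores may differ.

```python
import re
from difflib import SequenceMatcher
from typing import List, Optional


def match_score(title: str, pattern: str) -> float:
    """Similarity between pattern and title on an assumed 0-100 scale."""
    return 100.0 * SequenceMatcher(None, pattern.lower(), title.lower()).ratio()


def find_window(
    titles: List[str],
    title_pattern: str,
    use_regex: bool = False,
    threshold: int = 60,
) -> Optional[str]:
    """Hypothetical matcher: return the best-matching window title, or None."""
    if use_regex:
        # Regex mode: the first title containing a match wins; threshold is ignored.
        rx = re.compile(title_pattern, re.IGNORECASE)
        for title in titles:
            if rx.search(title):
                return title
        return None
    # Fuzzy mode: take the highest-scoring title and require it to reach threshold.
    best = max(titles, key=lambda t: match_score(t, title_pattern), default=None)
    if best is not None and match_score(best, title_pattern) >= threshold:
        return best
    return None


if __name__ == "__main__":
    open_windows = ["Untitled - Notepad", "Google Chrome", "Visual Studio Code"]
    print(find_window(open_windows, "Google Chrom"))              # Google Chrome
    print(find_window(open_windows, r"note.*", use_regex=True))   # Untitled - Notepad
    print(find_window(open_windows, "zzz"))                       # None
```

In this sketch a lower threshold accepts looser matches, which may be why the OCR variant defaults to a smaller value, though the source does not state the reason.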
Development
Setting up the Development Environment
# Clone the repository
git clone https://github.com/AB498/computer-control-mcp.git
cd computer-control-mcp
# Install in development mode
pip install -e .
# Start server
python -m computer_control_mcp.core
# -- OR --
# Build after `pip install hatch`
hatch build
# Windows
$latest = Get-ChildItem .\dist\*.whl | Sort-Object LastWriteTime -Descending | Select-Object -First 1
pip install $latest.FullName --upgrade
# Non-windows
pip install dist/*.whl --upgrade
# Run
computer-control-mcp
Running Tests
python -m pytest
API Reference
See the API Reference for detailed information about the available functions and classes.
License
MIT
For more information or help