omniparser-autogui-mcp
An MCP server that analyzes the screen with OmniParser to automate GUI operations.
(Japanese version here)
This is an MCP server that analyzes the screen with OmniParser and automatically operates the GUI.
It has been confirmed to work on Windows.
License notes
This project is under the MIT license, excluding submodules and subpackages.
OmniParser's repository is CC-BY-4.0.
Each OmniParser model has a different license (reference).
Installation
- Please do the following:
git clone --recursive https://github.com/NON906/omniparser-autogui-mcp.git
cd omniparser-autogui-mcp
uv sync
set OCR_LANG=en
uv run download_models.py
(On platforms other than Windows, use export instead of set.)
(If you want langchain_example.py to work, run uv sync --extra langchain instead of uv sync.)
- Add this to your claude_desktop_config.json:
{
    "mcpServers": {
        "omniparser_autogui_mcp": {
            "command": "uv",
            "args": [
                "--directory",
                "D:\\CLONED_PATH\\omniparser-autogui-mcp",
                "run",
                "omniparser-autogui-mcp"
            ],
            "env": {
                "PYTHONIOENCODING": "utf-8",
                "OCR_LANG": "en"
            }
        }
    }
}
(Replace D:\\CLONED_PATH\\omniparser-autogui-mcp with the directory you cloned.)
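If you would rather generate this entry than edit the file by hand, the following is a minimal Python sketch. The helper names server_entry and add_server are hypothetical and not part of this repository; the sketch only reproduces the JSON shown above and leaves any other servers in the config untouched.

```python
import json


def server_entry(clone_dir: str, ocr_lang: str = "en") -> dict:
    """Build the "omniparser_autogui_mcp" entry for a given clone directory."""
    return {
        "command": "uv",
        "args": ["--directory", clone_dir, "run", "omniparser-autogui-mcp"],
        "env": {"PYTHONIOENCODING": "utf-8", "OCR_LANG": ocr_lang},
    }


def add_server(config: dict, clone_dir: str, ocr_lang: str = "en") -> dict:
    """Return a copy of config with the entry added, keeping other servers."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers["omniparser_autogui_mcp"] = server_entry(clone_dir, ocr_lang)
    merged["mcpServers"] = servers
    return merged


if __name__ == "__main__":
    # Placeholder clone location from the example above; replace with your own.
    config = add_server({}, r"D:\CLONED_PATH\omniparser-autogui-mcp")
    # json.dumps doubles the backslashes, which is the form JSON requires.
    print(json.dumps(config, indent=4))
```

Writing the path as a raw string and letting json.dumps handle the escaping avoids the common mistake of pasting single backslashes into the JSON file.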
env also accepts the following optional settings:
- OMNI_PARSER_BACKEND_LOAD
  If the server does not work with other clients (such as LibreChat), set this to 1.
- TARGET_WINDOW_NAME
  To restrict operation to a single window, specify its window name.
  If not specified, the server operates on the entire screen.
- OMNI_PARSER_SERVER
  To run OmniParser processing on another device, specify that server's address and port, such as 127.0.0.1:8000.
  The server can be started with uv run omniparserserver.
- SSE_HOST, SSE_PORT
  If specified, communication uses SSE instead of stdio.
- SOM_MODEL_PATH, CAPTION_MODEL_NAME, CAPTION_MODEL_PATH, OMNI_PARSER_DEVICE, BOX_TRESHOLD
  These configure OmniParser itself.
  They are usually not needed.
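As an illustration of how these settings combine with the base env block, here is a small Python sketch. The build_env helper is hypothetical and not part of this repository. It accepts only the variable names listed above and converts every value to a string, since environment variables are strings. The window name and SSE port in the usage lines are made-up example values.

```python
# Base values taken from the claude_desktop_config.json example.
BASE_ENV = {"PYTHONIOENCODING": "utf-8", "OCR_LANG": "en"}

# Optional variable names, copied from the list above.
OPTIONAL_KEYS = {
    "OMNI_PARSER_BACKEND_LOAD",
    "TARGET_WINDOW_NAME",
    "OMNI_PARSER_SERVER",
    "SSE_HOST",
    "SSE_PORT",
    "SOM_MODEL_PATH",
    "CAPTION_MODEL_NAME",
    "CAPTION_MODEL_PATH",
    "OMNI_PARSER_DEVICE",
    "BOX_TRESHOLD",
}


def build_env(**options) -> dict:
    """Return the env block with the given optional settings added."""
    unknown = set(options) - OPTIONAL_KEYS
    if unknown:
        raise ValueError(f"unknown setting(s): {sorted(unknown)}")
    env = dict(BASE_ENV)
    # Environment variables are strings, so convert values such as ports.
    env.update({name: str(value) for name, value in options.items()})
    return env


if __name__ == "__main__":
    # Offload OmniParser to another process and limit operation to one window.
    print(build_env(OMNI_PARSER_SERVER="127.0.0.1:8000", TARGET_WINDOW_NAME="Notepad"))
    # Serve over SSE instead of stdio (host and port are example values).
    print(build_env(SSE_HOST="127.0.0.1", SSE_PORT=8765))
```

The returned dictionary can be placed under "env" in the server entry. Rejecting unknown names catches typos early, which matters here because BOX_TRESHOLD is spelled without the second H.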
Usage Examples
- Search for "MCP server" in the on-screen browser.
etc.