omniparser-autogui-mcp
An MCP server that analyzes the screen with OmniParser to automate GUI operations.
(A Japanese version is also available.)
This is an MCP server that analyzes the screen with OmniParser and automatically operates the GUI.
Operation has been confirmed on Windows.
License notes
This project is released under the MIT license, excluding submodules and sub-packages.
OmniParser's repository is licensed under CC-BY-4.0.
Each OmniParser model has a different license (reference).
Installation
- Run the following commands:
git clone --recursive https://github.com/NON906/omniparser-autogui-mcp.git
cd omniparser-autogui-mcp
uv sync
set OCR_LANG=en
uv run download_models.py
(On platforms other than Windows, use export instead of set.)
(If you want langchain_example.py to work, run uv sync --extra langchain instead of uv sync.)
- Add this to your claude_desktop_config.json:
{
  "mcpServers": {
    "omniparser_autogui_mcp": {
      "command": "uv",
      "args": [
        "--directory",
        "D:\\CLONED_PATH\\omniparser-autogui-mcp",
        "run",
        "omniparser-autogui-mcp"
      ],
      "env": {
        "PYTHONIOENCODING": "utf-8",
        "OCR_LANG": "en"
      }
    }
  }
}
(Replace D:\\CLONED_PATH\\omniparser-autogui-mcp with the directory you cloned.)
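Backslashes in a Windows path have to be doubled inside JSON, which is easy to get wrong when editing the file by hand. The following is a minimal sketch that builds the same entry from a plain path and lets the json module handle the escaping. The function name build_config and its parameters are hypothetical helpers for illustration and are not part of this project.

```python
import json
from typing import Optional


def build_config(clone_dir: str, ocr_lang: str = "en",
                 extra_env: Optional[dict] = None) -> dict:
    """Build the mcpServers entry for a given clone directory.

    build_config is a hypothetical helper, not something shipped with
    omniparser-autogui-mcp. The keys and values mirror the JSON shown above.
    """
    env = {"PYTHONIOENCODING": "utf-8", "OCR_LANG": ocr_lang}
    env.update(extra_env or {})  # optional additional settings
    return {
        "mcpServers": {
            "omniparser_autogui_mcp": {
                "command": "uv",
                "args": ["--directory", clone_dir, "run", "omniparser-autogui-mcp"],
                "env": env,
            }
        }
    }


if __name__ == "__main__":
    # A raw string keeps the single backslashes; json.dumps doubles them.
    config = build_config(r"D:\CLONED_PATH\omniparser-autogui-mcp")
    print(json.dumps(config, indent=2))
```

The printed JSON can be merged into an existing claude_desktop_config.json. If that file already has an mcpServers object, add only the omniparser_autogui_mcp entry rather than replacing the whole object.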
env allows for the following additional configurations:
- OMNI_PARSER_BACKEND_LOAD
  If it does not work with other clients (such as LibreChat), specify 1.
- TARGET_WINDOW_NAME
  If you want to specify the window to operate, please specify the window name.
  If not specified, operates on the entire screen.
- OMNI_PARSER_SERVER
  If you want OmniParser processing to be done on another device, specify the server's address and port, such as 127.0.0.1:8000.
  The server can be started with uv run omniparserserver.
- SSE_HOST, SSE_PORT
  If specified, communication will be done via SSE instead of stdio.
- SOM_MODEL_PATH, CAPTION_MODEL_NAME, CAPTION_MODEL_PATH, OMNI_PARSER_DEVICE, BOX_TRESHOLD
  These are for OmniParser configuration.
  Usually, they are not necessary.
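Values in the env block of a JSON config are strings, so a port number or a flag such as 1 has to be written as text. The sketch below assembles the optional settings described above into such a dictionary. The helper name optional_env and its parameters are hypothetical. Only the variable names and the example value formats (1, 127.0.0.1:8000) come from this document, and how the server interprets other values is not covered here.

```python
from typing import Optional, Tuple


def optional_env(backend_load: bool = False,
                 target_window: Optional[str] = None,
                 parser_server: Optional[Tuple[str, int]] = None,
                 sse_host: Optional[str] = None,
                 sse_port: Optional[int] = None) -> dict:
    """Collect optional settings as string values for the "env" block.

    optional_env is a hypothetical helper for illustration only.
    Settings that are left unset are omitted from the result.
    """
    env = {}
    if backend_load:
        env["OMNI_PARSER_BACKEND_LOAD"] = "1"
    if target_window is not None:
        env["TARGET_WINDOW_NAME"] = target_window
    if parser_server is not None:
        host, port = parser_server
        env["OMNI_PARSER_SERVER"] = f"{host}:{port}"  # address:port form
    if sse_host is not None:
        env["SSE_HOST"] = sse_host
    if sse_port is not None:
        env["SSE_PORT"] = str(sse_port)  # env values must be strings
    return env


if __name__ == "__main__":
    # Example: operate on one window and offload OmniParser to another process.
    print(optional_env(target_window="Notepad",
                       parser_server=("127.0.0.1", 8000)))
```

The resulting dictionary can be merged with the PYTHONIOENCODING and OCR_LANG entries already present in env.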
Usage Examples
- Search for "MCP server" in the on-screen browser.
etc.