Deskbrid MCP Server

Linux desktop HAL for AI agents.


deskbrid



deskbrid.patchhive.dev

mcp-name: io.github.coe0718/deskbrid

Documentation | API Reference | Architecture


The HAL your Linux desktop agents are missing.

Deskbrid is a single Rust binary that auto-detects your desktop environment and exposes it through a JSON-over-Unix-socket protocol. It supports GNOME, Hyprland, KDE, COSMIC, Sway, Niri, Wayfire, Labwc, Cinnamon, and MATE: one daemon, one protocol, one binary.

# Human
deskbrid windows list
deskbrid clipboard read

# Agent (same socket)
{"action": "windows.list"}  →  [{"title": "VS Code", "app_id": "code", ...}]


Why Deskbrid

Every major AI lab is racing to ship desktop agents. AppleScript gives macOS agents native control. Windows has UI Automation. Linux has xdotool, which breaks on Wayland, now the default display protocol on most major distros.

Deskbrid fills that gap. It auto-detects your compositor and loads the right backend: GNOME (Mutter RemoteDesktop D-Bus), Hyprland (hyprctl + ydotool + grim), KDE (KWin D-Bus + ydotool + spectacle), wlroots-style compositors, or shared X11. Same binary, same protocol, same socket.

Demo: agent focuses VS Code window and types a command via deskbrid

Dashboard

Deskbrid ships with a built-in web dashboard at localhost:20129 that shows system info, monitors, windows, network, audio, clipboard, and an audit log of agent actions, all updated live.


Deskbrid Dashboard: Hyprland, system overview with SSE-connected live data

Supported Desktops

| Desktop | Session | Status | Backend |
| --- | --- | --- | --- |
| GNOME 46–50 | Wayland | ✅ Supported | Mutter RemoteDesktop + Shell Extension |
| Hyprland | Wayland | ✅ Supported (v0.3.0) | hyprctl + ydotool + grim |
| KDE Plasma | Wayland | ✅ Supported (v0.4.0) | KWin D-Bus + ydotool + spectacle |
| COSMIC | Wayland | ⚠️ Partial | cosmic-helper + cosmic-randr + ydotool + grim |
| Sway | Wayland | ✅ Supported | swaymsg + ydotool + grim |
| Niri | Wayland | ✅ Partial | niri msg + ydotool + grim + wlr-randr |
| Wayfire | Wayland | ✅ Supported (no move/resize) | wf-ipc + ydotool + grim + wlr-randr |
| Labwc | Wayland | ✅ Supported (no move/resize) | wlrctl + ydotool + grim + wlr-randr |
| Cinnamon | X11 | ✅ Supported (shared X11) | xdotool + wmctrl + xclip + import |
| MATE | X11 | ✅ Supported (shared X11) | xdotool + wmctrl + xclip + import |
| X11 (generic) | X11 | ✅ Supported (shared X11) | xdotool + wmctrl + xclip + import |

Deskbrid auto-detects your desktop at startup ($XDG_CURRENT_DESKTOP → process scan → GNOME fallback). No config files, no flags.
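
The detection order described above can be sketched in Python. This is only an illustration under stated assumptions: the `DESKTOP_NAMES` and `PROCESS_NAMES` tables and the `detect_desktop` function are hypothetical and are not taken from Deskbrid's source, which is written in Rust and may match different values.

```python
from typing import Iterable, Mapping

# Hypothetical lookup tables for illustration; Deskbrid's real tables may differ.
DESKTOP_NAMES = {
    "gnome": "gnome",
    "kde": "kde",
    "hyprland": "hyprland",
    "sway": "sway",
    "cosmic": "cosmic",
    "x-cinnamon": "cinnamon",
    "mate": "mate",
}
PROCESS_NAMES = {
    "gnome-shell": "gnome",
    "kwin_wayland": "kde",
    "Hyprland": "hyprland",
    "sway": "sway",
    "niri": "niri",
    "wayfire": "wayfire",
    "labwc": "labwc",
}


def detect_desktop(env: Mapping[str, str], processes: Iterable[str]) -> str:
    """Pick a backend name: environment variable, then process scan, then GNOME."""
    # Step 1: XDG_CURRENT_DESKTOP is a colon-separated list, e.g. "ubuntu:GNOME".
    for entry in env.get("XDG_CURRENT_DESKTOP", "").split(":"):
        backend = DESKTOP_NAMES.get(entry.strip().lower())
        if backend:
            return backend
    # Step 2: look for a known compositor among the running process names.
    for name in processes:
        if name in PROCESS_NAMES:
            return PROCESS_NAMES[name]
    # Step 3: nothing matched, so fall back to GNOME.
    return "gnome"
```

Taking the environment and the process list as arguments keeps the sketch easy to test; a caller could pass `os.environ` and process names read from `/proc`.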

See DE Test Matrix for per-action compatibility across all desktops: every action, every compositor, tested on real hardware.

Installation

One-liner install (recommended):

bash <(curl -fsSL https://deskbrid.patchhive.dev/install.sh)

Auto-detects your distro and desktop environment, installs dependencies, sets up uinput, and downloads the binary.

Manual installation:

Download the latest release binary from the releases page:

curl -LO https://github.com/coe0718/deskbrid/releases/latest/download/deskbrid
chmod +x deskbrid
sudo mv deskbrid /usr/local/bin/

Or build from source:

git clone https://github.com/coe0718/deskbrid
cd deskbrid
cargo build --release
sudo cp target/release/deskbrid /usr/local/bin/

Desktop Setup

GNOME:

sudo apt install -y grim wl-clipboard python3-gi gstreamer1.0-tools gstreamer1.0-pipewire
deskbrid setup

Hyprland and other standalone Wayland compositors (Sway, Niri, Wayfire, Labwc):

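sudo pacman -S grim wl-clipboard ydotool
# Give the input group access to /dev/uinput, which ydotool uses to inject input
echo 'KERNEL=="uinput", GROUP="input", MODE="0660"' | sudo tee /etc/udev/rules.d/99-input.rules
sudo usermod -aG input "$USER"
# Reboot (or reload udev rules and log in again) so the rule and group membership take effect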

⚠️ Standalone Wayland compositors don't ship a notification daemon. Deskbrid's notify send will hang without one. Install dunst, mako, or swaync and add it to your compositor's autostart.

KDE Plasma:

sudo apt install spectacle imagemagick wl-clipboard ydotool

Quick Start

deskbrid daemon &

deskbrid windows list          # List open windows
deskbrid clipboard read        # Read clipboard
deskbrid screenshot            # Take screenshot
deskbrid system info           # Get system info
deskbrid windows focus --app code  # Focus VS Code
deskbrid input keyboard type "Hello!"  # Type text

Features

Windows & Workspaces

| Action | Description |
| --- | --- |
| windows.list | List all open windows |
| windows.focus | Focus a window by app_id or title |
| windows.get | Get details for a specific window |
| windows.close | Request window close |
| windows.minimize/maximize | Window state control |
| windows.move_resize | Move and resize windows |
| windows.tile | Tile to screen regions |
| windows.activate_or_launch | Focus or launch app |
| workspaces.* | Workspace management |
| layout_profiles.* | Save/restore layouts |

Input & Clipboard

| Action | Description |
| --- | --- |
| input.keyboard type | Type text |
| input.keyboard key | Send keypress |
| input.keyboard combo | Send key combinations |
| input.mouse.* | Mouse control |
| clipboard.read/write | Clipboard access |
| clipboard.history | Clipboard history |

Screenshots & Media

| Action | Description |
| --- | --- |
| screenshot | Screen capture |
| screenshot.ocr | Extract text via Tesseract |
| screenshot.diff | Compare screenshots |
| mpris.* | Media player control |
| color.pick | Sample pixel colors |

System & Services

| Action | Description |
| --- | --- |
| system.info | Desktop information |
| system.battery | Battery status |
| system.idle | Idle detection |
| system.power | Power management |
| service.* | systemd units |
| journal.query | Log inspection |
| terminal.* | PTY sessions |
| monitor.* | Display control |

Network & Bluetooth

| Action | Description |
| --- | --- |
| network.* | WiFi status/connect |
| bluetooth.* | Device pairing/control |

Protocol

Deskbrid uses JSON-over-Unix-socket. See PROTOCOL.md for the complete specification.

→ {"action": "windows.list"}
← {"type": "response", "status": "ok", "data": [{"title": "VS Code", ...}]}

→ {"action": "windows.focus", "window_id": "code"}
← {"type": "response", "status": "ok"}

Events

Subscribe to real-time updates:

{"action": "subscribe", "events": ["file.*"]}
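
On the client side, wildcard subscriptions like the one above can be mirrored with glob matching when routing incoming events to handlers. The sketch below assumes glob-style semantics for `*` and uses made-up event names such as `file.created`; both are assumptions, and `subscribe_request` and `matches` are hypothetical helpers rather than part of Deskbrid.

```python
from fnmatch import fnmatchcase
from typing import Iterable


def subscribe_request(patterns: Iterable[str]) -> dict:
    """Build the subscribe message shown above for the given event patterns."""
    return {"action": "subscribe", "events": list(patterns)}


def matches(event_name: str, patterns: Iterable[str]) -> bool:
    """Return True if the event name matches any glob pattern (assumed semantics)."""
    return any(fnmatchcase(event_name, pattern) for pattern in patterns)
```

Note that `fnmatchcase` lets `*` match across dots, so `file.*` would also match a nested name like `file.a.b`; whether the daemon behaves the same way is for PROTOCOL.md to say.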

Python Client

from deskbrid import Deskbrid

client = Deskbrid()

# List and focus VS Code
windows = client.windows_list()
code_window = next((w for w in windows if w.app_id == 'code'), None)
if code_window:
    client.focus_window(app_id='code')
    client.type_text("Fixed the bug!\n")

# Subscribe to events
@client.on("file.*")
def on_file_change(event):
    print(f"File changed: {event['path']}")

MCP Integration

Deskbrid exposes a full Model Context Protocol server for AI coding tools:

deskbrid mcp

Claude Desktop (~/.config/Claude/claude_desktop_config.json):

{
  "mcpServers": {
    "deskbrid": {
      "command": "/usr/local/bin/deskbrid",
      "args": ["mcp"]
    }
  }
}
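
If the Claude Desktop config file already lists other MCP servers, the entry above should be merged in rather than pasted over the file. A minimal Python sketch of that merge follows; `add_deskbrid_server` and `update_config_file` are hypothetical helpers written for this example, and only the file path and the entry itself come from this README.

```python
import json
from pathlib import Path


def add_deskbrid_server(config: dict, binary: str = "/usr/local/bin/deskbrid") -> dict:
    """Return a copy of config with a deskbrid entry added under mcpServers."""
    servers = dict(config.get("mcpServers", {}))  # keep any existing servers
    servers["deskbrid"] = {"command": binary, "args": ["mcp"]}
    return {**config, "mcpServers": servers}


def update_config_file(path: Path) -> None:
    """Read the config if it exists, merge the entry, and write it back."""
    config = json.loads(path.read_text()) if path.exists() else {}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(add_deskbrid_server(config), indent=2) + "\n")
```

For example, `update_config_file(Path.home() / ".config/Claude/claude_desktop_config.json")` targets the path named above.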

Available MCP tools (20+):

  • list_windows, focus_window
  • type_text, press_keys, mouse_move, mouse_click
  • screenshot, clipboard_read, clipboard_write
  • list_apps, get_accessibility_tree
  • perform_action, set_element_value, get_element_text, click_element
  • doctor, setup_accessibility, capabilities

Compared to Alternatives

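| Tool | Wayland | Agent-native | JSON | Windows | Input | Clipboard | Screenshot | Bluetooth | Audio |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| deskbrid | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ | ✅ |
| xdotool | ❌ | ❌ | ❌ | ✅ | ✅ | ❌ | ❌ | ❌ | ❌ |
| ydotool | ✅ | ❌ | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ | ❌ |
| grim | ✅ | ❌ | ❌ | ❌ | ❌ | ❌ | ✅ | ❌ | ❌ |
| wl-clipboard | ✅ | ❌ | ❌ | ❌ | ❌ | ✅ | ❌ | ❌ | ❌ |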
License

MIT

How This Started

Deskbrid began with Tuck, an autonomous agent that needed to control a real Linux desktop. When the community asked for Hyprland support, Tuck asked Jeremy for a bare Arch Linux box with SSH and sudo. Tuck installed Hyprland himself and built the backend from inside the environment he had just configured.

The first working demo was a Telegram message: Tuck focused a window and typed "Hello from the other side" in under 60 seconds. That moment, an agent controlling a real desktop through a Unix socket, became Deskbrid. It's built for agents first: same protocol for humans on the CLI, same socket for AIs, one binary that works across the supported desktops.