MCP QEMU VM Control

Give your AI full computer access — safely. Let Claude (or any MCP-compatible LLM) see your screen, move the mouse, type on the keyboard, and run commands — all inside an isolated QEMU virtual machine. Perfect for AI-driven automation, testing, and computer-use experiments without risking your host system.

A Model Context Protocol (MCP) server for controlling QEMU virtual machines via SSH. This server enables LLMs to interact with VMs through mouse/keyboard control, screenshots, and SSH command execution.

Features

  • Mouse Control - Move cursor and click buttons
  • Keyboard Input - Type text and send key combinations
  • Action Batching - Execute sequences of UI actions in one call
  • Screenshots - Capture and retrieve VM screenshots
  • SSH Command Execution - Run shell commands on the VM
  • File Transfer - Upload and download files via SFTP
  • Project Management - Organize outputs into project folders with logs, results, and advice
  • Advice System - Save and retrieve tips for future LLM sessions

Prerequisites

Host System

  • Python 3.12+
  • uv (recommended) or pip
  • QEMU/KVM with libvirt
  • virt-manager (optional, for GUI management)

VM Requirements

  • Linux with X11 desktop environment
  • SSH server enabled
  • Required packages: openssh, xdotool, scrot, xrandr, xinput

QEMU/libvirt Setup

1. Install virtualization packages

Arch/Manjaro:

sudo pacman -S qemu-full libvirt virt-manager dnsmasq iptables-nft

Debian/Ubuntu:

sudo apt install qemu-kvm libvirt-daemon-system libvirt-clients virt-manager bridge-utils

Fedora:

sudo dnf install @virtualization

2. Configure libvirt

# Enable and start libvirtd
sudo systemctl enable --now libvirtd

# Add your user to libvirt group
sudo usermod -aG libvirt $USER

# Log out and back in, then verify
groups  # should show 'libvirt'

3. Set up the default network

libvirt provides a default NAT network (192.168.122.0/24) that VMs use to communicate with the host:

# Check network status
virsh -c qemu:///system net-list --all

# If 'default' is not active, start it
virsh -c qemu:///system net-start default

# Enable autostart
virsh -c qemu:///system net-autostart default

The default network configuration:

  • Bridge: virbr0
  • Host IP: 192.168.122.1
  • DHCP range: 192.168.122.2 - 192.168.122.254
  • Mode: NAT (VMs can access internet, host can access VMs)
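
As a quick sanity check on the values above, the following minimal sketch tests whether an address lies inside the default DHCP range. The function name is hypothetical and not part of this project; the subnet and range are copied from the list above.

```python
import ipaddress

# Values taken from the default-network description above.
DEFAULT_SUBNET = ipaddress.ip_network("192.168.122.0/24")
DHCP_FIRST = ipaddress.ip_address("192.168.122.2")
DHCP_LAST = ipaddress.ip_address("192.168.122.254")


def in_default_dhcp_range(address: str) -> bool:
    """Return True if the address falls inside the default DHCP range."""
    ip = ipaddress.ip_address(address)
    return ip in DEFAULT_SUBNET and DHCP_FIRST <= ip <= DHCP_LAST


print(in_default_dhcp_range("192.168.122.79"))  # True
print(in_default_dhcp_range("192.168.122.1"))   # False: host side of virbr0
print(in_default_dhcp_range("10.0.0.5"))        # False: outside the subnet
```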

4. Create a VM with virt-manager

  1. Launch virt-manager
  2. Create a new VM (File → New Virtual Machine)
  3. Select installation media (ISO)
  4. Allocate resources:
    • Memory: 4096 MB recommended
    • CPUs: 2+ recommended
  5. Important: Under "Network selection", choose "Virtual network 'default': NAT"
  6. Complete installation

5. Configure the VM

After installing the guest OS:

# Inside the VM - Install required packages

# Arch/Manjaro
sudo pacman -S --needed openssh xdotool scrot xorg-xrandr xorg-xinput

# Debian/Ubuntu
sudo apt install openssh-server xdotool scrot x11-xserver-utils xinput

# Enable SSH (the unit is named sshd on Arch/Manjaro and ssh on Debian/Ubuntu)
sudo systemctl enable --now sshd

6. Create the automation user

On the VM:

# Create vmrobot user
sudo useradd -m -s /bin/bash vmrobot
sudo passwd vmrobot

# Set up SSH key authentication
sudo -u vmrobot mkdir -p /home/vmrobot/.ssh
sudo -u vmrobot chmod 700 /home/vmrobot/.ssh

On the host:

# Copy your public key to the VM
ssh-copy-id vmrobot@192.168.122.79

# Or manually add to /home/vmrobot/.ssh/authorized_keys on VM

7. Grant X11 access to vmrobot

The vmrobot user needs permission to access the X display. On the VM, as the user who owns the desktop session:

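# Quick fix (run once per session)
xhost +local:vmrobot

# Permanent fix - add to ~/.xprofile or ~/.xinitrc
# Note: "+local:" lets every local user connect to the display; to limit
# access to vmrobot only, xhost also accepts "+SI:localuser:vmrobot".
echo "xhost +local:" >> ~/.xprofile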

8. Choose SSH user strategy

There are two approaches for the SSH user:

Option A: Dedicated vmrobot user (default)

  • Safer — limited permissions, can't accidentally break desktop config
  • Requires xhost +local:vmrobot for X11 access (step 7)
  • Set VM_DESKTOP_USER if you need commands that require the desktop user's context (clipboard, password manager, dbus):
    # On the VM, allow vmrobot to run commands as your desktop user
    echo 'vmrobot ALL=(sergey) NOPASSWD: ALL' | sudo tee /etc/sudoers.d/vmrobot-desktop
    sudo chmod 440 /etc/sudoers.d/vmrobot-desktop
    
    Then set VM_DESKTOP_USER=sergey in your config. Use ssh_execute("xclip -selection clipboard -o", as_desktop_user=True).

Option B: SSH directly as the desktop user

  • Simpler — full desktop access out of the box, no xhost or sudo needed
  • Set VM_USER to your desktop username (e.g., sergey)
  • All commands run with full desktop permissions
  • Best for personal/development VMs where isolation isn't a concern

9. Find your VM's IP address

# From the host (replace "manjaro" with your VM's name as shown by virsh list)
virsh -c qemu:///system domifaddr manjaro

# Or from inside the VM
ip addr show | grep "inet 192.168.122"
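
If you want to pick the address out of the domifaddr output programmatically, a minimal sketch could look like this. The sample text follows the column layout that `virsh domifaddr` prints, but the interface name and MAC address are made up, and `extract_ipv4` is a hypothetical helper.

```python
import re

# Illustrative output only; the interface name and MAC address are invented.
SAMPLE = """\
 Name       MAC address          Protocol     Address
-------------------------------------------------------------------------------
 vnet0      52:54:00:12:34:56    ipv4         192.168.122.79/24
"""


def extract_ipv4(domifaddr_output: str) -> list[str]:
    """Return IPv4 addresses (without the prefix length) from domifaddr output."""
    pattern = re.compile(r"\bipv4\s+(\d{1,3}(?:\.\d{1,3}){3})/\d+")
    return pattern.findall(domifaddr_output)


print(extract_ipv4(SAMPLE))  # ['192.168.122.79']
```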

10. Test the connection

# Test SSH
ssh vmrobot@192.168.122.79

# Test X11 automation
ssh vmrobot@192.168.122.79 'DISPLAY=:0 xdotool getmouselocation'

# Test screenshot
ssh vmrobot@192.168.122.79 'DISPLAY=:0 scrot /tmp/test.png && echo Success'

Installation

1. Clone the repository

git clone https://github.com/Neanderthal/mcp-qemu-vm.git
cd mcp-qemu-vm

2. Install dependencies

Using uv (recommended):

uv venv && source .venv/bin/activate
uv pip install -r requirements.txt

Using pip:

python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

Configuration

Set environment variables or create a .env file:

| Variable | Default | Description |
| --- | --- | --- |
| VM_HOST | 192.168.122.79 | VM IP address |
| VM_USER | vmrobot | SSH username |
| VM_PORT | 22 | SSH port |
| VM_DISPLAY | :0 | X11 display |
| VM_IDENTITY | (empty) | SSH private key path (optional) |
| VM_DESKTOP_USER | (empty) | Desktop session owner, if different from VM_USER |
| VM_KNOWN_HOSTS | (none) | SSH known_hosts file path (optional) |
| VM_CONNECT_TIMEOUT | 10 | SSH connection timeout in seconds |

See .env.example for a documented template.

Usage

MCP Client Configuration

Add to your MCP client config (e.g., Claude Desktop claude_desktop_config.json):

{
  "mcpServers": {
    "qemu-vm-control": {
      "command": "python3",
      "args": ["/path/to/mcp-qemu-vm/server.py"],
      "env": {
        "VM_HOST": "192.168.122.79",
        "VM_USER": "vmrobot",
        "VM_PORT": "22",
        "VM_DISPLAY": ":0"
      }
    }
  }
}
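
If the config file already lists other servers, the new entry has to be merged in rather than pasted over them. The sketch below shows one way to do that, assuming the client keeps its servers under a top-level "mcpServers" key as Claude Desktop does. `add_server` and the "other-server" entry are hypothetical.

```python
import json


def add_server(config: dict, name: str, entry: dict) -> dict:
    """Return a copy of config with entry registered under mcpServers[name]."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = entry
    merged["mcpServers"] = servers
    return merged


entry = {
    "command": "python3",
    "args": ["/path/to/mcp-qemu-vm/server.py"],
    "env": {"VM_HOST": "192.168.122.79", "VM_USER": "vmrobot",
            "VM_PORT": "22", "VM_DISPLAY": ":0"},
}
# A made-up existing config with one unrelated server already registered.
existing = {"mcpServers": {"other-server": {"command": "node", "args": ["other.js"]}}}

print(json.dumps(add_server(existing, "qemu-vm-control", entry), indent=2))
```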

Config file locations:

  • macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
  • Windows: %APPDATA%\Claude\claude_desktop_config.json
  • Linux: ~/.config/Claude/claude_desktop_config.json

Development with MCP Inspector

uv run mcp dev server.py

# With custom environment
VM_HOST=192.168.122.79 VM_USER=vmrobot uv run mcp dev server.py

Running Standalone

python server.py

Tools Reference

Project Management

Projects organize all outputs (screenshots, logs, results, advice) into timestamped folders under data/projects/.

| Tool | Description |
| --- | --- |
| project_init(name, description) | Create a new project (required before screenshots) |
| project_load(project_path) | Load an existing project |
| project_list() | List all projects |
| project_info() | Get current project statistics |
| project_log(message, level) | Add a log entry |
| project_read_logs(lines, level_filter) | Read project logs |
| project_save_result(filename, content) | Save a result file |
| project_save_advice(title, content) | Save tips for future sessions |
| project_read_advice() | Read all saved advice |

Mouse & Keyboard

| Tool | Description |
| --- | --- |
| move_mouse(x, y, mode) | Move cursor (mode: "absolute" or "relative") |
| click(button, count) | Click mouse button (left/middle/right) |
| type_text(text) | Type text |
| press_keys(keys) | Press key combo, e.g., ["Ctrl", "L"] |
| wait(seconds) | Pause execution |
| run_actions(actions) | Execute a sequence of actions in one call |

Batch Actions Example

[
  {"action": "press_keys", "keys": ["Ctrl", "Shift", "p"]},
  {"action": "wait", "seconds": 0.5},
  {"action": "type_text", "text": "Terminal: Focus Terminal"},
  {"action": "press_keys", "keys": ["Return"]}
]
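
Before sending a batch, it can help to check it locally. The sketch below validates only the three action kinds that appear in this README's examples; the required fields are inferred from those examples, not from the server's actual schema, and `validate_actions` is a hypothetical helper.

```python
# Fields inferred from the batch examples in this README (an assumption).
REQUIRED_FIELDS = {
    "press_keys": {"keys": list},
    "type_text": {"text": str},
    "wait": {"seconds": (int, float)},
}


def validate_actions(actions: list[dict]) -> list[str]:
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for index, step in enumerate(actions):
        kind = step.get("action")
        if kind not in REQUIRED_FIELDS:
            problems.append(f"step {index}: unknown action {kind!r}")
            continue
        for field, expected in REQUIRED_FIELDS[kind].items():
            if not isinstance(step.get(field), expected):
                problems.append(f"step {index}: {kind} needs a valid {field!r}")
    return problems


batch = [
    {"action": "press_keys", "keys": ["Ctrl", "Shift", "p"]},
    {"action": "wait", "seconds": 0.5},
    {"action": "type_text", "text": "Terminal: Focus Terminal"},
    {"action": "press_keys", "keys": ["Return"]},
]
print(validate_actions(batch))  # []
```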

SSH Operations

| Tool | Description |
| --- | --- |
| ssh_execute(command, as_desktop_user) | Run a shell command on the VM |
| ssh_upload(local_path, remote_path) | Upload file to VM |
| ssh_download(remote_path, local_path) | Download file from VM |
| ssh_connection_info() | Get connection status |

Screenshots

| Tool | Description |
| --- | --- |
| take_screenshot() | Capture screenshot (requires active project) |

Screenshots are saved to the project's screenshots/ folder and exposed as MCP resources at vm://screenshot/{id}.

Typical Workflow

1. project_init("my-task", "Description")
2. take_screenshot()
3. ... perform VM operations ...
4. project_read_logs()
5. project_save_result("output.txt", data)
6. project_save_advice("Title", "Lessons learned...")

For continuing work:

1. project_list()
2. project_load("data/projects/...")  # Shows any saved advice
3. ... continue work ...

Best Practices for LLM Automation

These lessons were learned from real-world usage and help avoid common pitfalls.

1. Always Screenshot Before Actions

Before ANY interaction:

  1. take_screenshot()
  2. Analyze the image
  3. Identify current focus (which window/field is active)
  4. Only then proceed with actions

Never skip screenshots to "save time" - blind actions lead to errors.

2. Don't Trust Mouse Clicks for Focus

Clicking on a window/terminal does NOT reliably switch focus, especially in:

  • Nested environments (Citrix, remote desktop)
  • High-latency connections
  • Applications with multiple panels (VS Code, IDEs)

Use keyboard shortcuts instead:

[
  {"action": "press_keys", "keys": ["Ctrl", "Shift", "p"]},
  {"action": "wait", "seconds": 0.5},
  {"action": "type_text", "text": "Terminal: Focus Terminal"},
  {"action": "wait", "seconds": 0.3},
  {"action": "press_keys", "keys": ["Return"]},
  {"action": "wait", "seconds": 0.5}
]

Then take_screenshot() to verify before typing.

3. Required Wait Times

| After This Action | Wait Time |
| --- | --- |
| Opening Command Palette | 0.5s |
| Typing search text | 0.3s |
| Pressing Enter/Return | 0.5-1.0s |
| Command execution | 1.0-2.0s |
| Window/focus switch | 0.5s |

Never rapid-fire actions - they may arrive out of order.
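
One way to apply these waits consistently is to pad a batch automatically before passing it to run_actions. The sketch below is an illustration under simplifying assumptions: it uses one wait per action type (0.5s after press_keys, 0.3s after type_text, the lower bounds from the table), and `with_waits` is a hypothetical helper, not a tool this server provides.

```python
# Simplified per-action waits drawn from the lower bounds in the table above.
DEFAULT_WAITS = {"press_keys": 0.5, "type_text": 0.3}


def with_waits(actions: list[dict]) -> list[dict]:
    """Insert a wait after each action unless the caller already added one."""
    padded = []
    for index, step in enumerate(actions):
        padded.append(step)
        if step["action"] == "wait":
            continue
        following = actions[index + 1] if index + 1 < len(actions) else None
        if following is not None and following["action"] == "wait":
            continue  # an explicit wait already follows this step
        delay = DEFAULT_WAITS.get(step["action"])
        if delay is not None:
            padded.append({"action": "wait", "seconds": delay})
    return padded


steps = [
    {"action": "press_keys", "keys": ["Ctrl", "Shift", "p"]},
    {"action": "type_text", "text": "Terminal: Focus Terminal"},
    {"action": "press_keys", "keys": ["Return"]},
]
for step in with_waits(steps):
    print(step)
```

Slow operations such as command execution may still need a longer, explicit wait, which the helper leaves untouched.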

4. Use Batch Actions

Use run_actions() instead of separate tool calls to reduce latency and ensure ordering:

# Instead of 5 separate calls:
run_actions([
    {"action": "press_keys", "keys": ["Ctrl", "Shift", "p"]},
    {"action": "wait", "seconds": 0.5},
    {"action": "type_text", "text": "command"},
    {"action": "wait", "seconds": 0.3},
    {"action": "press_keys", "keys": ["Return"]}
])

5. SSH Scope Limitation

ssh_execute only reaches the first VM layer. For nested environments (VM → Citrix → Windows), use UI automation to type commands in the visible terminal.

6. Recovery Commands

| Problem | Solution |
| --- | --- |
| Typed in wrong window (few chars) | Escape, then u (undo in Vim) |
| Multiple lines in wrong place | Escape, then u repeated (e.g. uuuuuuu) |
| File corrupted | Escape, then :e! and Enter (reload) |
| VS Code revert | Ctrl+Shift+P → "Revert File" |

7. Common Mistakes to Avoid

  1. Typing immediately after clicking terminal (focus may not have switched)
  2. Skipping screenshots to "save time"
  3. Using ssh_execute for nested environment commands
  4. Not waiting between actions
  5. Assuming focus switched without verification

Architecture

┌─────────────┐         SSH          ┌──────────────┐
│             │ ◄──────────────────► │              │
│  MCP Server │                      │   QEMU VM    │
│   (Host)    │                      │   (Linux)    │
│             │                      │              │
└──────┬──────┘                      └──────────────┘
       │                                    │
       │ MCP Protocol                       │
       │ (stdio)                            │
       │                                    │
       ▼                                    ▼
┌─────────────┐                      xdotool, scrot
│  LLM Client │                      X11 automation
│  (Claude)   │
└─────────────┘

Network topology:

┌────────────────────────────────────────────────────┐
│  Host (192.168.122.1)                              │
│  ┌──────────┐                                      │
│  │ virbr0   │◄── NAT bridge                        │
│  └────┬─────┘                                      │
│       │                                            │
│  ┌────┴─────┐                                      │
│  │ QEMU VM  │ 192.168.122.79                       │
│  │ (manjaro)│                                      │
│  └──────────┘                                      │
└────────────────────────────────────────────────────┘

Project Structure

mcp-qemu-vm/
├── server.py           # Main MCP server (single file)
├── pyproject.toml      # Project metadata, ruff & pytest config
├── requirements.txt    # Python dependencies
├── .env.example        # Documented env var template
├── data/
│   └── projects/       # Project folders
│       └── YYYYMMDD-HHMMSS_name/
│           ├── screenshots/
│           ├── logs/
│           ├── results/
│           └── advice/
└── README.md

Troubleshooting

Cannot connect to VM

  1. Check VM is running:

    virsh -c qemu:///system list
    
  2. Check network is active:

    virsh -c qemu:///system net-list
    # If default is inactive:
    virsh -c qemu:///system net-start default
    
  3. Check VM has IP:

    virsh -c qemu:///system domifaddr <vm-name>
    
  4. Test SSH connectivity:

    ssh vmrobot@192.168.122.79
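
To separate network problems from SSH authentication problems, you can first check whether the SSH port accepts TCP connections at all. The sketch below uses only the standard library; `ssh_port_open` is a hypothetical helper, and its default timeout mirrors VM_CONNECT_TIMEOUT.

```python
import socket


def ssh_port_open(host: str, port: int = 22, timeout: float = 10.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False
```

For example, `ssh_port_open("192.168.122.79")` returning False points to the VM, its network, or sshd, while True with a failing ssh login points to keys or user setup.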
    

Mouse/keyboard not working

  • Verify xdotool is installed on VM: which xdotool
  • Check X11 display: echo $DISPLAY (should be :0)
  • Test manually: DISPLAY=:0 xdotool getmouselocation
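
The manual test prints a single line of key:value pairs such as `x:512 y:384 screen:0 window:41943046`. A minimal sketch for turning that line into numbers follows; the sample values are made up and `parse_mouse_location` is a hypothetical helper.

```python
import re


def parse_mouse_location(output: str) -> dict[str, int]:
    """Parse `xdotool getmouselocation` output into a dict of integers."""
    return {key: int(value) for key, value in re.findall(r"(\w+):(-?\d+)", output)}


sample = "x:512 y:384 screen:0 window:41943046"  # illustrative values
print(parse_mouse_location(sample))
```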

Screenshots failing / X11 Authorization Error

If you see Authorization required, but no authorization protocol specified:

Quick fix (run as X session owner on VM):

xhost +local:vmrobot

Permanent fix - Add to ~/.xprofile:

xhost +local:

Verify access:

# Check current xhost settings
DISPLAY=:0 xhost

# Should show:
# access control enabled, only authorized clients can connect
# LOCAL:
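
The expected output above can also be checked in a script. This sketch looks only for the two lines shown above and is not a general xhost parser; `summarize_xhost` is a hypothetical helper.

```python
def summarize_xhost(output: str) -> dict[str, bool]:
    """Report whether access control is on and local connections are allowed."""
    lines = [line.strip() for line in output.splitlines() if line.strip()]
    return {
        "access_control": any(l.startswith("access control enabled") for l in lines),
        "local_allowed": "LOCAL:" in lines,
    }


expected = """\
access control enabled, only authorized clients can connect
LOCAL:
"""
print(summarize_xhost(expected))  # {'access_control': True, 'local_allowed': True}
```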

VM network issues

# Restart the default network
virsh -c qemu:///system net-destroy default
virsh -c qemu:///system net-start default

# Check virbr0 bridge exists
ip addr show virbr0

License

MIT
