podium-mcp Server

Unified mobile E2E MCP: 33 tools (mobile-mcp + Maestro + RN-debugger parity)

🎙️ podium-mcp

One baton. Every instrument.

A single stdio endpoint with 33 tools for iOS-simulator device control, native UI automation, end-to-end flows, React Native debugging, and WebView DOM inspection: one connection instead of several.


Demo: one prompt, and podium drives Safari on a live iPhone 16 Pro simulator. It types github.com/hoainho, explores the profile, and opens a repository.


A podium is where a maestro stands: one place to conduct the whole orchestra. This MCP server unifies three capability sets into a single stdio endpoint: device management (via simctl, with graceful adb support); UI inspection, interaction, and declarative end-to-end flows (driven through the Maestro flow engine); and React Native debugging (Metro console logs over CDP, plus crash reports). Rather than wiring several separate MCP servers into every client config, podium-mcp exposes everything behind one connection, with a shared exec layer, consistent error handling, and a single health-check tool to confirm which toolchains are available on the host machine.

Why

Driving a React Native app end-to-end usually means juggling three MCP servers (one for device/app control, one for UI flows, one for Metro/debugger logs), each with its own config entry, its own quirks, and its own failure modes. podium-mcp collapses that into one server with:

  • a single execFile-based command runner (no shell; arguments are passed verbatim),
  • consistent structured errors (a tool never crashes the server),
  • automatic retry around Maestro's known iOS-driver flakiness,
  • graceful degradation when a toolchain (e.g. adb) is absent.
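
The first two points can be illustrated with a minimal sketch. It assumes Node's built-in child_process module; the `run` helper and the `ExecResult` shape are hypothetical names chosen for illustration and are not taken from the project's lib/exec.ts.

```typescript
import { execFile } from "node:child_process";

// Hypothetical result shape; the real lib/exec.ts may differ.
type ExecResult =
  | { ok: true; stdout: string; stderr: string }
  | { ok: false; error: string; stdout: string; stderr: string };

// Runs a binary with an explicit argument array. No shell is involved, so
// characters such as `;` or `$(...)` in an argument reach the program as data.
function run(command: string, args: string[], timeoutMs = 30_000): Promise<ExecResult> {
  return new Promise<ExecResult>((resolve) => {
    execFile(command, args, { timeout: timeoutMs, encoding: "utf8" }, (err, stdout, stderr) => {
      if (err) {
        // Resolve instead of rejecting, so a failing command becomes a
        // structured error rather than an exception in the caller.
        resolve({ ok: false, error: err.message, stdout, stderr });
      } else {
        resolve({ ok: true, stdout, stderr });
      }
    });
  });
}
```

A call such as `run("xcrun", ["simctl", "list", "--json"])` would then yield either the command output or an `ok: false` result, for example when xcrun is not installed.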

Requirements

  • macOS with Xcode command-line tools (xcrun, simctl)
  • Node.js ≥ 22 (uses native fetch and WebSocket)
  • mobilecli: bundled automatically as an npm dependency; the default native gesture + WebView backend (no separate install)
  • (optional) idb (idb + idb_companion): preferred native gesture backend when both are present; auto-detected
  • (optional) Maestro on PATH (or at ~/.maestro/bin): the run_flow engine and the gesture fallback path
  • (optional) Android SDK + adb: adb paths degrade gracefully when absent
  • (optional) a running Metro bundler for the metro_* debugging tools

Claude Code plugin

Install podium-mcp as a Claude Code plugin; no manual config is needed. One-time marketplace setup, then install:

/plugin marketplace add github:hoainho/podium-mcp
/plugin install podium-mcp@podium

Once installed, four skills are available directly in Claude Code:

| Skill | Invoke | What it does |
| --- | --- | --- |
| Device info | /podium-mcp:device-info <UDID> [<BUNDLE_ID>] | Health check, screen size, orientation, app list |
| E2E flow | /podium-mcp:e2e <UDID> <BUNDLE_ID> [path or description] | Run or author a Maestro flow |
| Bug repro | /podium-mcp:bug-repro <UDID> <BUNDLE_ID> <description> | Video + logs + crash evidence capture |
| RN debug | /podium-mcp:rn-debug [UDID] [logs\|apps\|crash\|all] | Metro logs, connected apps, crash reports |

The plugin auto-starts the MCP server (all 33 tools) when enabled. No .mcp.json edits required.

To submit this plugin to the Claude community marketplace (for discovery without the marketplace add step): claude.ai/settings/plugins/submit

Manual installation

git clone git@github.com:hoainho/podium-mcp.git
cd podium-mcp
npm install
npm run build

Usage

Register the built server with any MCP client. Claude Code (.mcp.json):

{
  "mcpServers": {
    "podium": {
      "type": "stdio",
      "command": "node",
      "args": ["/absolute/path/to/podium-mcp/dist/index.js"]
    }
  }
}

Quick manual smoke test over raw stdio (lists the registered tools):

printf '%s\n' \
  '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"smoke","version":"0"}}}' \
  '{"jsonrpc":"2.0","method":"notifications/initialized"}' \
  '{"jsonrpc":"2.0","id":2,"method":"tools/list"}' | node dist/index.js

Then call podium_health first to confirm which toolchain is available on the host.

Prompt playbook

Copy-paste prompts for common React Native testing and debugging tasks (e2e flows, test cases, feature verification, bug fixing, and device control) live in prompts/. Each prompt names the podium tools it drives and was validated against a real simulator. Start with prompts/README.md.

Capability coverage (the 5 requirements)

| # | Requirement | podium tools | Verified on a real RN app |
| --- | --- | --- | --- |
| 1 | Read all info from device | device_list, screen_size, orientation_get, app_list, app_state, podium_health | ✅ screen 1206×2622, app_list resolves bundle id + name |
| 2 | Control device | app_launch/terminate/install/uninstall, tap_on, swipe, input_text, press_key, set_location, orientation_set, open_url | ✅ tap, key, location, orientation all pass |
| 3 | Screenshot / capture | screenshot, record_start/record_stop (video) | ✅ PNG + QuickTime .mp4 |
| 4 | Make e2e | run_flow, inspect_screen, cheat_sheet + gestures | ✅ flow pass with per-step results |
| 5 | Everything behind one connection | all 33 tools below: device, automation, capture, debugging, and WebView inspection in a single endpoint | ✅ see tool catalog |

Tool reference (33 tools)

| Tool | Key params | Backing engine | Failure behavior |
| --- | --- | --- | --- |
| podium_health | (none) | which probes | Never fails; booleans for xcrun / maestro / adb |
| device_list | (none) | simctl list --json + adb devices | adb absent → android: { available: false } (graceful) |
| device_boot | udid | simctl boot | Structured tool error |
| app_install | udid, path | simctl install | Structured tool error |
| app_launch | udid, bundleId | simctl launch | Structured tool error |
| app_terminate | udid, bundleId | simctl terminate | Structured tool error |
| screenshot | udid, saveTo? | simctl io screenshot | Returns path + byteSize (no base64 bloat) |
| open_url | udid, url | simctl openurl | Structured tool error |
| set_location | udid, latitude, longitude | simctl location set | Codifies the QA geo-spinner fix |
| app_state | udid, bundleId | simctl listapps + launchctl list | { installed, running } |
| app_list | udid | simctl listapps + plutil JSON | { count, apps: [{bundleId, name, type}] } |
| app_uninstall | udid, bundleId | simctl uninstall | Structured tool error |
| screen_size | udid | simctl io screenshot + sips | { widthPx, heightPx } (real pixels) |
| orientation_get | udid | native query (mobilecli/idb) → screenshot fallback | { orientation, basis } (exact when native; heuristic otherwise) |
| orientation_set | udid, bundleId, value | native (mobilecli) → Maestro fallback | PORTRAIT / LANDSCAPE_LEFT / LANDSCAPE_RIGHT / UPSIDE_DOWN |
| record_start | udid, saveTo? (.mp4) | detached simctl io recordVideo | { ok, path, pid }; one recording per udid |
| record_stop | udid | SIGINT the recorder + flush | { ok, path, sizeBytes } |
| inspect_screen | udid, compact? | native flat AX list (idb/mobilecli) → maestro hierarchy | compact:true (default) returns only meaningful nodes |
| tap_on | udid, bundleId, text\|id\|x+y, double?, long? | native tap (idb/mobilecli) → Maestro fallback | text/id resolved via the element list; reports backend |
| input_text | udid, bundleId, text, submit? | native (idb/mobilecli) → Maestro fallback | reports backend |
| swipe | udid, bundleId, direction, start/end? | native (idb/mobilecli) → Maestro fallback | %/pixel overrides resolved vs logical screen size |
| press_key | udid, bundleId, key | native (idb/mobilecli) → Maestro fallback | back/power/tab are Android-only |
| tap_with_fallback | udid, x, y, bundleId?, maxRetries?, offsetStep? | native tap + before/after screenshot diff | Retries at y - offsetStep until the screen changes; for WebGL/Canvas overlays |
| notification_bar_clear | udid, bundleId? | native tap at (50,850) + screenshot diff | Dismisses the RN debug notification bar |
| run_flow | udid + exactly one of yaml/files/dir(+tags), env? | maestro test | Exactly-one-of validated before exec; per-step results |
| cheat_sheet | (none) | bundled assets/maestro-cheat-sheet.yaml | Fully offline |
| webview_inspect | udid, selector?, webviewId?, max? | mobilecli (CDP) | Resolves a CSS selector to DOM elements with absolute tapX/tapY for tap_on; first visible WebView when webviewId omitted |
| webview_eval | udid, expression, webviewId? | mobilecli (CDP) | Evaluates JS in the WebView page context (read location.href, store state, balances) |
| webview_navigate | udid, action (goto\|back\|forward\|reload), url?, webviewId? | mobilecli (CDP) | Drives WebView navigation |
| metro_apps | port? (8081) | GET http://localhost:<port>/json | Metro down → structured "metro not running" |
| metro_logs | webSocketDebuggerUrl? / port?, durationMs?, maxLogs? | native WebSocket + CDP Runtime.enable | Auto-discovers first app when URL omitted |
| crash_list | processName?, sinceHours?, udid? | ~/Library/Logs/DiagnosticReports + sim container | Empty list when dir unreadable |
| crash_get | id, udid? | same | Path-traversal-safe (basename only); truncates honestly |

WebView tools (webview_inspect/eval/navigate) use the bundled mobilecli over CDP, not the idb or Maestro paths, and require the app's WKWebView.isInspectable = true (default in debug/staging builds; usually disabled in production App Store builds).
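
The run_flow row says that exactly one of yaml/files/dir is validated before anything is executed. A minimal sketch of such a check follows; the `RunFlowInput` shape and the `validateRunFlow` name are assumptions made for illustration, not the project's actual schema.

```typescript
// Hypothetical input shape derived from the run_flow row of the table.
interface RunFlowInput {
  udid: string;
  yaml?: string;
  files?: string[];
  dir?: string;
  tags?: string[];
  env?: Record<string, string>;
}

type FlowSource = "yaml" | "files" | "dir";

type Validation =
  | { ok: true; source: FlowSource }
  | { ok: false; error: string };

// Rejects the call unless exactly one flow source is present, so a
// malformed request never reaches the flow engine.
function validateRunFlow(input: RunFlowInput): Validation {
  const provided: FlowSource[] = [];
  if (input.yaml !== undefined) provided.push("yaml");
  if (input.files !== undefined) provided.push("files");
  if (input.dir !== undefined) provided.push("dir");

  const [source, ...extra] = provided;
  if (source === undefined) {
    return { ok: false, error: "expected exactly one of yaml/files/dir, got none" };
  }
  if (extra.length > 0) {
    return { ok: false, error: `expected exactly one of yaml/files/dir, got ${provided.join(", ")}` };
  }
  return { ok: true, source };
}
```

Returning a result object instead of throwing keeps the check in line with the structured-error behavior listed for the other tools.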

Native-first gesture backend

Imperative gestures (tap_on, input_text, swipe, press_key, orientation_set) and inspect_screen route through the fastest available backend, probed once and cached:

  1. idb: used when both idb and idb_companion are installed (native, fastest).
  2. mobilecli: the bundled npm dependency (prebuilt Go binary). Default backend; no install needed.
  3. Maestro fallback: used when no native backend resolves, or for actions a native backend can't express (double/long-press, UPSIDE_DOWN). The gesture generates a minimal flow with launchApp: { stopApp: false }, foregrounding the app without restarting it so state is preserved.

Each result reports the backend it used. Set PODIUM_DISABLE_NATIVE=1 to force the Maestro path. Eliminating the per-gesture JVM spin-up cut tap_on from ~14.7 s to ~0.6 s and inspect_screen from ~8.9 s to ~0.9 s on an iPhone 16 Pro simulator. Run npm run benchmark for a full 33-tool pass/fail sweep.
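
The selection order above can be sketched as a small pure function plus a one-time cache. This is an illustration under assumptions: the `Probe` shape and the function names are hypothetical, and the real logic in lib/native.ts may differ (for instance, per-action fallbacks such as long-press are not modeled here).

```typescript
type Backend = "idb" | "mobilecli" | "maestro";

// Hypothetical probe result: which binaries were found on the host.
interface Probe {
  idb: boolean;
  idbCompanion: boolean;
  mobilecli: boolean;
}

type Env = Record<string, string | undefined>;

// Applies the documented priority: idb, then mobilecli, then Maestro.
function pickBackend(probe: Probe, env: Env): Backend {
  if (env["PODIUM_DISABLE_NATIVE"] === "1") return "maestro";
  if (probe.idb && probe.idbCompanion) return "idb";
  if (probe.mobilecli) return "mobilecli";
  return "maestro";
}

// Probes once and reuses the answer for every later gesture.
function cachedBackend(probeFn: () => Probe, env: Env): () => Backend {
  let cached: Backend | undefined;
  return () => {
    if (cached === undefined) {
      cached = pickBackend(probeFn(), env);
    }
    return cached;
  };
}
```

Keeping the decision in a pure function makes the priority order easy to unit-test without a simulator.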

Maestro fallback: idb flakiness retry

When the Maestro fallback path runs, its iOS driver intermittently fails with Failed to connect to 127.0.0.1:<port> / java.net.ConnectException. Flow executions automatically retry up to 2 times with 2 s / 5 s backoff and report the retry count. If the failure persists, the structured result includes the raw output; the usual remedies are rebooting the simulator (device_boot after shutdown) or restarting the Maestro daemon.
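
A minimal sketch of that retry policy, assuming the two error strings quoted above are what marks a failure as transient. The `FlowAttempt` shape, the `retries` field, and the injectable `sleep` parameter are hypothetical and exist only to keep the example self-contained and testable.

```typescript
interface FlowAttempt {
  ok: boolean;
  output: string;
}

interface FlowResult extends FlowAttempt {
  retries: number;
}

// Wait 2 s before the first retry and 5 s before the second.
const BACKOFF_MS: readonly number[] = [2000, 5000];

// Treats only the known iOS-driver connection errors as retryable.
function isDriverFlake(output: string): boolean {
  return (
    output.includes("Failed to connect to 127.0.0.1") ||
    output.includes("java.net.ConnectException")
  );
}

async function runWithRetry(
  attempt: () => Promise<FlowAttempt>,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<FlowResult> {
  let retries = 0;
  for (;;) {
    const result = await attempt();
    const delay = BACKOFF_MS[retries];
    // Stop on success, on a non-transient failure, or when retries run out.
    if (result.ok || !isDriverFlake(result.output) || delay === undefined) {
      return { ...result, retries };
    }
    await sleep(delay);
    retries += 1;
  }
}
```

With this shape, a flow that keeps hitting the driver error is attempted three times in total, and the final result carries `retries: 2` together with the raw output of the last attempt.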

Documented limits (by design, not bugs)

  • WebGL/Canvas content is un-automatable by selector (no DOM/hierarchy); use tap_with_fallback with screenshot-derived coordinates.
  • inspect_screen sees only the native layer for WebView content; use webview_inspect to resolve WKWebView DOM elements to tap coordinates (requires isInspectable = true).
  • WebView tools are dev/QA only; production App Store builds typically set WKWebView.isInspectable = false.
  • No Android SDK assumption: every adb-backed path degrades to a structured "adb not found" result instead of failing.
  • WebView content-process memory is unreadable from the app sandbox (iOS/Android platform limit); use indirect signals (memory warnings, process terminations).
  • Maestro text: matcher is full-string regex (IGNORE_CASE); partial strings don't match, so copy hierarchy text verbatim or anchor with .*.
  • record_start/record_stop keep recorder state in-process (one Map per udid, server is long-lived). Clients must serialize start → … → stop on the same connection; firing both in one un-awaited batch races. One active recording per udid.
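
The full-string matching rule is easy to trip over, so here is a small approximation in JavaScript regex terms. This is only an illustration: Maestro evaluates patterns on the JVM, whose regex dialect differs from JavaScript's in some details, and `matchesFully` is a hypothetical helper, not part of podium-mcp.

```typescript
// Approximates a full-string, case-insensitive match by anchoring the
// pattern at both ends.
function matchesFully(pattern: string, elementText: string): boolean {
  return new RegExp(`^(?:${pattern})$`, "i").test(elementText);
}

const partial = matchesFully("Sign", "Sign in");       // false: partial text
const anchored = matchesFully("Sign.*", "Sign in");    // true: .* covers the rest
const verbatim = matchesFully("sign in", "Sign In");   // true: case is ignored
```

In practice this means a selector copied from inspect_screen output works as-is, while a shortened label needs a trailing .* to match.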

Architecture

src/
  index.ts          # MCP server entry: registers every tool group, warms caches
  lib/
    exec.ts         # execFile-based command runner (NO shell) + commandExists
    result.ts       # shared ok/error MCP content helpers
    simctl.ts       # xcrun simctl wrappers + device-list TTL cache
    native.ts       # gesture/inspect backend abstraction: idb → mobilecli → null
    idb.ts          # idb gesture/inspect adapter
    gesture.ts      # nativeTap hybrid (backend → Maestro fallback)
    maestro.ts      # Maestro engine: flow runner, idb retry, hierarchy
    webview.ts      # mobilecli CDP: WebView list/inspect/eval/navigate
    metro.ts        # Metro CDP: app discovery + console log capture
    crash.ts        # DiagnosticReports crash listing/reading
    recording.ts    # detached screen recording lifecycle
  tools/            # one file per group (health, device, screen, flow, debug, webview)
assets/             # bundled offline Maestro cheat sheet
scripts/            # benchmark.ts (33-tool e2e), compare-mcps.ts
docs/               # tool catalog + e2e transcript

Development

npm run build       # tsc
npm run typecheck   # tsc --noEmit
npm test            # vitest run (61 tests; exec/network layer mocked, no sim needed)

Standards: TypeScript strict, no as any / @ts-ignore, no shell execution (all commands via lib/exec.ts), tools return structured errors instead of throwing. See CONTRIBUTING.md for the full guide and the "add a new tool" checklist.

Full tool catalog

See docs/tool-catalog.md for the authoritative tool-by-tool reference: every tool with its parameters, backing engine, and fallback behavior, plus the items deferred to a future version (cloud execution, live viewer).

Verified end-to-end

See docs/e2e-demo.md for a real transcript against a booted iPhone 16 Pro simulator running a production React Native app.

Platform note: macOS + iOS Simulator is the primary target. Android degrades gracefully: tools check for adb at runtime and return informative errors when the Android SDK is absent rather than failing hard.

Design ideas

podium-mcp is built around a few deliberate principles:

  • One podium, one connection. A single server fronts every mobile capability (device, UI, flows, capture, debugging, and WebView inspection), so an agent configures one endpoint and discovers all 33 tools at once instead of stitching together several servers.
  • Safe by construction. Every external command runs through an execFile layer with an explicit argument array, never a shell string, so tool inputs (udids, paths, selectors, flow YAML) can't be interpreted as commands.
  • Never crash the conductor. Tools return structured results and errors instead of throwing; one bad call can't take the server down.
  • Degrade, don't fail. A missing toolchain (e.g. Android's adb) yields an informative result rather than a hard error.
  • Resilient automation. Flaky simulator drivers are retried with backoff, and every result reports exactly what happened (including the retry count).

How to use it, in order

  1. podium_health: confirm xcrun and maestro are available on the host.
  2. device_list: pick a booted simulator udid.
  3. Read state: app_list, app_state, screen_size, orientation_get.
  4. Drive the device: app_launch, then tap_on / input_text / swipe / press_key, plus set_location and orientation_set.
  5. Author end-to-end checks: inspect_screen to discover element text/ids, then run_flow (inline YAML or a .maestro file).
  6. Capture & debug: screenshot or record_start → record_stop for video; metro_logs for live RN console output; crash_list / crash_get for diagnostics.
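
The order above can be sketched as a sequence of MCP tools/call requests. The tool and parameter names come from the tool reference; the `buildSession` helper, the request ids, and the "Sign in" label are hypothetical. The sketch also assumes the udid is already known, whereas a real session would read it from the device_list response and send each request only after the previous one has answered.

```typescript
interface ToolCall {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Builds the requests in the recommended order: health check first, then
// discovery, state, interaction, and capture.
function buildSession(udid: string, bundleId: string): ToolCall[] {
  const steps: Array<[string, Record<string, unknown>]> = [
    ["podium_health", {}],
    ["device_list", {}],
    ["app_state", { udid, bundleId }],
    ["app_launch", { udid, bundleId }],
    ["inspect_screen", { udid }],
    ["tap_on", { udid, bundleId, text: "Sign in" }], // hypothetical element text
    ["screenshot", { udid }],
  ];
  return steps.map(
    ([name, args], index): ToolCall => ({
      jsonrpc: "2.0",
      id: index + 1,
      method: "tools/call",
      params: { name, arguments: args },
    }),
  );
}
```

Each element serializes to one JSON line with `JSON.stringify`, which matches the newline-delimited framing used in the stdio smoke test.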

Contributing

Contributions are welcome; see CONTRIBUTING.md and our Code of Conduct. Use the issue templates for bugs and feature requests.

Security

Please report vulnerabilities privately per SECURITY.md; do not open a public issue.

License

MIT © 2026 hoainho