podium-mcp Server

Unified mobile E2E MCP: 33 tools (mobile-mcp + Maestro + RN-debugger parity)

🎙️ podium-mcp

One baton. Every instrument.

A single stdio endpoint with 33 tools for iOS-simulator device control, native UI automation, end-to-end flows, React Native debugging, and WebView DOM inspection — one connection instead of several.

Demo: one prompt → podium drives Safari live → types the URL → explores the profile → opens a repo. Footage captured on a live iPhone 16 Pro simulator.

A podium is where a maestro stands: one place to conduct the whole orchestra. This MCP server unifies three capability sets into a single stdio endpoint: (1) device management (via simctl, with adb paths that degrade gracefully when the Android SDK is absent); (2) UI inspection, interaction, and declarative end-to-end flows (driven through the Maestro flow engine); and (3) React Native debugging (Metro console logs over CDP, plus crash reports). Rather than wiring several separate MCP servers into every client config, podium-mcp exposes everything behind one connection, with a shared exec layer, consistent error handling, and a single health-check tool that reports which toolchains are available on the host machine.

Why
Requirements
Installation
Usage
Prompt playbook
Capability coverage
Tool reference
Documented limits
Architecture
Development
Full tool catalog
Verified end-to-end
Design ideas
Contributing
Security
License

Why

Driving a React Native app end-to-end usually means juggling three MCP servers — one for device/app control, one for UI flows, one for Metro/debugger logs — each with its own config entry, its own quirks, and its own failure modes. podium-mcp collapses that into one server with:

a single execFile-based command runner (no shell — arguments are passed verbatim),
consistent structured errors (a tool never crashes the server),
automatic retry around Maestro's known iOS-driver flakiness,
graceful degradation when a toolchain (e.g. adb) is absent.
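The first two points can be pictured with a small sketch. It is not podium-mcp's actual lib/exec.ts: the function name `runCommand` and the `CommandResult` shape are assumptions made for illustration, and it uses Node's synchronous `execFileSync` only to keep the example short. Arguments go in as an array, so no shell ever parses them, and every failure is converted into a plain result object instead of propagating.

```typescript
import { execFileSync } from "node:child_process";

// Hypothetical result shape; the real server's types may differ.
type CommandResult =
  | { ok: true; stdout: string }
  | { ok: false; error: string };

// Run a binary with an explicit argument array. No shell is involved,
// so characters such as ";" or "$(...)" in args are passed through verbatim.
function runCommand(file: string, args: string[]): CommandResult {
  try {
    const stdout = execFileSync(file, args, { encoding: "utf8" });
    return { ok: true, stdout };
  } catch (err) {
    // A missing binary or a non-zero exit both end up here.
    const message = err instanceof Error ? err.message : String(err);
    return { ok: false, error: message };
  }
}

// The last argument looks like shell syntax but reaches the child as one string.
const echoed = runCommand(process.execPath, [
  "-e",
  "process.stdout.write(process.argv[1])",
  "a; echo b",
]);

// A binary that does not exist yields a structured error, not an exception.
const missing = runCommand("podium-no-such-binary", []);
```

Because the caller always receives a `CommandResult`, a tool built on top of it can report a missing toolchain as ordinary data.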

Requirements

macOS with Xcode command-line tools (xcrun, simctl)
Node.js ≥ 22 (uses native fetch and WebSocket)
mobilecli — bundled automatically as an npm dependency; the default native gesture + WebView backend (no separate install)
(optional) idb (idb + idb_companion) — preferred native gesture backend when both are present; auto-detected
(optional) Maestro on PATH (or at ~/.maestro/bin) — the run_flow engine and the gesture fallback path
(optional) Android SDK + adb — adb paths degrade gracefully when absent
(optional) a running Metro bundler for the metro_* debugging tools

Claude Code plugin

Install podium-mcp as a Claude Code plugin — no manual config needed. One-time marketplace setup, then install:

/plugin marketplace add github:hoainho/podium-mcp
/plugin install podium-mcp@podium

Once installed, four skills are available directly in Claude Code:

Skill	Invoke	What it does
Device info	`/podium-mcp:device-info <UDID> [<BUNDLE_ID>]`	Health check, screen size, orientation, app list
E2E flow	`/podium-mcp:e2e <UDID> <BUNDLE_ID> [path or description]`	Run or author a Maestro flow
Bug repro	`/podium-mcp:bug-repro <UDID> <BUNDLE_ID> <description>`	Video + logs + crash evidence capture
RN debug	`/podium-mcp:rn-debug [UDID] [logs|apps|crash|all]`	Metro logs, connected apps, crash reports

The plugin auto-starts the MCP server (all 33 tools) when enabled. No .mcp.json edits required.

To submit this plugin to the Claude community marketplace (for discovery without the marketplace add step): claude.ai/settings/plugins/submit

Manual installation

git clone git@github.com:hoainho/podium-mcp.git
cd podium-mcp
npm install
npm run build

Usage

{
  "mcpServers": {
    "podium": {
      "type": "stdio",
      "command": "node",
      "args": ["/absolute/path/to/podium-mcp/dist/index.js"]
    }
  }
}

Quick manual smoke test over raw stdio (lists the registered tools):

printf '%s\n' \
  '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"smoke","version":"0"}}}' \
  '{"jsonrpc":"2.0","method":"notifications/initialized"}' \
  '{"jsonrpc":"2.0","id":2,"method":"tools/list"}' | node dist/index.js

Then call podium_health first to confirm which toolchain is available on the host.
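The smoke test above stops at tools/list. Invoking a tool uses the MCP `tools/call` method with a `name` and an `arguments` object. The sketch below builds those requests as newline-delimited JSON; the helper names `toolCall` and `toStdioLines` are hypothetical, and the ids simply continue after the handshake messages shown earlier.

```typescript
// One JSON-RPC 2.0 request as sent over stdio (one JSON object per line).
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: Record<string, unknown>;
}

// Hypothetical helper: build an MCP tools/call request for a named tool.
function toolCall(
  id: number,
  name: string,
  args: Record<string, unknown> = {},
): JsonRpcRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name, arguments: args },
  };
}

// Serialize requests as newline-delimited JSON, ready to write to the server's stdin.
function toStdioLines(requests: JsonRpcRequest[]): string {
  return requests.map((r) => JSON.stringify(r)).join("\n") + "\n";
}

// podium_health takes no parameters; device_list is the natural next call.
const payload = toStdioLines([
  toolCall(3, "podium_health"),
  toolCall(4, "device_list"),
]);
```

These lines are only meaningful after the initialize request and the notifications/initialized message have been sent on the same connection.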

Prompt playbook

Copy-paste prompts for common React Native testing & debugging tasks — e2e flows, test cases, feature verification, bug fixing, and device control — live in prompts/. Each prompt names the podium tools it drives and was validated against a real simulator. Start with prompts/README.md.

Capability coverage (the 5 requirements)

#	Requirement	podium tools	Verified on a real RN app
1	Read all info from device	`device_list`, `screen_size`, `orientation_get`, `app_list`, `app_state`, `podium_health`	✅ screen 1206×2622, `app_list` resolves bundle id + name
2	Control device	`app_launch/terminate/install/uninstall`, `tap_on`, `swipe`, `input_text`, `press_key`, `set_location`, `orientation_set`, `open_url`	✅ tap, key, location, orientation all pass
3	Screenshot / capture	`screenshot`, `record_start`/`record_stop` (video)	✅ PNG + QuickTime `.mp4`
4	Make e2e	`run_flow`, `inspect_screen`, `cheat_sheet` + gestures	✅ flow pass with per-step results
5	Everything behind one connection	all 33 tools below — device, automation, capture, debugging, and WebView inspection in a single endpoint	✅ see tool catalog

Tool reference (33 tools)

Tool	Key params	Backing engine	Failure behavior
`podium_health`	—	`which` probes	Never fails; booleans for xcrun / maestro / adb
`device_list`	—	`simctl list --json` + `adb devices`	adb absent → `android: { available: false }` (graceful)
`device_boot`	udid	`simctl boot`	Structured tool error
`app_install`	udid, path	`simctl install`	Structured tool error
`app_launch`	udid, bundleId	`simctl launch`	Structured tool error
`app_terminate`	udid, bundleId	`simctl terminate`	Structured tool error
`screenshot`	udid, saveTo?	`simctl io screenshot`	Returns path + byteSize (no base64 bloat)
`open_url`	udid, url	`simctl openurl`	Structured tool error
`set_location`	udid, latitude, longitude	`simctl location set`	Codifies the QA geo-spinner fix
`app_state`	udid, bundleId	`simctl listapps` + `launchctl list`	`{ installed, running }`
`app_list`	udid	`simctl listapps` + `plutil` JSON	`{ count, apps: [{bundleId, name, type}] }`
`app_uninstall`	udid, bundleId	`simctl uninstall`	Structured tool error
`screen_size`	udid	`simctl io screenshot` + `sips`	`{ widthPx, heightPx }` (real pixels)
`orientation_get`	udid	native query (`mobilecli`/`idb`) → screenshot fallback	`{ orientation, basis }` (exact when native; heuristic otherwise)
`orientation_set`	udid, bundleId, value	native (`mobilecli`) → Maestro fallback	PORTRAIT / LANDSCAPE_LEFT / LANDSCAPE_RIGHT / UPSIDE_DOWN
`record_start`	udid, saveTo? (.mp4)	detached `simctl io recordVideo`	`{ ok, path, pid }`; one recording per udid
`record_stop`	udid	SIGINT the recorder + flush	`{ ok, path, sizeBytes }`
`inspect_screen`	udid, compact?	native flat AX list (`idb`/`mobilecli`) → `maestro hierarchy`	`compact:true` (default) returns only meaningful nodes
`tap_on`	udid, bundleId, text|id|x+y, double?, long?	native tap (`idb`/`mobilecli`) → Maestro fallback	text/id resolved via the element list; reports `backend`
`input_text`	udid, bundleId, text, submit?	native (`idb`/`mobilecli`) → Maestro fallback	reports `backend`
`swipe`	udid, bundleId, direction, start/end?	native (`idb`/`mobilecli`) → Maestro fallback	%/pixel overrides resolved vs logical screen size
`press_key`	udid, bundleId, key	native (`idb`/`mobilecli`) → Maestro fallback	back/power/tab are Android-only
`tap_with_fallback`	udid, x, y, bundleId?, maxRetries?, offsetStep?	native tap + before/after screenshot diff	Retries at `y - offsetStep` until the screen changes; for WebGL/Canvas overlays
`notification_bar_clear`	udid, bundleId?	native tap at (50,850) + screenshot diff	Dismisses the RN debug notification bar
`run_flow`	udid + exactly one of yaml/files/dir(+tags), env?	`maestro test`	Exactly-one-of validated before exec; per-step results
`cheat_sheet`	—	bundled `assets/maestro-cheat-sheet.yaml`	Fully offline
`webview_inspect`	udid, selector?, webviewId?, max?	`mobilecli` (CDP)	Resolves a CSS selector to DOM elements with absolute `tapX`/`tapY` for `tap_on`; first visible WebView when `webviewId` omitted
`webview_eval`	udid, expression, webviewId?	`mobilecli` (CDP)	Evaluates JS in the WebView page context (read `location.href`, store state, balances)
`webview_navigate`	udid, action (`goto`|`back`|`forward`|`reload`), url?, webviewId?	`mobilecli` (CDP)	Drives WebView navigation
`metro_apps`	port? (8081)	GET `http://localhost:<port>/json`	Metro down → structured `metro not running`
`metro_logs`	webSocketDebuggerUrl? / port?, durationMs?, maxLogs?	native WebSocket + CDP `Runtime.enable`	Auto-discovers first app when URL omitted
`crash_list`	processName?, sinceHours?, udid?	`~/Library/Logs/DiagnosticReports` + sim container	Empty list when dir unreadable
`crash_get`	id, udid?	same	Path-traversal-safe (basename only); truncates honestly
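The run_flow row says that exactly one of yaml, files, or dir is validated before anything is executed. A minimal sketch of such a check follows. Only the field names come from the table; the types, the error messages, and the rule that tags is accepted only alongside dir (one reading of `dir(+tags)`) are assumptions.

```typescript
// Hypothetical input shape for run_flow.
interface FlowInput {
  udid: string;
  yaml?: string;
  files?: string[];
  dir?: string;
  tags?: string[];
}

type FlowSource = "yaml" | "files" | "dir";

type Validation =
  | { ok: true; source: FlowSource }
  | { ok: false; error: string };

// Reject the call before any process is spawned unless exactly one source is set.
function validateFlowInput(input: FlowInput): Validation {
  const provided: FlowSource[] = [];
  if (input.yaml !== undefined) provided.push("yaml");
  if (input.files !== undefined) provided.push("files");
  if (input.dir !== undefined) provided.push("dir");

  const source = provided[0];
  if (provided.length !== 1 || source === undefined) {
    const got = provided.length === 0 ? "none" : provided.join(", ");
    return { ok: false, error: `expected exactly one of yaml, files, dir; got ${got}` };
  }
  // Assumption: tags only make sense when filtering a directory of flows.
  if (input.tags !== undefined && source !== "dir") {
    return { ok: false, error: "tags is only valid together with dir" };
  }
  return { ok: true, source };
}
```

Validating up front keeps a malformed request from ever reaching `maestro test`, so the caller gets an immediate structured error rather than a slow CLI failure.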

WebView tools (webview_inspect/eval/navigate) use the bundled mobilecli over CDP — not the idb or Maestro paths — and require the app's WKWebView.isInspectable = true (default in debug/staging builds; usually disabled in production App Store builds).
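To illustrate how a webview_inspect result can feed tap_on, here is a rough sketch. Only `tapX` and `tapY` are named in the table above; the other fields of `WebViewElement` and the helper `tapArgsFor` are hypothetical, so treat the shape as a placeholder for whatever the tool actually returns.

```typescript
// Assumed shape of one element returned by webview_inspect.
interface WebViewElement {
  selector: string;
  visible: boolean;
  tapX: number;
  tapY: number;
}

// Arguments for tap_on in its x+y form (udid, bundleId, x, y per the tool table).
interface TapOnArgs {
  udid: string;
  bundleId: string;
  x: number;
  y: number;
}

// Pick the first visible match and turn its absolute coordinates into tap_on arguments.
function tapArgsFor(
  elements: WebViewElement[],
  udid: string,
  bundleId: string,
): TapOnArgs | null {
  const target = elements.find((el) => el.visible);
  if (!target) return null; // nothing tappable: let the caller report it
  return { udid, bundleId, x: Math.round(target.tapX), y: Math.round(target.tapY) };
}
```

Returning `null` instead of throwing leaves the decision to the caller, for example scrolling and inspecting again.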

Native-first gesture backend

Imperative gestures (tap_on, input_text, swipe, press_key, orientation_set) and inspect_screen route through the fastest available backend, probed once and cached:

idb — used when both idb and idb_companion are installed (native, fastest).
mobilecli — the bundled npm dependency (prebuilt Go binary). Default backend; no install needed.
Maestro fallback — when no native backend resolves, or for actions a native backend can't express (double/long-press, UPSIDE_DOWN). The gesture generates a minimal flow with launchApp: { stopApp: false }, foregrounding the app without restarting so state is preserved.

Each result reports the backend it used. Set PODIUM_DISABLE_NATIVE=1 to force the Maestro path. Eliminating the per-gesture JVM spin-up cut tap_on from ~14.7 s to ~0.6 s and inspect_screen from ~8.9 s to ~0.9 s on an iPhone 16 Pro simulator. Run npm run benchmark for a full 33-tool pass/fail sweep.
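The routing rules in this section can be summarized in a few lines. This is a sketch of the decision logic only, not the code in lib/native.ts; the `Probe` and `GestureRequest` shapes and both function names are assumptions. The probe result is passed in as a value, so caching it is left to the caller.

```typescript
type Backend = "idb" | "mobilecli" | "maestro";

// What the one-time probe found on the host (field names are hypothetical).
interface Probe {
  hasIdb: boolean;
  hasIdbCompanion: boolean;
  hasMobilecli: boolean;
}

interface GestureRequest {
  action: "tap" | "input" | "swipe" | "key" | "orientation";
  double?: boolean;
  long?: boolean;
  orientation?: "PORTRAIT" | "LANDSCAPE_LEFT" | "LANDSCAPE_RIGHT" | "UPSIDE_DOWN";
}

type Env = Record<string, string | undefined>;

// Native backend in priority order, or null when none is usable.
function nativeBackend(probe: Probe, env: Env): Backend | null {
  if (env.PODIUM_DISABLE_NATIVE === "1") return null;
  if (probe.hasIdb && probe.hasIdbCompanion) return "idb";
  if (probe.hasMobilecli) return "mobilecli";
  return null;
}

// Some actions are routed to Maestro even when a native backend exists.
function chooseBackend(req: GestureRequest, probe: Probe, env: Env = {}): Backend {
  const needsMaestro =
    req.double === true || req.long === true || req.orientation === "UPSIDE_DOWN";
  if (needsMaestro) return "maestro";
  return nativeBackend(probe, env) ?? "maestro";
}
```

For example, a plain tap on a host with idb and idb_companion resolves to "idb", while the same tap with `long: true` resolves to "maestro".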

Maestro fallback — idb flakiness retry

When the Maestro fallback path runs, its iOS driver intermittently fails with Failed to connect to 127.0.0.1:<port> / java.net.ConnectException. Flow executions retry automatically up to 2 times, waiting 2 s and then 5 s, and report the number of retries in the result. If the failure persists, the structured error includes the raw output. The usual remedies are rebooting the simulator (shut it down, then device_boot) or restarting the Maestro daemon.
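The retry policy can be sketched as a small loop. The delays and the two error strings come from the paragraph above; everything else (the function names, the `FlowAttempt` shape, and the injectable `sleep`) is an assumption made so the loop can be exercised without a simulator or real waiting.

```typescript
// Delays before retry 1 and retry 2, matching the backoff described above.
const BACKOFF_MS = [2000, 5000];

// Heuristic: treat the two connection errors quoted above as retryable.
function isDriverConnectFailure(output: string): boolean {
  return (
    output.includes("Failed to connect to 127.0.0.1:") ||
    output.includes("java.net.ConnectException")
  );
}

interface FlowAttempt {
  ok: boolean;
  output: string;
}

interface FlowResult extends FlowAttempt {
  retries: number;
}

// Run `attempt`, retrying only on the known transient failure.
async function runWithRetry(
  attempt: () => Promise<FlowAttempt>,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<FlowResult> {
  let retries = 0;
  for (;;) {
    const result = await attempt();
    const retryable = !result.ok && isDriverConnectFailure(result.output);
    const delay = BACKOFF_MS[retries];
    if (!retryable || delay === undefined) {
      // Success, a non-transient failure, or retries exhausted: raw output is kept.
      return { ...result, retries };
    }
    await sleep(delay);
    retries += 1;
  }
}
```

A failure that does not match the connection errors is returned immediately with `retries: 0`, so genuine flow failures are not slowed down by the backoff.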

Documented limits (by design, not bugs)

WebGL/Canvas content is un-automatable by selector — no DOM/hierarchy; use tap_with_fallback with screenshot-derived coordinates.
inspect_screen sees only the native layer for WebView content — use webview_inspect to resolve WKWebView DOM elements to tap coordinates (requires isInspectable = true).
WebView tools are dev/QA only — production App Store builds typically set WKWebView.isInspectable = false.
No Android SDK assumption — every adb-backed path degrades to a structured "adb not found" result instead of failing.
WebView content-process memory is unreadable from the app sandbox (iOS/Android platform limit) — use indirect signals (memory warnings, process terminations).
Maestro text: matcher is full-string regex (IGNORE_CASE) — partial strings don't match; copy hierarchy text verbatim or anchor with .*.
record_start/record_stop keep recorder state in-process (one Map per udid, server is long-lived). Clients must serialize start → … → stop on the same connection; firing both in one un-awaited batch races. One active recording per udid.
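The full-string matching rule is easy to trip over, so here is a hedged helper for building a "contains" pattern from a partial string. `fullStringMatch` is only a local stand-in that mirrors the semantics described above (whole string, case-insensitive) using JavaScript regular expressions; it is not Maestro's matcher, and all three function names are hypothetical.

```typescript
// Escape characters that have special meaning in a regular expression.
function escapeRegex(literal: string): string {
  return literal.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build a text: pattern that matches any element whose text contains `partial`.
function containsPattern(partial: string): string {
  return `.*${escapeRegex(partial)}.*`;
}

// Local stand-in for a full-string, case-insensitive match, for checking patterns offline.
function fullStringMatch(pattern: string, text: string): boolean {
  return new RegExp(`^(?:${pattern})$`, "i").test(text);
}

// "Sign" alone does not match "Sign in" under full-string rules...
const bareMatches = fullStringMatch("Sign", "Sign in");
// ...but the anchored pattern does, and literal "$" and "." are escaped safely.
const wrappedMatches = fullStringMatch(containsPattern("Sign"), "Please sign in");
const pricePattern = containsPattern("$9.99");
```

When the pattern is placed in a YAML double-quoted string, backslashes need doubling; single-quoted YAML strings avoid that.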

Architecture

src/
  index.ts          # MCP server entry — registers every tool group, warms caches
  lib/
    exec.ts         # execFile-based command runner (NO shell) + commandExists
    result.ts       # shared ok/error MCP content helpers
    simctl.ts       # xcrun simctl wrappers + device-list TTL cache
    native.ts       # gesture/inspect backend abstraction: idb → mobilecli → null
    idb.ts          # idb gesture/inspect adapter
    gesture.ts      # nativeTap hybrid (backend → Maestro fallback)
    maestro.ts      # Maestro engine: flow runner, idb retry, hierarchy
    webview.ts      # mobilecli CDP — WebView list/inspect/eval/navigate
    metro.ts        # Metro CDP — app discovery + console log capture
    crash.ts        # DiagnosticReports crash listing/reading
    recording.ts    # detached screen recording lifecycle
  tools/            # one file per group (health, device, screen, flow, debug, webview)
assets/             # bundled offline Maestro cheat sheet
scripts/            # benchmark.ts (33-tool e2e), compare-mcps.ts
docs/               # tool catalog + e2e transcript

Development

npm run build       # tsc
npm run typecheck   # tsc --noEmit
npm test            # vitest run (61 tests; exec/network layer mocked — no sim needed)

Standards: TypeScript strict, no as any / @ts-ignore, no shell execution (all commands via lib/exec.ts), tools return structured errors instead of throwing. See CONTRIBUTING.md for the full guide and the "add a new tool" checklist.

Full tool catalog

See docs/tool-catalog.md for the authoritative tool-by-tool reference — every tool with its parameters, backing engine, and fallback behavior, plus the items deferred to a future version (cloud execution, live viewer).

Verified end-to-end

See docs/e2e-demo.md for a real transcript against a booted iPhone 16 Pro simulator running a production React Native app.

Platform note: macOS + iOS Simulator is the primary target. Android degrades gracefully — tools check for adb at runtime and return informative errors when the Android SDK is absent rather than failing hard.

Design ideas

podium-mcp is built around a few deliberate principles:

One podium, one connection. A single server fronts every mobile capability — device, UI, flows, capture, debugging, and WebView inspection — so an agent configures one endpoint and discovers all 33 tools at once, instead of stitching together several servers.
Safe by construction. Every external command runs through an execFile layer with an explicit argument array — never a shell string — so tool inputs (udids, paths, selectors, flow YAML) can't be interpreted as commands.
Never crash the conductor. Tools return structured results and errors instead of throwing; one bad call can't take the server down.
Degrade, don't fail. A missing toolchain (e.g. Android's adb) yields an informative result rather than a hard error.
Resilient automation. Flaky simulator drivers are retried with backoff, and every result reports exactly what happened (including the retry count).
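Two of these principles, "never crash the conductor" and "degrade, don't fail", can be sketched as a wrapper around tool handlers. The `content` and `isError` fields follow the MCP tool result shape; the wrapper `safeTool`, the helper names, and the two demo handlers are hypothetical and do not reflect the code in lib/result.ts.

```typescript
// MCP tool results carry text content plus an optional isError flag.
interface ToolResult {
  content: Array<{ type: "text"; text: string }>;
  isError?: boolean;
}

function okResult(data: unknown): ToolResult {
  return { content: [{ type: "text", text: JSON.stringify(data) }] };
}

function errorResult(tool: string, err: unknown): ToolResult {
  const message = err instanceof Error ? err.message : String(err);
  return {
    content: [{ type: "text", text: JSON.stringify({ tool, error: message }) }],
    isError: true,
  };
}

// Hypothetical wrapper: whatever the handler throws becomes a structured error.
function safeTool<A>(tool: string, handler: (args: A) => Promise<unknown>) {
  return async (args: A): Promise<ToolResult> => {
    try {
      return okResult(await handler(args));
    } catch (err) {
      return errorResult(tool, err);
    }
  };
}

// Degrade, don't fail: a missing adb is reported as data rather than thrown.
const deviceList = safeTool("device_list", async (host: { hasAdb: boolean }) => ({
  android: { available: host.hasAdb },
}));

// A failing command surfaces as isError: true, and the server keeps running.
const appLaunch = safeTool("app_launch", async (_args: { udid: string }) => {
  throw new Error("simctl launch exited with code 1");
});
```

With every handler registered through one wrapper like this, an unexpected exception in a single tool stays confined to that call's result.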

How to use it, in order

1. podium_health: confirm xcrun and maestro are available on the host.
2. device_list: pick a booted simulator udid.
3. Read state: app_list, app_state, screen_size, orientation_get.
4. Drive the device: app_launch, then tap_on / input_text / swipe / press_key, plus set_location and orientation_set.
5. Author end-to-end checks: inspect_screen to discover element text/ids, then run_flow (inline YAML or a .maestro file).
6. Capture and debug: screenshot, or record_start → record_stop for video; metro_logs for live RN console output; crash_list / crash_get for diagnostics.
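For the "author end-to-end checks" step, an inline flow for run_flow might be assembled like this. The Maestro commands used (`launchApp` with `stopApp: false`, `tapOn`, `assertVisible`) are standard flow syntax; the bundle id, the udid placeholder, the element texts, and the helper names are illustrative only.

```typescript
// Inputs for a tiny smoke flow; text values should be copied from inspect_screen output.
interface SmokeFlow {
  appId: string;
  tapText: string;
  expectVisible: string;
}

// Quote a value as a YAML double-quoted scalar (a JSON string is valid YAML).
function yamlString(value: string): string {
  return JSON.stringify(value);
}

// Build inline Maestro YAML: config header, "---", then the command list.
function buildFlowYaml(flow: SmokeFlow): string {
  return [
    `appId: ${yamlString(flow.appId)}`,
    "---",
    "- launchApp:",
    "    stopApp: false", // foreground the app without restarting it
    `- tapOn: ${yamlString(flow.tapText)}`,
    `- assertVisible: ${yamlString(flow.expectVisible)}`,
    "",
  ].join("\n");
}

// Arguments for run_flow in its inline-yaml form (udid + yaml per the tool table).
const runFlowArgs = {
  udid: "<booted-simulator-udid>",
  yaml: buildFlowYaml({
    appId: "com.example.app",
    tapText: "Sign in",
    expectVisible: ".*Welcome.*",
  }),
};
```

The `.*Welcome.*` pattern is wrapped on both sides because the text matcher compares against the full string.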

Contributing

Contributions are welcome — see CONTRIBUTING.md and our Code of Conduct. Use the issue templates for bugs and feature requests.

Security

Please report vulnerabilities privately per SECURITY.md — do not open a public issue.