unbrowser by Unchained

Lightweight MCP browser for LLM agents: no Chrome, low-token BlockMaps, stateful sessions, JavaScript execution, forms, cookies, and escalation hints.

unbrowser

Web access for LLM agents. One static binary. No Chrome.

unbrowser is the lightweight open-source browser tier from Unchained: cheap, stateful web access for agents when curl/WebFetch is too dumb and full Chrome is too heavy. When a page needs real Chrome, cookies, extensions, or human-in-the-loop auth, escalate to unchainedsky-cli or Unchained.

Try it hosted: Unchained exposes a public Streamable HTTP MCP endpoint at https://unchainedsky.com/unbrowser-mcp for discovery and smoke tests. Glama also runs a hosted MCP release at glama.ai/mcp/servers/protostatis/unbrowser, and the Smithery page is at smithery.ai/servers/protostatis-dev/unbrowser. These hosted endpoints are shared infrastructure: do not send private cookies, secrets, or authenticated browsing tasks through them. For production workflows, install the local binary below so sessions and cookies stay on your machine.

Install

Python (recommended) — wheel ships the native binary. Requires Python 3.10+:

pipx install pyunbrowser   # cleanest on macOS Homebrew / modern Linux (handles PEP 668)
pip  install pyunbrowser   # in a venv on python3.10+

macOS gotcha: the system /usr/bin/python3 is 3.9, and the wheel will reject it with "requires Python >=3.10". Use Homebrew's python3.13 or pipx (which manages its own Python). If pip install fails with PEP 668 ("externally-managed-environment"), that is a separate restriction on system-managed Pythons, and pipx install pyunbrowser is again the right call.

# Note: the pip name is pyunbrowser, but the import is unbrowser.
# (PyPI's name moderation blocks 'unbrowser'; the py- prefix is the standard workaround.)
from unbrowser import Client
with Client() as ub:
    r = ub.navigate("https://news.ycombinator.com")

Cargo — binary only, no Python wrapper:

cargo install unbrowser
unbrowser --mcp

MCP — add the binary to Claude Code, Claude Desktop, Cursor, Cline, or any MCP host:

{
  "mcpServers": {
    "unchained": {
      "command": "unbrowser",
      "args": ["--mcp"]
    }
  }
}

The unchained key is only the client-side alias. Use unbrowser if you want exact naming, or keep unchained as the breadcrumb to the full Unchained browser-agent stack.

Hosted MCP smoke/discovery endpoint — for MCP clients that support Streamable HTTP:

{
  "mcpServers": {
    "unbrowser-hosted": {
      "url": "https://unchainedsky.com/unbrowser-mcp"
    }
  }
}

Use this hosted route to inspect tools or run public-page smoke tests. It is intentionally unauthenticated and SSRF-guarded, and it is not a place to replay private cookies or secrets.

Pre-built tarball — for systems without Python or Rust:

# macOS Apple Silicon
curl -L https://github.com/protostatis/unbrowser/releases/latest/download/unbrowser-aarch64-apple-darwin.tar.gz | tar xz
# macOS Intel
curl -L https://github.com/protostatis/unbrowser/releases/latest/download/unbrowser-x86_64-apple-darwin.tar.gz | tar xz
# Linux x86_64 (glibc 2.31+ / Ubuntu 20.04+)
curl -L https://github.com/protostatis/unbrowser/releases/latest/download/unbrowser-x86_64-unknown-linux-gnu.tar.gz | tar xz

From source:

cargo build --release   # binary at ./target/release/unbrowser

Session CLI

For shell-only agents, use a persistent session instead of heredoc JSON-RPC:

unbrowser session start --id demo
unbrowser exec demo navigate https://news.ycombinator.com
unbrowser exec demo query '.titleline > a'
unbrowser exec --pretty demo blockmap
unbrowser session stop demo

Bare RPC (low-level escape hatch)

echo '{"id":1,"method":"navigate","params":{"url":"https://news.ycombinator.com"}}' | unbrowser

That's the install. Runs anywhere a static binary runs — laptop, Lambda, Cloudflare Workers, edge, embedded.

Open source under Apache 2.0. When the cheap path can't handle a page (heavy SPAs, behavioral bot challenges), escalate to a real browser via unchainedsky-cli (drives your local Chrome via CDP) or the Unchained desktop app.


By the numbers

| | This binary | Headless Chrome (Playwright/Puppeteer) |
|---|---|---|
| Binary size | ~10MB | 250MB+ Chrome download |
| RAM / session | ~50MB | 200–500MB |
| Cold start | ~100ms | ~1s |
| Tokens / page (LLM) | ~500 (BlockMap inline) | tens of thousands of HTML, parsed by you |
| Install steps | cargo build | install Chrome + Node + Playwright + system deps |
| Lambda / Workers / edge | | ❌ Chrome too big |
| 100K pages/day cost | $0 (your infra) | $$$ Chrome fleet or hosted API |

Roughly 4–10× lower memory, 25× smaller binary, 10× faster cold start, and about 70× lower per-page token cost. That is the tradeoff this product makes: defer JS rendering (Phase 4/5) and pixel rendering (out of scope) in exchange for a footprint that fits in places Chrome doesn't.
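As a quick sanity check, the headline ratios can be recomputed from the table. The sketch below hard-codes the table's approximate figures (taking the ~1s Chrome cold start as 1000 ms); it is plain arithmetic, not a benchmark, and the dictionary names are illustrative only.

```python
# Approximate figures copied from the table above (not measurements made here).
UNBROWSER = {"binary_mb": 10, "ram_mb": 50, "cold_start_ms": 100, "tokens_per_page": 500}
CHROME = {"binary_mb": 250, "ram_mb": (200, 500), "cold_start_ms": 1000}

def ratio(chrome_value, unbrowser_value):
    """How many times larger the Chrome figure is."""
    return chrome_value / unbrowser_value

binary_ratio = ratio(CHROME["binary_mb"], UNBROWSER["binary_mb"])
ram_ratio = tuple(ratio(v, UNBROWSER["ram_mb"]) for v in CHROME["ram_mb"])
cold_start_ratio = ratio(CHROME["cold_start_ms"], UNBROWSER["cold_start_ms"])
# The table gives no exact Chrome-side token count; a 70x ratio implies this many.
implied_html_tokens = 70 * UNBROWSER["tokens_per_page"]

print(binary_ratio, ram_ratio, cold_start_ratio, implied_html_tokens)
```

From the table's own figures, the memory ratio spans 4× to 10×, and the 70× token figure corresponds to about 35,000 tokens of raw HTML per page.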

Agent-friendly by design

This isn't a Chrome wrapper that an agent uses through a Puppeteer-shaped abstraction. It's a browser whose every output is shaped for LLM consumption:

  • navigate returns a BlockMap — ~500 tokens of structured page summary (landmarks, headings, interactives, density signals) right in the response. No follow-up call needed to know what's on the page.
  • Stable element refs (e:142) — query, click, type, submit using opaque handles. The LLM never has to scrape the DOM itself.
  • challenge field on every blocked navigate — provider, confidence, and the exact clearance cookie name. The agent reacts intelligently instead of guessing.
  • density.likely_js_filled heuristic — distinguishes "real SSR page" from "SSR shell with JS-filled cells" (the CNBC trap). The agent bails before burning round-trips on a page it can't read.
  • MCP-native: unbrowser --mcp exposes the RPC tool surface to any MCP host (Claude Code, Claude Desktop, Cursor, Cline). 4 lines of config, zero glue code.
  • Real Chrome fingerprint (Chrome 134 JA4 + Akamai H2 hash) so sites don't block you for being a script.

For pages that do need real Chrome (heavy SPAs, JS-challenge bot walls), the binary detects them and accepts cookies via cookies_set — so you solve once in Chrome and replay forever here.

Quick demo — Hacker News top 3

from unbrowser import Client

with Client() as ub:
    ub.navigate("https://news.ycombinator.com")
    for s in ub.query(".titleline > a")[:3]:
        print(s["text"], s["attrs"]["href"])

5 lines, no headless browser install. Output is structured JSON, not 35KB of HTML. The Client wrapper handles subprocess lifecycle (atexit reaper so orphans are impossible), JSON-RPC framing, and surfaces real exceptions instead of silent result lookups.

Bare-RPC version (if you can't use Python)

The same demo without the wrapper — useful for languages other than Python or multi-step sessions. The protocol is JSON-RPC over stdin/stdout, one JSON object per line:

import subprocess, json
p = subprocess.Popen(["./target/release/unbrowser"],
    stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True, bufsize=1)
i = 0
def call(method, **params):
    global i; i += 1
    p.stdin.write(json.dumps({"id": i, "method": method, "params": params}) + "\n")
    p.stdin.flush()
    return json.loads(p.stdout.readline())["result"]

call("navigate", url="https://news.ycombinator.com")
for s in call("query", selector=".titleline > a")[:3]:
    print(s["text"], s["attrs"]["href"])

That's the entire protocol surface. Same shape from any language with subprocess + JSON.
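The 13-line client above indexes ["result"] directly, so a failed call would surface as a KeyError. A slightly more defensive framing layer might look like the sketch below. It assumes, without confirmation from the binary's documentation, that failures arrive under an "error" key and that responses echo the request "id", as in JSON-RPC 2.0; RpcError, encode_request, and decode_response are hypothetical names.

```python
import json

class RpcError(Exception):
    """Raised when a response line carries no usable result."""

def encode_request(request_id, method, **params):
    """Serialize one request as a single newline-terminated JSON line."""
    return json.dumps({"id": request_id, "method": method, "params": params}) + "\n"

def decode_response(line, expected_id):
    """Parse one response line and return its result, or raise RpcError."""
    if not line:
        raise RpcError("process closed stdout before replying")
    msg = json.loads(line)
    # Assumption: failures arrive under an "error" key, as in JSON-RPC 2.0.
    if "error" in msg:
        raise RpcError(str(msg["error"]))
    # Assumption: the response echoes the id of the request it answers.
    if msg.get("id") != expected_id:
        raise RpcError(f"expected id {expected_id}, got {msg.get('id')}")
    if "result" not in msg:
        raise RpcError("response has neither result nor error")
    return msg["result"]
```

With these helpers, call() would write encode_request(i, method, **params) to stdin and return decode_response(p.stdout.readline(), i).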

One-shot CLI

For shell-friendly calls, use the convenience subcommand:

unbrowser navigate https://news.ycombinator.com --json

That prints one JSON result and exits from any install path (PyPI wheel, Cargo, or release tarball). Use JSON-RPC only when you need a persistent session. Run unbrowser --help for the native CLI surface.

A/B runtime shims

For corpus tests against JS-heavy pages, compare the default stable shims with the opt-in enhanced browser-environment shims:

unbrowser navigate https://example.com --exec-scripts --json
unbrowser navigate https://example.com --exec-scripts --json --shims enhanced
# or for JSON-RPC / MCP sessions:
UNBROWSER_SHIMS=enhanced unbrowser

enhanced adds content-positive layout/media/scroll/IndexedDB guesses on top of the stable runtime. It is intentionally opt-in so A/B runs can measure whether more page state materializes without changing the baseline.

Script evaluation is still bounded by UNBROWSER_SCRIPT_EVAL_BUDGET_MS (default 5000); navigate results report scripts.budget_exhausted and scripts.budget_skipped when the budget stops further script execution. The outer RPC watchdog (UNBROWSER_TIMEOUT_MS, default 30000) still wins if it is lower than the script budget.
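The interaction between the two limits can be sketched as a small calculation. This is an illustration of the rule as described, not the binary's code: the environment variable names and defaults come from the text, the 1s..10min clamp is the one stated under "Honest limits", and falling back to the default on unparsable input is an assumption.

```python
import os

DEFAULT_SCRIPT_BUDGET_MS = 5000
DEFAULT_TIMEOUT_MS = 30000
# Clamp range for the watchdog as documented under "Honest limits" (1s..10min).
MIN_TIMEOUT_MS, MAX_TIMEOUT_MS = 1_000, 600_000

def _read_ms(env, name, default):
    """Read a millisecond value; falling back on bad input is an assumption."""
    try:
        return int(env.get(name, default))
    except (TypeError, ValueError):
        return default

def effective_budgets(env=None):
    """Return (script_budget_ms, watchdog_ms) after applying the rules above."""
    env = os.environ if env is None else env
    watchdog = _read_ms(env, "UNBROWSER_TIMEOUT_MS", DEFAULT_TIMEOUT_MS)
    watchdog = max(MIN_TIMEOUT_MS, min(MAX_TIMEOUT_MS, watchdog))
    script = _read_ms(env, "UNBROWSER_SCRIPT_EVAL_BUDGET_MS", DEFAULT_SCRIPT_BUDGET_MS)
    # The outer watchdog wins whenever it is lower than the script budget.
    return min(script, watchdog), watchdog
```

For example, setting UNBROWSER_TIMEOUT_MS=2000 while leaving the script budget at its default would leave scripts at most 2000 ms under this model.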

For a JSONL corpus sweep:

python3 scripts/shim_ab.py --url https://nextjs.org/docs --url https://www.npmjs.com/package/playwright

SPA tier — what works, what doesn't

Empirical, not aspirational. Latest matrix: 28/30 on tested categories.

| Page tier | Coverage | What to expect |
|---|---|---|
| Static + SSR (Wikipedia, MDN, news, docs, GitHub repo browsing, search engines, archive.org) | ✅ excellent | sub-second navigate; full BlockMap; all selectors work; ~hundreds of tokens vs ~tens of KB raw |
| SSR + light hydration (Next.js docs, marketing pages, react.dev's static content) | ✅ usable | reads SSR'd content fine; hydration adds nothing but doesn't break either |
| Bot-walled with cookie handoff (Zillow, Cloudflare-protected sites) | ✅ via cookies_set | solve once in Chrome, replay forever; challenge.provider field tells the agent which vendor |
| Module-loader SPAs (Ember, AMD apps like crates.io) | ⚠️ partial with exec_scripts: true | bundles fetch + execute, modules register, but framework auto-mount needs case-by-case shimming |
| Heavy React/Vue bundles (react.dev runtime, large dashboard apps) | ⚠️ bounded: won't hang, won't render | with exec_scripts: true the navigate completes inside the 30s wall-clock budget (5s for the script-eval phase, the rest for settle); rendered DOM may not materialize. Tune via UNBROWSER_TIMEOUT_MS |
| Apps requiring Workers / Canvas / IndexedDB / WebGL | ❌ out of scope by design | use the cookie-handoff path with real Chrome via unchainedsky-cli (CDP) or the Unchained desktop app |
| Hardest-tier anti-bot (PerimeterX with behavioral, Kasada, Akamai BMP advanced) | ❌ even cookie handoff is fragile | real Chrome via CDP is the right tier |

Vs the alternatives:

| | This | curl | Playwright / headless Chrome |
|---|---|---|---|
| Static / SSR pages | | ✅ but token-heavy | overkill |
| SPA-shell sites | ⚠️ partial via exec_scripts | | |
| Bot-walled (with cookie handoff) | | | |
| Run in Lambda / Workers / edge | | | ❌ Chrome too big |
| Per-page cost at 100K/day | ~free | ~free | $$$ |
| LLM-shaped output | ✅ BlockMap inline | DIY parse | DIY parse |

Verified against (working)

Concrete sites tested with measured times. Cold-start to extracted-result.

| Category | Sites | Time |
|---|---|---|
| Reference / docs | Wikipedia, MDN, docs.rs, PyPI, react.dev (SSR portion) | 0.9 – 5.8s |
| News | Hacker News, BBC, TechCrunch, arXiv listings | 1 – 1.6s |
| Search | Google /search, Bing, Brave, DuckDuckGo (html) | 0.2 – 1.8s |
| Dev | GitHub repo pages, npm, StackOverflow, HuggingFace model cards | 0.7 – 2.4s |
| Crypto / finance | CoinGecko, Yahoo Finance (post-redirect-fix) | 3.5 – 6.9s |
| Social | Lobsters, old.reddit.com | 0.9 – 1.4s |
| Govt / institutional | arXiv, archive.org, gov.uk | 0.6 – 1.0s |
| Interaction primitives | type, click + auto-follow, cookies_set/get/replay, eval, query_text | 0.3 – 1.3s |

Surprises: all four major search engines work cleanly. CoinGecko's heavy dashboard SSRs enough that quotes come through. HuggingFace model cards expose model name in <h1>.

Bot-detection diagnostics

Every blocked navigate returns a challenge field naming the vendor (perimeterx_block, cloudflare_turnstile, aws_waf, datadome, akamai_bmp, imperva, arkose_labs, recaptcha, press_hold, yahoo_sad_panda, interstitial, generic_human_verification, unknown_block) plus the expected clearance cookie name. Agents react with cookie handoff via cookies_set instead of guessing.
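An agent-side reaction to the challenge field might be sketched as follows. The "provider" key is the one named elsewhere in this README (challenge.provider); treating an unblocked result as having a missing or null challenge, and the grouping of providers into a hard tier, are illustrative assumptions. plan_for and HARD_TIER are hypothetical names.

```python
# Providers the document singles out as hardest-tier, where cookie handoff is
# fragile. The grouping itself is an illustrative assumption.
HARD_TIER = {"perimeterx_block", "akamai_bmp"}

def plan_for(result):
    """Map a navigate result (a dict) to a next step for the agent."""
    challenge = result.get("challenge")
    if not challenge:
        return "read"                      # not blocked: query the page
    provider = challenge.get("provider", "unknown_block")
    if provider in HARD_TIER:
        return "escalate_to_chrome"        # real Chrome via CDP
    return "cookie_handoff"                # solve once, replay via cookies_set
```

A real router would probably also weigh the reported confidence before choosing between the two blocked branches.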

Local cookie solver service

For fully transparent cookie handoff, run the local-only solver service backed by unchainedsky-cli:

pip install 'pyunbrowser[solver]'  # or: pip install unchainedsky-cli
python scripts/cookie_service.py --headless --profile unbrowser-cookie-service
export UNBROWSER_COOKIE_SERVICE_URL=http://127.0.0.1:8765

Then use scripts/router.py (or RouterConfig(cookie_service_url=...)) as the agent-facing entry point. On a blocked navigate the router will:

detect challenge -> call local service -> Chrome obtains cookies -> cookies_set -> retry once

The service exposes GET /.well-known/unbrowser-cookie-solver and POST /solve, supports the same challenge providers as navigate.challenge, and returns only cookies from the user's local Chrome/unchained session. It does not fabricate challenge tokens. Keep it bound to 127.0.0.1; non-loopback binds are rejected unless --allow-remote-bind is passed because /solve is unauthenticated and can return browser cookies. Use --allow-host for domain allowlisting when desired, and use --no-headless --stealth for sites that reject headless Chrome. Chrome persists across solves by default for the standalone service; pass --no-keep-chrome for one-shot use. Solves are serialized per service process because a service instance owns one CDP port/profile pair.
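The detect, solve, replay, retry-once flow can be sketched with injected callables so it runs without the binary or the service. This is not the code in scripts/router.py; navigate, solve, and cookies_set are caller-supplied stand-ins, and the shape of the cookie list is an assumption.

```python
def navigate_with_handoff(navigate, solve, cookies_set, url):
    """Navigate, and on a challenge fetch cookies, replay them, and retry once.

    navigate(url) -> result dict; solve(url, challenge) -> list of cookies;
    cookies_set(cookies) -> None. All three are caller-supplied stand-ins.
    """
    result = navigate(url)
    challenge = result.get("challenge")
    if not challenge:
        return result
    cookies = solve(url, challenge)
    if not cookies:
        return result          # nothing to replay: surface the blocked result
    cookies_set(cookies)
    return navigate(url)       # exactly one retry, even if still blocked
```

Capping the loop at one retry keeps a site that never clears from triggering repeated Chrome solves.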

When installed from the Python package, the same pieces are bundled behind the console wrapper:

unbrowser cookie-service --headless --profile unbrowser-cookie-service
unbrowser router https://example.com/protected

unbrowser router also auto-starts the local cookie service on first challenge when unchained is available and UNBROWSER_COOKIE_SERVICE_URL is not set. --allow-host example.com allows example.com and its subdomains only; broad single-label suffixes like com are rejected. Without an allowlist, the service rejects private/reserved IPs, localhost, and internal single-label hosts by default; use --allow-host to opt in to a specific internal host for local testing. Router refuses non-loopback UNBROWSER_COOKIE_SERVICE_URL values by default because it posts target URLs and challenge metadata to that service; pass --allow-remote-cookie-service only for a trusted remote solver.
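The host rules above can be illustrated with a small predicate. This is one reading of the description, not the service's actual implementation: in particular, letting a single-label --allow-host entry match only that exact host (so com never widens to every .com site) is an assumption, and host_allowed is a hypothetical name.

```python
import ipaddress

def host_allowed(host, allow_hosts=()):
    """Decide whether a target host passes the rules described above."""
    host = host.lower().rstrip(".")
    for allowed in allow_hosts:
        allowed = allowed.lower().rstrip(".")
        if "." not in allowed:
            # Assumption: a single-label entry matches only itself, never a suffix.
            if host == allowed:
                return True
            continue
        if host == allowed or host.endswith("." + allowed):
            return True
    if allow_hosts:
        return False               # an allowlist is set and nothing matched
    # Default policy without an allowlist.
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        ip = None
    if ip is not None:
        return not (ip.is_private or ip.is_reserved
                    or ip.is_loopback or ip.is_link_local)
    return host != "localhost" and "." in host   # reject single-label hosts
```

Matching on "." + allowed rather than a bare suffix is what keeps notexample.com from passing an example.com allowlist.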

SPA-detection diagnostics

Every navigate's blockmap.density field signals SPA-ness so agents bail before wasting round-trips:

  • thin_shell: true — page is < 4KB body text with no headings or interactives (typical React/Ember root). For HTTP errors (status >= 400), shell signals are suppressed and http_error_status is attached so a 404 is not mistaken for an SPA.
  • likely_js_filled: true — table/list/cell shells are empty, or the page has many scripts with little visible UI (CNBC / YouTube-class trap)
  • json_scripts: N — count of <script type="application/json"> (often holds the data the JS would render — try eval() on those before escalating)
  • script_heavy_shell: true — many scripts, little text, few links; usually browser-rendered UI rather than useful SSR
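A minimal triage over these signals might look like the sketch below. The field names are the ones listed above; the action labels, the precedence among signals, and the assumption that http_error_status is simply absent on successful responses are illustrative choices.

```python
def triage(density):
    """Suggest a next step from a blockmap.density dict."""
    if "http_error_status" in density:
        return "report_http_error"     # e.g. a 404, not an SPA shell
    shell = (density.get("thin_shell") or density.get("likely_js_filled")
             or density.get("script_heavy_shell"))
    if not shell:
        return "read"
    if density.get("json_scripts", 0) > 0:
        return "try_embedded_json"     # inspect the JSON scripts before escalating
    return "retry_with_exec_scripts_or_escalate"
```

Checking json_scripts before escalating mirrors the advice above: the data the page would render is often already present in an embedded JSON script.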

Three ways agents talk to it

Session CLI (persistent shell workflow)

When an agent can only shell out but needs incremental state, start a local daemon-backed session. Cookies, DOM, JS globals, and element refs persist until stop:

unbrowser session start --id golf
unbrowser exec golf navigate https://news.ycombinator.com
unbrowser exec golf query '.titleline > a'
unbrowser exec --pretty golf blockmap
unbrowser exec golf eval 'document.title'
unbrowser session stop golf

session exec and the shorter exec alias accept either shorthand args for common methods or a raw JSON params object:

unbrowser exec golf navigate https://example.com --exec-scripts
unbrowser exec golf query_debug '.product-card' --limit 5
unbrowser exec golf extract_cards '{"kind":"product","limit":20}'
unbrowser session prune   # remove dead sockets
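For agents that shell out from Python, the two calling styles can be captured in a small argv builder. The sketch below only assembles the command lines shown above and does not run them; exec_argv is a hypothetical helper, and placing --pretty before the session id simply copies the examples.

```python
import json

def exec_argv(session_id, method, *args, pretty=False, params=None):
    """Build the argv for `unbrowser exec`, using shorthand args or a JSON object."""
    argv = ["unbrowser", "exec"]
    if pretty:
        argv.append("--pretty")        # placed before the session id, as in the examples
    argv += [session_id, method]
    if params is not None:
        # Raw JSON params object, passed as a single argument.
        argv.append(json.dumps(params, separators=(",", ":")))
    else:
        argv += [str(a) for a in args]
    return argv
```

The resulting list can be handed to subprocess.run(argv, capture_output=True, text=True), which avoids shell quoting problems with selectors such as '.titleline > a'.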

MCP (no glue)

{
  "mcpServers": {
    "unchained": {
      "command": "unbrowser",
      "args": ["--mcp"]
    }
  }
}

Tools are auto-discovered by Claude Code, Claude Desktop, Cursor, Cline.

Subprocess (custom runtimes)

13 lines of Python (above). Or any language with subprocess + JSON.

Auto-escalation router (scripts/router.py)

from scripts.router import Router, RouterConfig, cached_cookies_solver

with Router(RouterConfig(
    binary="./target/release/unbrowser",
    chrome_solver=cached_cookies_solver("cookies.json"),
)) as r:
    r.navigate("https://www.zillow.com/homes/for_rent/")  # auto-handles 403 + cookie replay

Live event watcher (scripts/watch.py)

The binary emits NDJSON events (ready, navigate, challenge) on stderr. Pipe them through watch.py for color-coded one-liners:

unbrowser 2> >(python3 scripts/watch.py)
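A consumer other than watch.py could read the same stream with a few lines of Python. This sketch is not watch.py itself: the text only names the event kinds (ready, navigate, challenge), so the "event" key used to tell them apart is an assumption, as are the function names.

```python
import json

def parse_events(lines):
    """Yield one dict per well-formed NDJSON line, skipping anything else."""
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue
        try:
            event = json.loads(raw)
        except json.JSONDecodeError:
            continue                   # stderr may also carry plain-text logs
        if isinstance(event, dict):
            yield event

def count_by_kind(events, key="event"):
    """Tally events by kind; the "event" key name is an assumption."""
    counts = {}
    for e in events:
        kind = str(e.get(key, "unknown"))
        counts[kind] = counts.get(kind, 0) + 1
    return counts
```

In practice the lines argument would be the stderr pipe of the unbrowser process, read line by line.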

RPC methods

| Method | Description |
|---|---|
| navigate {url} | fetch + parse + return {status, url, bytes, headers, blockmap, challenge, tool_confidence, tool_margin, tool_likelihoods, tool_recommendations} |
| query {selector} | CSS query → [{ref, tag, attrs, text, text_chars, text_truncated}] |
| query_debug {selector, limit?} | explain selector misses: match count, sample matches, DOM summary, top tags/classes/data attrs/ids, and hints like selector_miss, thin_shell, embedded_json |
| text {selector?} | textContent of FIRST match (default body). On Wikipedia/MDN/news sites the first <p> is often a hatnote; prefer text_main for article body. |
| text_main | textContent of <main> / [role=main] / single <article> / longest non-chrome subtree. Use this for reading article/docs/blog content. |
| discover {url?, goal?, exec_scripts?, same_origin?, include_network?, limit?, debug?} | Cheap-first information discovery. Merges DOM routes, inferred form/query URLs, and network JSON routes into one ranked graph with provenance and escalation hints. Defaults to static discovery; set exec_scripts: true when fetch-visible routes are insufficient. |
| extract_cards {selector?, limit?, kind?} | auto-detect repeated product/listing/article cards and return normalized fields including title, price, condition, url, availability, snippet, meta, image_alt, score |
| extract_table {selector} / table_to_json {selector?} | normalize an HTML table into headers, rows, and row count. table_to_json defaults to the first table. |
| click {ref} | dispatch click; auto-follows <a href> (returns {status, url, bytes, headers, blockmap, challenge}, the same shape as navigate) |
| type {ref, text} | set value + dispatch input/change events |
| submit {ref} | gather form fields and navigate. Supports GET and application/x-www-form-urlencoded POST; multipart is not supported. |
| eval {code} | run JS in embedded QuickJS. Raw JSON-RPC also accepts script or expression aliases and now errors instead of silently returning null when no code-like param is present. |
| cookies_set / cookies_get / cookies_clear | session jar |
| blockmap | recompute the page summary |
| body | raw HTML of last navigation |

blockmap.selectors surfaces concrete selector hints for the current page (data-testid, aria-label, role) so agents can bias toward query or query_text without guessing.

discover is the route-finding layer to use before extraction when you need to learn where information lives. By default it returns compact navigate_summary, route_discover_summary, and network_extract_summary fields plus the merged routes, forms, api_endpoints, network_sources, and escalations. Pass debug: true only when you need the full nested navigate, route_discover, and network_extract payloads for diagnosis. limit must be between 1 and 200; invalid url / limit inputs fail at the RPC boundary.

{"jsonrpc":"2.0","id":1,"method":"discover","params":{"url":"https://example.com","goal":"find pricing docs api status","same_origin":true,"limit":25}}

Use exec_scripts: true as an opt-in second pass for pages whose static HTML does not expose enough routes. In that mode, routes already present before scripts are labeled static_dom; routes that only appear after JavaScript/timers/fetches are labeled js_dom.
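Since invalid url and limit inputs fail at the RPC boundary, a client may prefer to reject them before sending. The builder below is a hypothetical client-side helper that produces the same request shape as the example above; requiring an absolute http(s) URL is an assumption about what counts as valid, and url is made mandatory here even though the method treats it as optional.

```python
import json
from urllib.parse import urlsplit

def discover_request(request_id, url, goal=None, same_origin=True, limit=25,
                     exec_scripts=False):
    """Build one JSON-RPC line for discover, rejecting bad url/limit early."""
    parts = urlsplit(url)
    # Assumption: only absolute http(s) URLs are accepted.
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"not an absolute http(s) URL: {url!r}")
    if isinstance(limit, bool) or not isinstance(limit, int) or not 1 <= limit <= 200:
        raise ValueError("limit must be an integer between 1 and 200")
    params = {"url": url, "same_origin": same_origin, "limit": limit}
    if goal:
        params["goal"] = goal
    if exec_scripts:
        params["exec_scripts"] = True   # opt-in second pass
    return json.dumps({"jsonrpc": "2.0", "id": request_id,
                       "method": "discover", "params": params})
```

A static first pass followed by a second call with exec_scripts=True matches the cheap-first order described above.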

CSS selector engine: tag, id, class, [attr=val] (also ^=, $=, *=, ~=), all four combinators (descendant whitespace, >, +, ~), :first/last/nth-child/of-type including An+B formulas, :only-child/of-type, :not(), and :has().
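For readers unfamiliar with An+B formulas, the standard CSS rule is that a 1-based sibling index matches when it equals a·n + b for some integer n ≥ 0. The function below illustrates that rule in isolation; it is not the engine's code, and matches_an_plus_b is a hypothetical name.

```python
def matches_an_plus_b(a, b, index):
    """True if the 1-based sibling index equals a*n + b for some integer n >= 0."""
    if a == 0:
        return index == b
    n, remainder = divmod(index - b, a)
    return remainder == 0 and n >= 0

items = ["one", "two", "three", "four", "five"]
# :nth-child(2n+1) selects the odd positions.
odd = [x for i, x in enumerate(items, 1) if matches_an_plus_b(2, 1, i)]
# :nth-child(-n+3) selects the first three positions.
first_three = [x for i, x in enumerate(items, 1) if matches_an_plus_b(-1, 3, i)]
```

Here odd holds the first, third, and fifth items, and first_three holds the first three.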

When to escalate to real Chrome

This binary is the cheap path. For the cases it can't handle (heavy framework hydration, behavioral bot challenges, Workers/Canvas/IndexedDB), the next tier is a real Chrome instance driven via CDP. Two ways to get there:

| | This binary | unchainedsky-cli | Unchained desktop app |
|---|---|---|---|
| Runs JS | QuickJS (no V8 JIT) | real Chrome via CDP | real Chrome (the user's, with their logins) |
| SPA hydration | partial | | |
| Bot challenges | cookie handoff only | active solving via real browser | manual / interactive |
| Setup | pip install pyunbrowser | pip install unchainedsky-cli | desktop install |
| Audience | agent / pipeline | agent / pipeline | end user |
| Per-page footprint | ~50MB | full Chrome | full Chrome |

The escalation path is a deliberate choice, not an automatic fallback — you ship pyunbrowser for the 80% of pages that work cheap, then route the 20% to unchainedsky-cli (or to a human via the desktop app). The vocabulary (navigate, query, click, cookies_set, BlockMap) is shared so code transfers cleanly.

Honest limits

  • Script execution is opt-in via exec_scripts: true. Default navigate skips it (the SSR/static path is what most agents want). With it on, inline + external <script> tags run in QuickJS — works for many SPAs, but heavy framework bootstraps (Ember, big React) often don't auto-mount because shims can't fake every browser-specific signal. The blockmap's density.likely_js_filled flag tells agents in one call when to escalate instead of burning round-trips.
  • All eval is wall-clock bounded. A 30s watchdog (configurable via UNBROWSER_TIMEOUT_MS, clamped to 1s..10min) covers script execution AND every subsequent settle/microtask/timer callback, so a hostile site can never wedge the binary or strand a CPU-pegged orphan process.
  • Form submit is intentionally narrow. GET and application/x-www-form-urlencoded POST are supported, including checked checkbox/radio values. Multipart upload forms are not supported — construct the request manually via eval or escalate.
  • Hardest-tier bot detection (PerimeterX with behavioral telemetry, advanced Akamai BMP, Kasada) needs the cookie-handoff path. The binary detects and labels the challenge for you, but solving it requires real Chrome (or a token vendor).
  • No screenshots. Out of scope by design.
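To make the form-submit limit above concrete, this is roughly what gathering fields for an application/x-www-form-urlencoded request involves. It is a sketch of the general technique using the standard library, not the binary's implementation; the field dictionary keys (name, value, type, checked) are hypothetical.

```python
from urllib.parse import urlencode

def encode_form(fields):
    """Encode form controls as application/x-www-form-urlencoded.

    `fields` is a list of dicts with hypothetical keys: name, value, type, checked.
    """
    pairs = []
    for f in fields:
        if not f.get("name"):
            continue                                  # unnamed controls are never sent
        if f.get("type") in ("checkbox", "radio") and not f.get("checked"):
            continue                                  # only checked boxes/radios count
        pairs.append((f["name"], f.get("value", "")))
    return urlencode(pairs)

body = encode_form([
    {"name": "q", "value": "rust browser", "type": "text"},
    {"name": "safe", "value": "on", "type": "checkbox", "checked": True},
    {"name": "news", "value": "1", "type": "checkbox", "checked": False},
])
```

For a GET form the encoded string becomes the query string; for a POST it is the request body. A file input has no representation in this encoding, which is why multipart forms need a different path.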

Build

Rust 1.95+ via rustup. On macOS, also brew install cmake ninja (BoringSSL dependency).

cargo build --release

~2 min first build (BoringSSL compiles), instant after.

Architecture in one diagram

JSON-RPC stdin ─┐    ┌─ stdout
                ▼    ▲
         ┌─────────────────────┐
         │ request (Chrome     │   ┌───────────┐    ┌──────────────────┐
         │ TLS+H2 fingerprint) ├──▶│ html5ever ├───▶│ rquickjs +       │
         │                     │   │ parser    │    │ dom.js +         │
         │ cookie_store        │   └───────────┘    │ blockmap.js +    │
         │ (jar)               │                    │ interact.js      │
         └─────────────────────┘                    └──────────────────┘

License

Apache 2.0 — see LICENSE.


For the cases this binary can't handle (heavy framework hydration, behavioral bot challenges, anything needing real Chrome), the next tier is unchainedsky-cli — drives a real Chrome via CDP, same vocabulary. End-users who want a point-and-click agent can skip the CLI entirely and use the Unchained desktop app.
