browser-harness

Direct browser control via CDP. Use it when the user wants to automate, scrape, test, or interact with web pages. Connects to the user's already-running Chrome.

npx skills add https://github.com/browser-use/browser-harness --skill browser-harness

Direct browser control via CDP. You drive the user's real browser with Python helpers run through the browser-harness command.

Prerequisite (one-time — NOT part of the AI workflow)

This skill is instructions only. It assumes the browser-harness command is already on $PATH. If command -v browser-harness fails, do the one-time install in references/install.md first, then continue. Installation and browser-connection setup are a prerequisite; once browser-harness <<'PY' … PY prints page info, never run install/connection steps again as part of normal work.
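
The `command -v browser-harness` check can also be done from Python before handing work to the harness. This is a minimal sketch using only the standard library; the function name `harness_on_path` is made up for illustration and is not part of browser-harness.

```python
import shutil


def harness_on_path(command: str = "browser-harness") -> bool:
    """Return True if `command` resolves to an executable on $PATH."""
    # shutil.which plays the role of `command -v` for executables:
    # it returns the resolved path, or None when nothing is found.
    return shutil.which(command) is not None


if __name__ == "__main__":
    if harness_on_path():
        print("browser-harness found; skip the install steps")
    else:
        print("browser-harness missing; follow references/install.md once")
```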

Usage

browser-harness <<'PY'
new_tab("https://docs.browser-use.com")
wait_for_load()
print(page_info())
PY
  • Invoke as browser-harness — it's on $PATH after install. No cd, no uv run.
  • Use the heredoc form for every multi-line command. It prevents shell quote mangling inside Python strings and JavaScript snippets.
  • First navigation is new_tab(url), not goto_url(url): goto_url runs in the user's active tab and clobbers their work.
  • Helpers are pre-imported and the daemon auto-starts; you never start/stop it manually unless you want to.

What actually works

  • Screenshots first. capture_screenshot() to understand the page, find visible targets, and decide whether you need a click, a selector, or more navigation.
  • Clicking. capture_screenshot() → read the pixel off the image → click_at_xy(x, y) → capture_screenshot() to verify. Suppress the Playwright-habit reflex of "locate first, then click": no getBoundingClientRect, no selector hunt. Drop to DOM only when the target has no visible geometry. Hit-testing happens in Chrome's browser process, so clicks pass through iframes / shadow DOM / cross-origin without extra work.
  • Bulk HTTP. http_get(url) + ThreadPoolExecutor. No browser needed for static pages.
  • After goto_url(url): wait_for_load().
  • Wrong/stale tab: ensure_real_tab().
  • Verification: print(page_info()) is the simplest "is this alive?" check; screenshots are the default way to verify whether a visible action worked.
  • DOM reads: use js(...) for inspection/extraction when a screenshot shows coordinates are the wrong tool.
  • Auth wall: redirected to login → stop and ask the user. Don't type credentials from screenshots.
  • Raw CDP for anything helpers don't cover: cdp("Domain.method", params).
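
The bulk HTTP bullet above can be sketched as follows. `fetch_all` is a hypothetical wrapper, not a harness helper. It takes the fetch function as a parameter so that, inside the harness, the pre-imported `http_get` can be passed in. What `http_get` returns is not specified here, so the sketch simply stores whatever the fetcher gives back.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all(urls, fetch, max_workers=8):
    """Fetch every URL concurrently; return {url: result or exception}."""
    urls = list(urls)  # allow generators; we iterate the URLs twice below

    def safe_fetch(url):
        try:
            return fetch(url)
        except Exception as exc:  # keep one bad URL from sinking the batch
            return exc

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so zip pairs each URL with its result
        return dict(zip(urls, pool.map(safe_fetch, urls)))
```

Inside a browser-harness <<'PY' block this would be called as pages = fetch_all(urls, http_get). Failed fetches come back as exception objects, so the caller can decide whether to retry them or fall back to the browser.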

After every meaningful action, re-screenshot before assuming it worked.
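
The screenshot, click, re-screenshot loop can be written as one small function. This is a sketch under assumptions: `click_and_verify` is a hypothetical name, the helpers are passed in as parameters so the function does not depend on harness globals, and the return value of `capture_screenshot()` is treated as opaque because its type is not documented here.

```python
def click_and_verify(x, y, click, screenshot):
    """Screenshot, click at (x, y), then screenshot again for comparison."""
    before = screenshot()  # look at the page before acting
    click(x, y)            # coordinate click; no selector lookup
    after = screenshot()   # re-screenshot instead of assuming success
    return before, after
```

Inside the harness this would be called as before, after = click_and_verify(412, 230, click_at_xy, capture_screenshot), where 412 and 230 are placeholder coordinates read off the first screenshot.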

Remote / cloud browsers

Use remote for parallel sub-agents (each gets an isolated browser via a distinct BU_NAME) or on a headless server. BROWSER_USE_API_KEY must be set.

browser-harness <<'PY'
start_remote_daemon("work")   # clean cloud browser; profileName=/profileId= to reuse a logged-in profile
PY

BU_NAME=work browser-harness <<'PY'
new_tab("https://example.com")
print(page_info())
PY

start_remote_daemon prints a liveUrl so the user can watch. Running remote daemons bill until timeout.
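
For parallel sub-agents launched from Python rather than a shell, the BU_NAME=work browser-harness <<'PY' form translates to setting BU_NAME in the child environment and feeding the script on stdin. A minimal sketch; `harness_env` and `run_in_browser` are hypothetical names, and the `runner` parameter exists only so the call can be swapped out or tested.

```python
import os
import subprocess


def harness_env(name, base_env=None):
    """Copy the environment and point BU_NAME at one named browser."""
    env = dict(os.environ if base_env is None else base_env)
    env["BU_NAME"] = name
    return env


def run_in_browser(name, script, runner=subprocess.run):
    """Feed `script` to browser-harness on stdin, like the heredoc form."""
    return runner(
        ["browser-harness"],
        input=script,   # the heredoc body
        text=True,
        env=harness_env(name),
        check=True,     # raise if the harness exits non-zero
    )
```

Each sub-agent would call run_in_browser with its own name, which keeps the browsers isolated in the same way distinct BU_NAME values do on the command line.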

Interaction skills (progressive disclosure)

If you struggle with a specific UI mechanic, read the matching file under ${CLAUDE_PLUGIN_ROOT}/interaction-skills/ before inventing an approach. Available: browser-wall, connection, cookies, cross-origin-iframes, dialogs, downloads, drag-and-drop, dropdowns, iframes, network-requests, print-as-pdf, profile-sync, screenshots, scrolling, shadow-dom, tabs, uploads, viewport.

Task-specific edits

For task-specific helper additions, edit ${CLAUDE_PLUGIN_ROOT}/agent-workspace/agent_helpers.py. Keep core helpers short.

Domain skills (opt-in)

Community per-site playbooks live in ${CLAUDE_PLUGIN_ROOT}/agent-workspace/domain-skills/<host>/ and are off by default. Set BH_DOMAIN_SKILLS=1 to enable them; when enabled and the task is site-specific, read every file in the matching <host>/ directory before inventing an approach.
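
The lookup described above can be sketched in a few lines. `domain_skill_files` is a hypothetical helper, and it assumes the directory is named exactly after the URL's hostname; how the real layout treats prefixes such as "www." is not stated here.

```python
import os
from pathlib import Path
from urllib.parse import urlsplit


def domain_skill_files(url, plugin_root, env=None):
    """List playbook files for the URL's host, or [] when disabled/absent."""
    env = os.environ if env is None else env
    if env.get("BH_DOMAIN_SKILLS") != "1":
        return []  # domain skills are off by default
    host = urlsplit(url).hostname  # assumption: directory name == hostname
    if not host:
        return []
    skill_dir = Path(plugin_root) / "agent-workspace" / "domain-skills" / host
    if not skill_dir.is_dir():
        return []
    return sorted(p for p in skill_dir.iterdir() if p.is_file())
```

With the plugin root taken from os.environ["CLAUDE_PLUGIN_ROOT"], an empty result means either the feature is disabled or no playbook exists for that host.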

Design constraints

  • Coordinate clicks are the default. Input.dispatchMouseEvent goes through iframes/shadow/cross-origin at the compositor level.
  • Connect to the user's running Chrome. Don't launch your own browser.
  • Prefer compositor-level actions (screenshots, coordinate clicks, raw key input) over framework/DOM hacks. Reach for interaction-skills/ only when those are the wrong tool.
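
To make the coordinate-click constraint concrete, here is what one click looks like as raw Input.dispatchMouseEvent calls. This is a sketch of the protocol messages only: `mouse_click_events` is a hypothetical name, and nothing here claims to be how click_at_xy is implemented internally. The parameter names (type, x, y, button, clickCount) are those of the CDP method itself.

```python
def mouse_click_events(x, y, button="left"):
    """Build the CDP calls for one click at viewport coordinates (x, y)."""
    base = {"x": x, "y": y, "button": button, "clickCount": 1}
    return [
        # A click is a press followed by a release at the same point.
        ("Input.dispatchMouseEvent", {"type": "mousePressed", **base}),
        ("Input.dispatchMouseEvent", {"type": "mouseReleased", **base}),
    ]
```

Inside the harness each pair could be sent with cdp(method, params). No element handle or selector appears anywhere in the payload, which is why the same call works regardless of which frame or shadow root owns the pixel.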
