remote-browser

Automatización de navegador en la nube para agentes en entorno aislado sin acceso local a GUI. Admite navegación, inspección de páginas, interacciones, ejecución de JavaScript y gestión de cookies mediante más de 30 comandos CLI. Incluye gestión de sesiones en la nube para ejecutar agentes de navegador autónomos en paralelo, con monitoreo de tareas y opciones de salida estructurada. Proporciona tunelización del servidor de desarrollo local mediante Cloudflare para exponer puertos de la máquina remota al navegador en la nube. Mantiene el estado de la sesión entre comandos, permitiendo...

npx skills add https://github.com/browser-use/browser-use --skill remote-browser

Browser Automation for Sandboxed Agents

This skill is for agents running on sandboxed remote machines (cloud VMs, CI, coding agents) that need to control a headless browser.

Prerequisites

browser-use doctor    # Verify installation

For setup details, see https://github.com/browser-use/browser-use/blob/main/browser_use/skill_cli/README.md

Core Workflow

  1. Navigate: browser-use open <url> — starts headless browser if needed
  2. Inspect: browser-use state — returns clickable elements with indices
  3. Interact: use indices from state (browser-use click 5, browser-use input 3 "text")
  4. Verify: browser-use state or browser-use screenshot to confirm
  5. Repeat: browser stays open between commands
  6. Cleanup: browser-use close when done

Browser Modes

browser-use open <url>                                    # Default: headless Chromium
browser-use cloud connect                                 # Provision cloud browser and connect
browser-use --connect open <url>                          # Auto-discover running Chrome via CDP
browser-use --cdp-url ws://localhost:9222/... open <url>  # Connect via CDP URL

Commands

# Navigation
browser-use open <url>                    # Navigate to URL
browser-use back                          # Go back in history
browser-use scroll down                   # Scroll down (--amount N for pixels)
browser-use scroll up                     # Scroll up
browser-use tab list                      # List all tabs with lock status
browser-use tab new [url]                 # Open a new tab (blank or with URL)
browser-use tab switch <index>            # Switch to tab by index
browser-use tab close <index> [index...]  # Close one or more tabs

# Page State — always run state first to get element indices
browser-use state                         # URL, title, clickable elements with indices
browser-use screenshot [path.png]         # Screenshot (base64 if no path, --full for full page)

# Interactions — use indices from state
browser-use click <index>                 # Click element by index
browser-use click <x> <y>                 # Click at pixel coordinates
browser-use type "text"                   # Type into focused element
browser-use input <index> "text"          # Click element, then type
browser-use keys "Enter"                  # Send keyboard keys (also "Control+a", etc.)
browser-use select <index> "option"       # Select dropdown option
browser-use upload <index> <path>         # Upload file to file input
browser-use hover <index>                 # Hover over element
browser-use dblclick <index>              # Double-click element
browser-use rightclick <index>            # Right-click element

# Data Extraction
browser-use eval "js code"                # Execute JavaScript, return result
browser-use get title                     # Page title
browser-use get html [--selector "h1"]    # Page HTML (or scoped to selector)
browser-use get text <index>              # Element text content
browser-use get value <index>             # Input/textarea value
browser-use get attributes <index>        # Element attributes
browser-use get bbox <index>              # Bounding box (x, y, width, height)

# Wait
browser-use wait selector "css"           # Wait for element (--state visible|hidden|attached|detached, --timeout ms)
browser-use wait text "text"              # Wait for text to appear

# Cookies
browser-use cookies get [--url <url>]     # Get cookies (optionally filtered)
browser-use cookies set <name> <value>    # Set cookie (--domain, --secure, --http-only, --same-site, --expires)
browser-use cookies clear [--url <url>]   # Clear cookies
browser-use cookies export <file>         # Export to JSON
browser-use cookies import <file>         # Import from JSON

# Python — persistent session with browser access
browser-use python "code"                 # Execute Python (variables persist across calls)
browser-use python --file script.py       # Run file
browser-use python --vars                 # Show defined variables
browser-use python --reset                # Clear namespace

# Session
browser-use close                         # Close browser and stop daemon
browser-use sessions                      # List active sessions
browser-use close --all                   # Close all sessions

The Python browser object provides: browser.url, browser.title, browser.html, browser.goto(url), browser.back(), browser.click(index), browser.type(text), browser.input(index, text), browser.keys(keys), browser.upload(index, path), browser.screenshot(path), browser.scroll(direction, amount), browser.wait(seconds).

Tunnels

Expose local dev servers to the browser via Cloudflare tunnels.

browser-use tunnel <port>                 # Start tunnel (idempotent)
browser-use tunnel list                   # Show active tunnels
browser-use tunnel stop <port>            # Stop tunnel
browser-use tunnel stop --all             # Stop all tunnels

Command Chaining

Commands can be chained with &&. The browser persists via the daemon, so chaining is safe and efficient.

browser-use open https://example.com && browser-use state
browser-use input 5 "[email protected]" && browser-use input 6 "password" && browser-use click 7

Chain when you don't need intermediate output. Run separately when you need to parse state to discover indices first.

Common Workflows

Exposing Local Dev Servers

python -m http.server 3000 &                      # Start dev server
browser-use tunnel 3000                            # → https://abc.trycloudflare.com
browser-use open https://abc.trycloudflare.com     # Browse the tunnel

Tunnels are independent of browser sessions and persist across browser-use close.

Multi-Agent (--connect mode)

Multiple agents can share one browser via --connect. Each agent gets its own tab — other agents can't interfere.

Setup: Register once, then pass the index with every --connect command:

INDEX=$(browser-use register)                    # → prints "1"
browser-use --connect $INDEX open <url>          # Navigate in agent's own tab
browser-use --connect $INDEX state               # Get state from agent's tab
browser-use --connect $INDEX click <element>     # Click in agent's tab
  • Tab locking: When an agent mutates a tab (click, type, navigate), that tab is locked to it. Other agents get an error if they try to mutate the same tab.
  • Read-only access: state, screenshot, get, and wait commands work on any tab regardless of locks.
  • Agent sessions expire after 5 minutes of inactivity. Run browser-use register again to get a new index.

Global Options

OptionDescription
--headedShow browser window
--connectAuto-discover running Chrome via CDP
--cdp-url <url>Connect via CDP URL (http:// or ws://)
--session NAMETarget a named session (default: "default")
--jsonOutput as JSON

Tips

  1. Always run state first to see available elements and their indices
  2. Sessions persist — browser stays open between commands until you close it
  3. Tunnels are independent — they persist across browser-use close
  4. tunnel is idempotent — calling again for the same port returns the existing URL

Troubleshooting

  • Browser won't start? browser-use close then retry. Run browser-use doctor to check.
  • Element not found? browser-use scroll down then browser-use state
  • Tunnel not working? which cloudflared to check, browser-use tunnel list to see active tunnels

Cleanup

browser-use close                         # Close browser session
browser-use tunnel stop --all             # Stop tunnels (if any)

Más skills de browser-use

browser-use
browser-use
Automatiza interacciones del navegador para pruebas web, llenado de formularios, capturas de pantalla y extracción de datos. Úsalo cuando el usuario necesite navegar por sitios web, interactuar con páginas web, llenar formularios, tomar capturas de pantalla o extraer información de páginas web.
browser-automationofficial
cdp
browser-use
Drive Chrome via the DevTools Protocol from JavaScript. Run JS snippets through the `browser-harness-js` CLI — it auto-spawns a long-lived bun HTTP server…
official
browser
browser-use
Control directo del navegador mediante CDP. Úsalo cuando el usuario quiera automatizar, extraer, probar o interactuar con páginas web. Se conecta al Chrome que ya está ejecutando el usuario.
official
browser-harness
browser-use
Control directo del navegador mediante CDP. Úselo cuando el usuario quiera automatizar, extraer, probar o interactuar con páginas web. Se conecta al Chrome que ya está ejecutando el usuario.
official
cloud
browser-use
Documentos de referencia para la API REST de Cloud, SDKs y patrones de integración. Lea el archivo relevante según lo que necesite el usuario.
official
open-source
browser-use
Documentos de referencia para escribir código Python con la biblioteca browser-use. Lea el archivo relevante según lo que necesite el usuario.
official
manim-video
browser-use
Pipeline de producción para animaciones matemáticas y técnicas usando Manim Community Edition. Crea videos explicativos estilo 3Blue1Brown, algoritmos…
official
video-use
browser-use
Edita cualquier video mediante conversación. Transcribe, corta, corrige color, genera animaciones superpuestas, quema subtítulos — para cabezas parlantes, montajes, tutoriales, viajes,…
official