SideButton
Open-source MCP server with knowledge packs, 40+ browser tools, and YAML workflow engine for AI agents.
SideButton is an open-source AI agent platform that combines an MCP server exposing 40+ tools, installable knowledge packs, a YAML workflow engine, and real browser control. It connects to Claude Code, Cursor, ChatGPT, or any other MCP client.
npx sidebutton@latest
# Dashboard at http://localhost:9876
What you get
| Component | Description |
|---|---|
| MCP Server | 40+ AI agent tools for browser control, workflow execution, knowledge pack access. Stdio and SSE transports. |
| REST API | 60+ endpoints. Trigger workflows remotely from webhooks, cron jobs, mobile apps, or other agents. |
| Workflow Engine | AI workflow automation with 34+ step types — browser, shell, LLM, control flow. Define agentic workflows in YAML. |
| Knowledge Packs | Installable domain knowledge — CSS selectors, data models, state machines. Role playbooks turn coding agents into an AI software engineer, QA, or PM. |
| Chrome Extension | 40+ browser commands. Real DOM access via WebSocket, not screenshots. Recording mode. |
| Dashboard | Svelte UI — workflow browser, run logs, skill pack manager, system status. |
Quick Start
# Install and start
npx sidebutton@latest
# Or from source
pnpm install && pnpm build && pnpm start
# Open http://localhost:9876
CLI
pnpm cli serve # Start server with dashboard
pnpm cli serve --stdio # Start with stdio transport (for Claude Desktop)
pnpm cli list # List available workflows
pnpm cli status # Check server status
# Skill pack management
pnpm cli registry add <path|url> # Install skill packs from a registry
pnpm cli registry update [name] # Update installed packs
pnpm cli registry remove <name> # Uninstall packs and remove registry
pnpm cli search [query] # Search available skill packs
# Creating skill packs
pnpm cli init [domain] # Scaffold a new skill pack
pnpm cli validate [path] # Validate pack structure
pnpm cli publish [source] # Publish to a registry
MCP Server
SideButton is an AI agent platform and MCP server. AI coding agents connect to it directly for browser control, workflow automation, and domain knowledge.
Works with Claude Code, Cursor, Claude Desktop, VS Code, Windsurf, ChatGPT — any MCP client.
Claude Code
Add to ~/.claude/settings.json:
{
  "mcpServers": {
    "sidebutton": {
      "type": "sse",
      "url": "http://localhost:9876/mcp"
    }
  }
}
Claude Desktop
Add to ~/Library/Application Support/Claude/claude_desktop_config.json:
{
  "mcpServers": {
    "sidebutton": {
      "command": "npx",
      "args": ["sidebutton", "--stdio"]
    }
  }
}
Cursor
Add to ~/.cursor/mcp.json:
{
  "mcpServers": {
    "sidebutton": {
      "url": "http://localhost:9876/mcp"
    }
  }
}
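Each of the client configs above adds one entry under `mcpServers`. If the config file already lists other servers, hand-editing risks clobbering them. The following is a minimal Python sketch of a merge that keeps existing entries; the helper name `add_mcp_server` is hypothetical and not part of SideButton, and the paths and entries are the ones shown in this section.

```python
import json
from pathlib import Path


def add_mcp_server(config_path: Path, name: str, entry: dict) -> dict:
    """Merge one server entry into an MCP client config, keeping other entries."""
    config = {}
    if config_path.exists():
        text = config_path.read_text(encoding="utf-8")
        if text.strip():
            config = json.loads(text)
    # Create the "mcpServers" object if the file did not have one yet.
    servers = config.setdefault("mcpServers", {})
    servers[name] = entry
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

For example, `add_mcp_server(Path.home() / ".cursor" / "mcp.json", "sidebutton", {"url": "http://localhost:9876/mcp"})` would write the Cursor entry shown above while leaving any other configured servers in place.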
MCP Tools
| Tool | Description |
|---|---|
| run_workflow | Execute a workflow by ID |
| list_workflows | List all available workflows |
| get_workflow | Get workflow YAML definition |
| get_run_log | Get execution log for a run |
| list_run_logs | List recent workflow executions |
| get_browser_status | Check browser extension connection |
| capture_page | Capture selectors from current page |
| navigate | Navigate browser to URL |
| snapshot | Get page accessibility snapshot |
| click | Click an element |
| type | Type text into an element |
| scroll | Scroll the page |
| screenshot | Capture page screenshot |
| hover | Hover over element |
| extract | Extract text from element |
| extract_all | Extract all matching elements |
| extract_map | Extract structured data from repeated elements |
| select_option | Select dropdown option |
| fill | Fill input value (React-compatible) |
| press_key | Send keyboard keys |
| scroll_into_view | Scroll element into viewport |
| evaluate | Execute JavaScript in browser |
| exists | Check if element exists |
| wait | Wait for element or delay |
| check_writing_quality | Evaluate text quality |
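Under MCP, a client invokes any of the tools above with a JSON-RPC 2.0 `tools/call` request. The sketch below only builds such a message for `run_workflow` and does not send it. The argument names (`workflow_id`, `params`) are assumptions for illustration, since this README does not document each tool's input schema; a real client would first complete the MCP `initialize` handshake and could read the actual schemas from `tools/list`.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests need a unique id per call


def tools_call(tool: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request for the MCP `tools/call` method."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }


# Assumed argument names; check the tool's published input schema before use.
request = tools_call(
    "run_workflow",
    {"workflow_id": "check_ticket", "params": {"ticket_id": "PROJ-123"}},
)
print(json.dumps(request, indent=2))
```

In practice an MCP client library handles this framing, so the message shape is mainly useful when debugging traffic on the stdio or HTTP transport.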
REST API
60+ JSON endpoints for external integrations. Same workflows available via MCP locally and via REST remotely.
# Run a workflow
curl -X POST http://localhost:9876/api/workflows/check_ticket/run \
-H "Content-Type: application/json" \
-d '{"params": {"ticket_id": "PROJ-123"}}'
# List workflows
curl http://localhost:9876/api/workflows
# Get run log
curl http://localhost:9876/api/runs/latest
Trigger workflows from webhooks, cron jobs, mobile apps, or other agents on different machines.
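The same run request can be issued from a script, for example inside a cron job or webhook handler. This is a minimal Python sketch using only the standard library. The endpoint path and request body mirror the curl example above; that the response body is JSON is an assumption, and the function names are hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "http://localhost:9876"  # default port used throughout this README


def build_run_request(workflow_id: str, params: dict, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build (but do not send) the POST request that starts a workflow run."""
    body = json.dumps({"params": params}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/workflows/{quote(workflow_id, safe='')}/run",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def run_workflow(workflow_id: str, params: dict, base_url: str = BASE_URL) -> dict:
    """Send the request; assumes the server answers with a JSON document."""
    request = build_run_request(workflow_id, params, base_url)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Calling `run_workflow("check_ticket", {"ticket_id": "PROJ-123"})` against a running server would be the scripted equivalent of the first curl command. Splitting request construction from sending keeps the URL and payload easy to inspect without a live server.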
Workflow Engine
YAML-first orchestration. 34+ step types:
Step Types
| Type | Description |
|---|---|
| Browser | |
| browser.navigate | Open a URL |
| browser.click | Click an element by selector |
| browser.type | Type text into an element |
| browser.fill | Fill input value (React-compatible) |
| browser.scroll | Scroll the page |
| browser.extract | Extract text from element into variable |
| browser.extractAll | Extract all matching elements |
| browser.extractMap | Extract structured data from repeated elements |
| browser.wait | Wait for element or fixed delay |
| browser.exists | Check if element exists |
| browser.hover | Position cursor on element |
| browser.key | Send keyboard keys |
| browser.snapshot | Capture accessibility snapshot |
| browser.injectCSS | Inject CSS styles into page |
| browser.injectJS | Execute JavaScript in page |
| browser.select_option | Select dropdown option |
| browser.scrollIntoView | Scroll element into view |
| Shell | |
| shell.run | Execute a bash command |
| terminal.open | Open a visible terminal window (macOS) |
| terminal.run | Run command in terminal window |
| LLM | |
| llm.classify | Structured classification with categories |
| llm.generate | Free-form text generation |
| Control Flow | |
| control.if | Conditional branching |
| control.retry | Retry with backoff |
| control.stop | End workflow with message |
| workflow.call | Call another workflow with parameters |
| Data | |
| data.first | Extract first item from list |
LLM steps work with Ollama (local), OpenAI, Anthropic, and Google.
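The table describes control.retry only as "retry with backoff". As a generic illustration of that pattern, an exponential backoff loop can be sketched in Python. This is not SideButton's implementation, and the parameter names and defaults here are assumptions.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    action: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 0.5,
    factor: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `action` until it succeeds, multiplying the wait after each failure."""
    delay = base_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return action()
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            sleep(delay)
            delay *= factor
    raise ValueError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter lets the loop be exercised without real waiting; with the defaults, the waits between three attempts would be 0.5 s and then 1.0 s.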
Example
id: check_ticket_status
title: "Check Jira ticket and classify"
steps:
  - type: browser.navigate
    url: "https://your-org.atlassian.net/browse/{{ticket_id}}"
  - type: browser.extract
    selector: "[data-testid='status-field']"
    as: current_status
  - type: control.if
    condition: "{{current_status}} != 'Done'"
    then:
      - type: llm.classify
        prompt: "Should this ticket be closed? Context: {{current_status}}"
        classes: [close, keep_open]
        as: decision
Variable Interpolation
Use {{variable}} syntax to reference extracted values or parameters:
steps:
  - type: browser.extract
    selector: ".username"
    as: user
  - type: shell.run
    cmd: "echo 'Hello, {{user}}!'"
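To make the substitution concrete, here is a minimal Python sketch of `{{variable}}` interpolation. It is not the engine's actual code: the tolerance for whitespace inside the braces and the error raised for an undefined variable are assumptions.

```python
import re

# Matches {{name}} and, as an assumption, also {{ name }} with inner whitespace.
_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def interpolate(template: str, variables: dict) -> str:
    """Replace each {{name}} placeholder with the matching variable's value."""

    def replace(match: re.Match) -> str:
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"undefined workflow variable: {name}")
        return str(variables[name])

    return _PLACEHOLDER.sub(replace, template)


print(interpolate("echo 'Hello, {{user}}!'", {"user": "Ada"}))  # prints: echo 'Hello, Ada!'
```

Because values extracted from a page may end up inside a shell command, as in the example above, it is worth treating them as untrusted input and quoting them accordingly.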
Knowledge Packs
Knowledge packs (called skill packs in the code and CLI commands) are installable bundles of domain knowledge for a specific web app or domain. They power AI code review, automated testing, and enterprise AI agent deployments. A pack can contain:
- Selectors — CSS selectors for UI elements
- Data models — entity types, fields, relationships, valid states
- State machines — valid transitions per state
- Role playbooks — role-specific procedures (QA, SE, PM, SD)
- Common tasks — step-by-step procedures, gotchas, edge cases
sidebutton install github.com
sidebutton install atlassian.net
11 domains, 28+ modules published. Open registry — build and share packs for any web app.
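As an illustration of the "state machines" item above, valid transitions per state can be modeled as a simple lookup. The Python sketch below uses made-up ticket states and says nothing about how packs actually store this information on disk; every name in it is hypothetical.

```python
# Hypothetical ticket states for illustration only; real packs define their own.
TRANSITIONS = {
    "To Do": {"In Progress"},
    "In Progress": {"In Review", "To Do"},
    "In Review": {"Done", "In Progress"},
    "Done": set(),
}


def can_transition(current: str, target: str, transitions: dict = TRANSITIONS) -> bool:
    """Return True if moving from `current` to `target` is a listed transition."""
    return target in transitions.get(current, set())


print(can_transition("To Do", "In Progress"))  # prints: True
print(can_transition("To Do", "Done"))         # prints: False
```

An agent holding this kind of knowledge can reject an invalid action before attempting it in the browser, rather than discovering the failure from the page.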
Chrome Extension
Install from the Chrome Web Store.
- 40+ browser commands — navigate, click, type, extract, scroll, wait, snapshot
- Real DOM access via CSS selectors — not pixel coordinates, not screenshots
- Recording mode — capture manual actions as workflows
- Embed buttons — inject action buttons into any web page
- WebSocket connection — stable reconnection, works with local or remote server
After installing:
- Navigate to any website
- Click the SideButton extension icon
- Click "Connect This Tab"
Dashboard & Observability
Svelte UI at http://localhost:9876:
- Workflow browser — list, search, run
- Run logs — step-by-step execution traces with timing, variables, errors
- Skill pack manager — install, browse, inspect
- System status — extension connection, LLM config, server health
SideButton handles AI agent orchestration — from workflow execution to knowledge injection.
Architecture
┌─────────────────────────────────────────────────────────────────────────┐
│                           @sidebutton/server                            │
│                                                                         │
│  ┌─────────────────────┐  ┌──────────────────────────────────────────┐  │
│  │ stdio Transport     │  │ Fastify HTTP + WebSocket (port 9876)     │  │
│  │ ─────────────────   │  │ ────────────────────────────────────     │  │
│  │ stdin → JSON-RPC    │  │ GET  /      → Dashboard (Svelte)         │  │
│  │ stdout ← JSON-RPC   │  │ GET  /ws    → Chrome Extension WS        │  │
│  │ (Claude Desktop)    │  │ POST /mcp   → MCP JSON-RPC (SSE)         │  │
│  └──────────┬──────────┘  │ GET  /api/* → REST API                   │  │
│             │             └──────────────────────┬───────────────────┘  │
│             │                                    │                      │
│             └──────────────────┬─────────────────┘                      │
│                                ▼                                        │
│  ┌───────────────────────────────────────────────────────────────────┐  │
│  │                         @sidebutton/core                          │  │
│  │                                                                   │  │
│  │  - Workflow types & parser (YAML)                                 │  │
│  │  - Step executors (37 step types)                                 │  │
│  │  - Variable interpolation                                         │  │
│  │  - Execution context & events                                     │  │
│  └───────────────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────────────┘
       ▲                  ▲                   ▲                    ▲
       │ stdio            │ WebSocket         │ HTTP POST          │ REST
       ▼                  ▼                   ▼                    ▼
┌──────────────┐ ┌─────────────────┐ ┌─────────────────┐ ┌───────────────────┐
│Claude Desktop│ │ Chrome Extension│ │   Claude Code   │ │    Mobile App     │
│ (MCP stdio)  │ │ (Browser Auto)  │ │    (MCP SSE)    │ │   (REST Client)   │
└──────────────┘ └─────────────────┘ └─────────────────┘ └───────────────────┘
Project Structure
sidebutton/
├── packages/
│   ├── core/                  # @sidebutton/core: workflow engine
│   │   └── src/
│   │       ├── types.ts       # Workflow types
│   │       ├── parser.ts      # YAML loader
│   │       ├── executor.ts    # Workflow runner
│   │       └── steps/         # Step implementations
│   ├── server/                # @sidebutton/server: MCP + HTTP + CLI
│   │   ├── bin/               # CLI entry point
│   │   └── src/
│   │       ├── server.ts      # Fastify HTTP server
│   │       ├── stdio-mode.ts  # stdio transport entry point
│   │       ├── extension.ts   # WebSocket client
│   │       ├── mcp/           # MCP handlers
│   │       │   ├── handler.ts # MCP JSON-RPC logic
│   │       │   ├── stdio.ts   # stdio transport adapter
│   │       │   └── tools.ts   # Tool definitions
│   │       └── cli.ts         # Commander CLI
│   └── dashboard/             # Svelte web UI
│       └── src/
│           ├── App.svelte
│           └── lib/
├── extension/                 # Chrome extension
├── workflows/                 # Public workflow library
├── actions/                   # User-created workflows
├── skills/                    # Installed skill packs
└── run_logs/                  # Execution history
Environment Variables
| Variable | Required For | Description |
|---|---|---|
| OPENAI_API_KEY | llm.* steps | OpenAI API key for LLM workflows |
| ANTHROPIC_API_KEY | llm.* steps | Anthropic API key (alternative) |
Development
pnpm install # Install dependencies
pnpm build # Build all packages
pnpm start # Start server
pnpm cli list # List workflows
pnpm cli status # Check status
Watch Mode
pnpm dev # Full dev mode (all packages)
pnpm dev:server # Server with auto-restart on :9876
pnpm dev:dashboard # Dashboard watch build
pnpm dev:core # Core library watch build
Platform Automation Disclaimer
SideButton is a general-purpose browser automation framework. When automating third-party platforms:
- Review Terms of Service: Many platforms prohibit or restrict automation. You are responsible for complying with the terms of any platform you automate.
- Account Risk: Automation may result in account restrictions or suspension on some platforms.
- Use Responsibly: Only automate actions you would perform manually. Respect rate limits and platform guidelines.
The authors do not endorse or encourage violations of third-party terms of service.
Legal
License
This project uses mixed licensing. See LICENSING.md for details.
- Engine, server, CLI, dashboard — Apache-2.0
- Browser extension — FSL-1.1-Apache-2.0 (converts to Apache-2.0 on 2029-03-15)