Argus

AI-powered QA harness that catches JS errors, accessibility failures, visual regressions, and security issues via Chrome DevTools MCP — no test scripts required.

Argus — AI-Powered Dev Testing Tool

Argus Panoptes — the all-seeing giant of Greek mythology with a hundred eyes who never slept.

Automated browser testing pipeline that catches bugs, compares environments, and sends rich reports to Slack (or generates a self-contained HTML dashboard when Slack is not configured) — powered by Chrome DevTools MCP and Claude Code.


MCP Quick Start

Add both servers to your .mcp.json:

{
  "mcpServers": {
    "chrome-devtools": {
      "command": "npx",
      "args": ["-y", "chrome-devtools-mcp@latest"]
    },
    "argus": {
      "command": "npx",
      "args": ["-y", "argusqa-os"]
    }
  }
}

Or register via the Claude Code CLI:

claude mcp add chrome-devtools -- npx -y chrome-devtools-mcp@latest
claude mcp add argus -- npx -y argusqa-os

Set your target URL and start Chrome with remote debugging:

# .env
TARGET_DEV_URL=http://localhost:3000

# Start Chrome (required — Argus drives this instance via CDP)
# macOS:   open -a "Google Chrome" --args --remote-debugging-port=9222 --headless=new
# Windows: "C:\Program Files\Google\Chrome\Application\chrome.exe" --remote-debugging-port=9222 --headless=new
# Linux:   google-chrome --remote-debugging-port=9222 --headless=new --no-sandbox

Then ask Claude (or any MCP client):

Run argus_audit on http://localhost:3000

Six tools are exposed:

| Tool | What it does |
| --- | --- |
| argus_audit | Fast QA pass — JS errors, network failures, accessibility, SEO, security, CSS, content |
| argus_audit_full | Deep QA pass — adds Lighthouse scoring, responsive layout checks across 4 viewports, memory leak detection, hover-state bug detection, and accessibility tree snapshot |
| argus_compare | Diff dev vs staging side-by-side — screenshots, findings delta, environment regressions |
| argus_last_report | Return the last saved JSON report without re-running a scan |
| argus_watch_snapshot | Snapshot the currently open Chrome tab without navigating — raw console + network capture |
| argus_get_context | Capture everything broken on the open tab, formatted as a diagnostic context for Claude to diagnose and suggest fixes |

Requires: Node.js ≥ 20.19, Chrome (desktop or headless), and the chrome-devtools-mcp server registered alongside Argus (shown above).


The landing/ directory contains the product landing page (React + Vite + Tailwind + Framer Motion) with Supabase-backed waitlist and enterprise contact forms. Live at argus-qa.com (deployed via Cloudflare Pages; background video served from Cloudflare R2). See landing/README.md for setup.

| 🔴 Critical / 🟡 Warning / 🔵 Info | ⚙️ | 🧪 | 📋 |
| --- | --- | --- | --- |
| 114 distinct issue types detected | 24 analysis engines | 348 test assertions | 82 test blocks |

What Argus Catches

Argus runs 24 analysis engines per run and detects 114 distinct issue types across JavaScript runtime, network, CSS, performance, accessibility, SEO, security, content quality, responsive layout, memory, runtime anti-patterns, hover-state interactions, accessibility tree snapshots, keyboard focus, and Chrome DevTools issues panel — plus flakiness detection, historical baselines, user flow assertions, and environment comparison as cross-cutting layers. Every finding is classified by severity (critical / warning / info) and routed to the right Slack channel — or rendered as a local report.html when Slack is not configured.

JavaScript Runtime

| Severity | Bug / Issue | Detection Method |
| --- | --- | --- |
| 🔴 Critical | Uncaught exceptions — TypeError, ReferenceError, etc. | window.onerror listener injected before page load |
| 🔴 Critical | Unhandled Promise rejections | unhandledrejection event listener injected into the page |
| 🟡 Warning | console.error calls (on non-critical routes) | Chrome DevTools list_console_messages |
| 🔴 Critical | console.error calls (on critical routes) | Chrome DevTools list_console_messages |
| 🔵 Info | console.warn deprecation notices and warnings | Chrome DevTools list_console_messages |

Network & API

| Severity | Bug / Issue | Detection Method |
| --- | --- | --- |
| 🔴 Critical | HTTP 5xx server errors on any request | list_network_requests → status ≥ 500 |
| 🔴 Critical | 401 / 403 auth failures — user is being kicked out | list_network_requests → status 401 or 403 |
| 🔴 Critical | API endpoint called 5+ times in one page load — likely an infinite loop | Network frequency grouping by normalized URL + method |
| 🟡 Warning | HTTP 4xx client errors (404, 422, 429, etc.) | list_network_requests → status 400–499 (non-auth) |
| 🟡 Warning | API endpoint called 3–4 times — likely a double-fetch bug | Frequency grouping → 3 ≤ count ≤ 4 (check useEffect deps) |
| 🔵 Info | API endpoint called twice — may be intentional prefetch | Frequency grouping → count = 2 |
| 🔵 Info | API call summary per page load (total calls, unique endpoints, duplicates) | Aggregated network analysis |
| 🟡 Warning | Redirect chain longer than 2 hops — extra round-trips inflate load time | Navigation Timing redirectCount read after page settle |
| 🟡 Warning | Broken internal link — <a href> target returns HTTP 404 | <a> elements harvested via evaluate_script, each verified against list_network_requests |
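
The frequency thresholds in the table above (2 calls, 3 to 4 calls, 5 or more) can be pictured as a grouping pass over the captured requests. The following is a minimal sketch, not Argus's actual implementation: `normalizeUrl` and `classifyCallFrequency` are hypothetical names, and the normalization rule (drop the query string, replace numeric path segments with `:id`) is an assumption, since the README does not spell out how URLs are normalized.

```javascript
// Hypothetical normalizer (assumption): drops query string and fragment and
// replaces purely numeric path segments with ":id".
function normalizeUrl(rawUrl) {
  const url = new URL(rawUrl);
  const path = url.pathname
    .split('/')
    .map((segment) => (/^\d+$/.test(segment) ? ':id' : segment))
    .join('/');
  return `${url.origin}${path}`;
}

// Groups requests by "METHOD normalizedUrl" and maps each count to the
// severity listed in the table: 2 -> info, 3-4 -> warning, 5+ -> critical.
function classifyCallFrequency(requests) {
  const counts = new Map();
  for (const { method, url } of requests) {
    const key = `${method.toUpperCase()} ${normalizeUrl(url)}`;
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }

  const findings = [];
  for (const [endpoint, count] of counts) {
    if (count >= 5) findings.push({ endpoint, count, severity: 'critical' });
    else if (count >= 3) findings.push({ endpoint, count, severity: 'warning' });
    else if (count === 2) findings.push({ endpoint, count, severity: 'info' });
    // A single call produces no finding.
  }
  return findings;
}
```

Under these assumptions, five `GET /api/users/<n>` calls collapse into one `GET .../api/users/:id` group and surface as a single critical finding.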

Page Health

| Severity | Bug / Issue | Detection Method |
| --- | --- | --- |
| 🔴 Critical | Blank or near-empty page — less than 50 characters of body text | document.body.innerText length check after navigation |
| 🟡 Warning | Expected element never appeared — page may have crashed mid-load | waitFor selector timeout after 10 seconds |

CSS & Styling

| Severity | Bug / Issue | Detection Method |
| --- | --- | --- |
| 🟡 Warning | !important cascade conflict — forced override fighting another rule | CSS rule walk: property declared with !important on same element |
| 🟡 Warning | Component style leak — BEM selector found in the wrong stylesheet | .block__element selector in a file whose name doesn't match block |
| 🟡 Warning | React inline style overriding a stylesheet declaration on the same element | style="" attribute vs. matching CSS rule, __reactFiber presence confirmed |
| 🔵 Info | CSS property declared by multiple rules on the same element (cascade override) | Computed style walk across all matched rules per key element |
| 🔵 Info | Unused CSS rules — selectors matching no element on the page (> 10 flagged) | querySelectorAll(selector).length === 0 for every rule |
| 🔵 Info | CSS Modules detected — hashed class names found on DOM elements | Pattern _ComponentName_class_hash matched on live DOM |
| 🔵 Info | SCSS source map found — compiled CSS traced back to .scss origin file | sourceMappingURL comment in <style> tags |

Performance

| Severity | Bug / Issue | Detection Method |
| --- | --- | --- |
| 🟡 Warning | LCP > 2500ms — largest element took too long to paint | Chrome performance trace → performance_analyze_insight |
| 🟡 Warning | CLS > 0.1 — layout shifted significantly after initial render | Chrome performance trace |
| 🟡 Warning | FID / TBT > 100ms — main thread was blocked during interaction | Chrome performance trace |
| 🟡 Warning | TTFB > 800ms — server took too long to send the first byte | Chrome performance trace |

Accessibility

| Severity | Bug / Issue | Detection Method |
| --- | --- | --- |
| 🔴 Critical | Lighthouse accessibility score below 50 / 100 | Lighthouse audit via lighthouse_audit |
| 🟡 Warning | Lighthouse accessibility score 50–89 / 100 | Lighthouse audit |
| 🟡 Warning | Missing alt text on images | Individual Lighthouse audit check |
| 🟡 Warning | Insufficient color contrast ratio | Individual Lighthouse audit check |
| 🟡 Warning | Missing ARIA labels on interactive elements | Individual Lighthouse audit check |
| 🟡 Warning | Keyboard navigation broken or unreachable elements | Individual Lighthouse audit check |

SEO

| Severity | Bug / Issue | Detection Method |
| --- | --- | --- |
| 🟡 Warning | Missing <meta name="description"> | DOM inspection via evaluate_script |
| 🟡 Warning | Missing Open Graph tags (og:title, og:description, og:image) | DOM inspection via evaluate_script |
| 🟡 Warning | og:image URL is relative — Open Graph requires an absolute URL | DOM inspection + URL prefix check (http:// / https://) |
| 🟡 Warning | Multiple <h1> tags on one page | DOM inspection — querySelectorAll('h1').length > 1 |
| 🟡 Warning | Zero <h1> tags — page has no primary heading | DOM inspection — querySelectorAll('h1').length === 0 |
| 🟡 Warning | Generic page title (less than 10 characters, or default placeholder) | DOM inspection + length check |
| 🟡 Warning | Missing <link rel="canonical"> | DOM inspection via evaluate_script |
| 🟡 Warning | Missing <meta name="viewport"> | DOM inspection via evaluate_script |

Security

| Severity | Bug / Issue | Detection Method |
| --- | --- | --- |
| 🔴 Critical | Auth token found in localStorage or sessionStorage | evaluate_script walks storage keys for token patterns |
| 🔴 Critical | Sensitive token in the page URL (query param or hash) | URL pattern match against current window.location.href |
| 🔴 Critical | eval() call detected in page scripts | evaluate_script AST-style text scan of inline <script> tags |
| 🔴 Critical | CSP violation — inline script or external resource blocked by Content-Security-Policy | Chrome DevTools Issues panel (list_console_messages({ types: ['issue'] })) |
| 🟡 Warning | Sensitive data (password, token, secret) logged to the console | list_console_messages + keyword match |
| 🟡 Warning | Missing Content-Security-Policy response header | fetch(location.href) inside the page → response headers check |
| 🟡 Warning | Missing X-Frame-Options response header | Same headers fetch |
| 🟡 Warning | Cross-origin <iframe> without sandbox attribute — enables form submission, parent navigation, cookie access | evaluate_script checks iframe[src] elements for missing sandbox attribute |
| 🟡 Warning | Page served over plain HTTP with no HTTPS upgrade redirect | URL protocol check (http:// + non-localhost) |
| 🔵 Info | Cookie present without HttpOnly flag (limited detection — JS-visible cookies only) | document.cookie inspection |
| 🔵 Info | Deprecated browser API usage (e.g. document.domain, DOMSubtreeModified) | Chrome DevTools Issues panel |

Content Quality

| Severity | Bug / Issue | Detection Method |
| --- | --- | --- |
| 🟡 Warning | null or undefined rendered as visible text | DOM text scan for literal "null" / "undefined" strings |
| 🟡 Warning | Lorem ipsum / placeholder copy still in production | DOM text scan for "lorem ipsum" and common placeholder strings |
| 🟡 Warning | Broken image (404 or failed to load) | evaluate_script checks img.naturalWidth === 0 on all images |
| 🔵 Info | Empty data list — <ul>, <ol>, or <select> with no children | DOM structure check |

Responsive / Mobile

| Severity | Bug / Issue | Detection Method |
| --- | --- | --- |
| 🔴 Critical | Horizontal overflow at mobile / tablet viewport (≤ 768px) | emulate at 375px and 768px → document.documentElement.scrollWidth > clientWidth |
| 🟡 Warning | Touch target smaller than 44×44 px at mobile or tablet viewport | CSS computed size check on interactive elements at 375px and 768px |
| 🔵 Info | Responsive screenshot grid — snapshots at 375 / 768 / 1024 / 1440px | emulate at 4 breakpoints, screenshots dispatched to Slack |

Network Performance

| Severity | Bug / Issue | Detection Method |
| --- | --- | --- |
| 🔴 Critical | API response time > 3000ms | PerformanceObserver entries for fetch / XHR calls |
| 🟡 Warning | API response time > 1000ms | Same observer, lower threshold |
| 🔴 Critical | API response payload > 2 MB | list_network_requests → response body size |
| 🟡 Warning | API response payload > 500 KB | Same, lower threshold |
| 🟡 Warning | Cross-origin (third-party) script TTFB > 2000ms — blocking render or late interactivity | HAR timing.wait field from list_network_requests HAR data; cross-origin requests only |
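
The response-time and payload rows above are two-tier thresholds. A minimal sketch of that tiering follows, assuming a per-call record with `durationMs` and `bytes` fields (both hypothetical) and binary units (1 KB = 1024 bytes, which the README does not specify). The finding type names `slow_api` and `large_payload` are the ones used elsewhere in this README.

```javascript
const KB = 1024; // assumption: binary units
const MB = 1024 * KB;

// Returns zero, one, or two findings for a single API call.
function classifyApiCall({ durationMs, bytes }) {
  const findings = [];

  if (durationMs > 3000) findings.push({ type: 'slow_api', severity: 'critical' });
  else if (durationMs > 1000) findings.push({ type: 'slow_api', severity: 'warning' });

  if (bytes > 2 * MB) findings.push({ type: 'large_payload', severity: 'critical' });
  else if (bytes > 500 * KB) findings.push({ type: 'large_payload', severity: 'warning' });

  return findings;
}
```

Checking the higher threshold first means a call is reported once per dimension, at its worst applicable severity.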

Network Request Origin Tagging

All network findings carry an origin field ('first-party' / 'third-party') so operators can triage critical first-party failures separately from third-party noise.
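
One way such a tag could be derived is by comparing the request's hostname with the page's. This is only a sketch under that assumption: `tagOrigin` is a hypothetical helper, and Argus may apply a different rule (for example, treating sibling subdomains as first-party).

```javascript
// Resolves the request URL against the page URL (so relative URLs work)
// and compares hostnames. Exact-match comparison is an assumption.
function tagOrigin(requestUrl, pageUrl) {
  const requestHost = new URL(requestUrl, pageUrl).hostname;
  const pageHost = new URL(pageUrl).hostname;
  return requestHost === pageHost ? 'first-party' : 'third-party';
}
```

For a page at `http://localhost:3000/`, a request to `/api/orders` would be tagged `first-party`, and one to `https://cdn.example.com/lib.js` would be tagged `third-party`.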

Lighthouse Audits

| Severity | Bug / Issue | Detection Method |
| --- | --- | --- |
| 🔴 Critical | Lighthouse accessibility score < 50 / 100 | lighthouse_audit (accessibility category) |
| 🟡 Warning | Lighthouse accessibility score 50–89 / 100 | lighthouse_audit |
| 🟡 Warning | Lighthouse performance score < 90 / 100 | lighthouse_audit (performance category) |
| 🟡 Warning | Lighthouse SEO score < 90 / 100 | lighthouse_audit (seo category) |
| 🟡 Warning | Lighthouse best-practices score < 90 / 100 | lighthouse_audit (best-practices category) |
| 🟡 Warning | Individual failing Lighthouse audit items | Surfaced per-audit from the full Lighthouse report |

Memory Leaks

| Severity | Bug / Issue | Detection Method |
| --- | --- | --- |
| 🔴 Critical | > 100 detached DOM nodes in V8 heap — severe leak | take_memory_snapshot → parse flat nodes array for "Detached Xxx" names |
| 🟡 Warning | > 10 detached DOM nodes in V8 heap — probable leak | Same snapshot parse, lower threshold |
| 🟡 Warning | Heap grew > 2 MB after navigate-away + navigate-back — probable per-load leak | performance.memory.usedJSHeapSize delta across round-trip (soft — GC-dependent) |
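
To illustrate the "parse flat nodes array" step, here is a rough sketch that walks a V8 heap snapshot object and counts nodes whose name starts with "Detached ". It assumes the usual snapshot layout (`snapshot.meta.node_fields`, a flat integer `nodes` array, and a `strings` table indexed by the `name` field); the function names are hypothetical and this is not Argus's own parser.

```javascript
// Counts nodes named "Detached ..." in a parsed .heapsnapshot object.
function countDetachedNodes(heap) {
  const fields = heap.snapshot.meta.node_fields;
  const stride = fields.length;            // integers per node record
  const nameOffset = fields.indexOf('name');

  let detached = 0;
  for (let i = 0; i < heap.nodes.length; i += stride) {
    const name = heap.strings[heap.nodes[i + nameOffset]];
    if (typeof name === 'string' && name.startsWith('Detached ')) detached++;
  }
  return detached;
}

// Maps the count to the thresholds in the table; null means no finding.
function classifyDetachedCount(count) {
  if (count > 100) return 'critical';
  if (count > 10) return 'warning';
  return null;
}
```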

Runtime Anti-Patterns

| Severity | Bug / Issue | Detection Method |
| --- | --- | --- |
| 🟡 Warning | Synchronous XMLHttpRequest — blocks the main thread until the server responds | XMLHttpRequest.open patched via addScriptToEvaluateOnNewDocument; async === false calls recorded |
| 🟡 Warning | document.write / document.writeln called — can erase the page or block parsing | document.write and document.writeln patched before page load; calls recorded with method + content |
| 🟡 Warning | Long task > 50ms on the main thread — blocks user interaction | PerformanceObserver with entryTypes: ['longtask'] injected before page load |
| 🔴 Critical | CORS policy violation — cross-origin fetch blocked by the browser | list_console_messages + pattern match for "has been blocked by CORS policy" |
| 🟡 Warning | Service worker registration failure — SW script returns 4xx or is invalid | navigator.serviceWorker.register patched before page load; .catch() records failing script URL |
| 🔵 Info | Same-origin static asset (.js, .css, .png, .woff2, etc.) served without Cache-Control or ETag — browsers cannot cache it efficiently | evaluate_script reads performance.getEntriesByType('resource'), HEAD-fetches each unique same-origin asset, checks response headers |

Historical Baselines & Trends

| Severity | Bug / Issue | Detection Method |
| --- | --- | --- |
| 🔴 Critical | New critical finding not present in the saved baseline — regression introduced since last run | applyBaseline compares finding keys (type::message[:100]::status) against reports/baselines/<branch>.json (D7.2 per-branch) |
| 🟡 Warning | New warning finding not present in the baseline | Same key comparison, warning severity |
| 🔵 Info | Pre-existing finding still present — no change since last run | Suppressed from real-time alerts; included in info digest only |
| 🔵 Info | Run trend summary — new vs resolved counts, saved per run | Appended to reports/baselines/<branch>-trends.json; surfaced as a trend line in Slack digest |
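
The key scheme named above (type::message[:100]::status) suggests a simple set comparison. The sketch below shows one way that comparison might look; `findingKey` and `compareToBaseline` are hypothetical names, the finding shape (`type`, `message`, `status`) is assumed, and the real applyBaseline may differ.

```javascript
// Builds the key described in the table: type, first 100 chars of the
// message, and status, joined by "::".
function findingKey({ type, message = '', status = '' }) {
  return `${type}::${message.slice(0, 100)}::${status}`;
}

// Splits current findings into new vs pre-existing, and lists baseline
// keys that no longer appear (resolved).
function compareToBaseline(findings, baselineKeys) {
  const known = new Set(baselineKeys);
  const current = new Set(findings.map(findingKey));
  return {
    newFindings: findings.filter((f) => !known.has(findingKey(f))),
    preExisting: findings.filter((f) => known.has(findingKey(f))),
    resolvedKeys: baselineKeys.filter((key) => !current.has(key)),
  };
}
```

Truncating the message to 100 characters keeps keys stable when only the tail of a long message (a timestamp or request ID, say) changes between runs.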

Hover-State Bugs

| Severity | Bug / Issue | Detection Method |
| --- | --- | --- |
| 🟡 Warning / 🔴 Critical | [aria-haspopup] element whose controlled popup does not become visible after hover — aria-expanded stays false and popup remains display:none / visibility:hidden / opacity:0 | hover dispatches mousemove; evaluate_script checks aria-expanded + getComputedStyle on the controlled element; critical on routes marked critical: true |
| 🟡 Warning | [data-tooltip] element whose [role="tooltip"] is not visible in the DOM after hover — not found or opacity ≤ 0.05 | Same hover + evaluate_script checks tooltip opacity, display, visibility, and offsetHeight |

Accessibility Snapshot Analysis

| Severity | Bug / Issue | Detection Method |
| --- | --- | --- |
| 🟡 Warning | Interactive element (<button>, <a>, [role="button"], [role="link"]) with no accessible name — no text content, aria-label, aria-labelledby, title, or alt | take_snapshot captures DOM/AX state; evaluate_script queries each visible interactive element for accessible name sources |
| 🟡 Warning | Form control (<input>, <select>, <textarea>) with no associated label — no <label for="...">, aria-label, or aria-labelledby (placeholder is intentionally excluded — not a valid accessible name per WCAG 2.1 §3.3.2) | evaluate_script checks label[for], ancestor <label>, aria-label, and aria-labelledby for each visible control |
| 🟡 Warning | Landmark role appearing more than once without distinct aria-label / aria-labelledby — screen readers cannot differentiate them | evaluate_script counts [role=X] instances and checks for unique label values across: main, banner, contentinfo, navigation, search, complementary, form, region |
| 🟡 Warning | Heading level skip — h1→h3 or h4→h6 jumps more than one level, breaking WCAG 1.3.1 document outline | DOM walk of h1–h6 elements; detects gaps > 1 between consecutive heading levels |
| 🟡 Warning | aria-expanded button/control has no aria-controls attribute or references a non-existent element | evaluate_script checks [aria-expanded] elements for missing or broken aria-controls pointer |
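
The heading-skip rule above reduces to a check on consecutive heading levels. A minimal sketch follows, assuming the levels have already been collected in document order (in a browser, something like mapping `document.querySelectorAll('h1,h2,h3,h4,h5,h6')` to `Number(el.tagName[1])`); `findHeadingSkips` is a hypothetical name.

```javascript
// Given heading levels in document order, reports every place where the
// level increases by more than one (for example h2 followed by h4).
// Moving back up the outline (h4 followed by h2) is not a skip.
function findHeadingSkips(levels) {
  const skips = [];
  for (let i = 1; i < levels.length; i++) {
    if (levels[i] - levels[i - 1] > 1) {
      skips.push({ index: i, from: levels[i - 1], to: levels[i] });
    }
  }
  return skips;
}
```

For the sequence `[1, 2, 3, 2, 4, 6]` this reports two skips: h2 to h4 and h4 to h6.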

Keyboard Accessibility

| Severity | Bug / Issue | Detection Method |
| --- | --- | --- |
| 🟡 Warning | Button or focusable element has outline:0 with no box-shadow fallback — no visible focus ring | press_key({ key: 'Tab' }) walk + evaluate_script reads document.activeElement computed style for outline/box-shadow |

Flakiness Detection

| Severity | Bug / Issue | Detection Method |
| --- | --- | --- |
| Original | Confirmed finding — present in both crawl runs | mergeRunResults finds the key in both run1 and run2 (type::message[:100]::status scheme); original severity kept |
| 🔵 Info | Flaky finding — appeared in only one of two crawl runs | Present in run1 or run2 but not both; downgraded to severity: 'info', labelled :zap: _flaky_ in Slack digest |
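
The confirm-or-downgrade rule can be sketched as a merge over two result arrays. This is an illustration only: `mergeRunsSketch` is a hypothetical stand-in for mergeRunResults, the `flaky` flag is an assumed field, and the key follows the type::message[:100]::status scheme quoted above.

```javascript
function mergeRunsSketch(run1, run2) {
  const keyOf = (f) => `${f.type}::${(f.message ?? '').slice(0, 100)}::${f.status ?? ''}`;
  const keys1 = new Set(run1.map(keyOf));
  const keys2 = new Set(run2.map(keyOf));

  const merged = new Map();
  for (const finding of [...run1, ...run2]) {
    const key = keyOf(finding);
    if (merged.has(key)) continue; // already handled from the first run

    const confirmed = keys1.has(key) && keys2.has(key);
    merged.set(
      key,
      confirmed
        ? { ...finding, flaky: false }                    // original severity kept
        : { ...finding, severity: 'info', flaky: true },  // seen in one run only
    );
  }
  return [...merged.values()];
}
```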

User Flow Assertions

| Severity | Bug / Issue | Detection Method |
| --- | --- | --- |
| 🔴 Critical | Flow step failed — navigate/fill/click/waitFor threw mid-flow (page state unknown) | flow-runner.js wraps every step; any throw emits flow_step_failed and halts the flow |
| 🔴 Critical | element_visible assert — expected selector absent within timeout | Polled via evaluate_script + document.querySelector (MCP wait_for doesn't reliably throw on timeout) |
| 🟡 Warning | no_console_errors assert — console errors recorded during this flow (baseline-sliced, not session-wide) | Baseline snapshot of list_console_messages at flow start; only messages after that offset count |
| 🟡 Warning | no_network_errors assert — 4xx/5xx request during this flow (baseline-sliced) | Baseline snapshot of list_network_requests at flow start; status ≥ 400 after offset |
| 🟡 Warning | url_contains assert — URL does not include expected substring after flow completes | evaluate_script reads window.location.href |
| 🟡 Warning | element_not_visible assert — selector unexpectedly present in DOM | evaluate_script → !document.querySelector(...) |
| 🔴 Critical | no_js_errors assert — uncaught exceptions captured in window.__argusErrors during flow | Script parses the injected error buffer |
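
"Baseline-sliced" means an assert only considers entries recorded after the flow began. A small sketch of that idea, assuming the console and network logs are append-only arrays and that entries carry `type` and `status` fields (assumed shapes; the helper names are hypothetical):

```javascript
// baselineCount is the log length captured when the flow started.
function consoleErrorsSince(allMessages, baselineCount) {
  return allMessages.slice(baselineCount).filter((m) => m.type === 'error');
}

function networkErrorsSince(allRequests, baselineCount) {
  return allRequests.slice(baselineCount).filter((r) => r.status >= 400);
}
```

Slicing by offset keeps errors from earlier routes or flows in the same Chrome session from failing the current flow's asserts.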

Environment Regressions (dev vs staging)

| Severity | Bug / Issue | Detection Method |
| --- | --- | --- |
| 🔴 Critical | API status regressed — request that returned 2xx in dev now returns 5xx in staging | Network diff between both environments |
| 🟡 Warning | Visual change > 0.5% pixels different between dev and staging screenshots | pixelmatch pixel-level comparison + diff overlay image |
| 🟡 Warning | New console error in staging that doesn't exist in dev | Console message diff |
| 🟡 Warning | New network request in staging — unexpected endpoint appeared | Network request URL diff |
| 🟡 Warning | Request present in dev is missing in staging — endpoint removed or broken | Network request URL diff |
| 🟡 Warning | API status changed between environments (any non-5xx change) | Network status diff |
| 🔵 Info | DOM structural change — element count differs between dev and staging | HTML tag count comparison across snapshots |
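
The network rows of this table can be read as a diff over two request lists. The sketch below is one possible shape for that diff, assuming each request is reduced to `method`, `path`, and `status`; the function name and the finding type strings are hypothetical. Any status change that is not a 2xx to 5xx regression is treated as a warning here, which is a simplification of the table's wording.

```javascript
function diffEnvironments(devRequests, stagingRequests) {
  const toMap = (reqs) => new Map(reqs.map((r) => [`${r.method} ${r.path}`, r.status]));
  const dev = toMap(devRequests);
  const staging = toMap(stagingRequests);
  const findings = [];

  for (const [endpoint, devStatus] of dev) {
    if (!staging.has(endpoint)) {
      findings.push({ endpoint, type: 'missing_in_staging', severity: 'warning' });
      continue;
    }
    const stagingStatus = staging.get(endpoint);
    if (stagingStatus === devStatus) continue;

    const regressed = devStatus >= 200 && devStatus < 300 && stagingStatus >= 500;
    findings.push({
      endpoint,
      type: regressed ? 'status_regressed' : 'status_changed',
      severity: regressed ? 'critical' : 'warning',
    });
  }

  for (const endpoint of staging.keys()) {
    if (!dev.has(endpoint)) {
      findings.push({ endpoint, type: 'new_in_staging', severity: 'warning' });
    }
  }
  return findings;
}
```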

What It Does

Argus watches your running application and automatically surfaces issues that test suites miss: visual regressions, API loops, CSS drift, console noise, and accessibility failures — all with screenshots delivered directly to Slack.

| Feature | Description |
| --- | --- |
| Error Detection | Crawls your app's routes; captures JS exceptions, console errors, failed API calls, redirect chains, and broken internal links |
| Environment Comparison | Diffs dev vs staging: screenshots, DOM structure, network requests, console errors |
| CSS Analysis | Detects cascade overrides, component style leaks, unused rules, React inline style conflicts |
| API Frequency Analysis | Flags endpoints called more than once per page load (double-fetch, missing useEffect deps, infinite loops) |
| Network Performance | slow_api > 1s/3s and large_payload > 500KB/2MB per API call |
| SEO Checks | Missing meta description, OG tags, canonical, viewport, h1 — DOM-inspected on every route |
| Security Checks | localStorage tokens, token-in-URL, eval(), sensitive console output, missing CSP/X-Frame-Options |
| Content Quality | null/undefined rendered text, lorem ipsum, broken images, empty data lists |
| Responsive Analysis | Overflow + touch target checks at 375/768px; screenshot grid at 4 breakpoints dispatched to Slack |
| Memory Leak Detection | V8 heap snapshot → detached DOM node count; heap growth across navigate-away + navigate-back |
| Runtime Anti-Patterns | Synchronous XHR, document.write, long tasks > 50ms, CORS violations, service worker registration failures, and missing cache headers on static assets — detected via script injection and post-load HEAD checks |
| Hover-State Bug Detection | Fires hover on every [aria-haspopup] and [data-tooltip] element; detects broken dropdowns and invisible tooltips that CSS :hover was supposed to reveal |
| Accessibility Snapshot Analysis | Calls take_snapshot then evaluate_script; flags interactive elements missing accessible names, unlabelled form controls, duplicate landmark regions, heading level skips, and aria-expanded buttons with missing/broken aria-controls |
| Keyboard Focus Analysis | Tab-walks every focusable element (up to 20 steps); detects focus_visible_missing (button/link with outline:0 and no box-shadow fallback — keyboard users cannot see where focus is) |
| Chrome DevTools Issues Panel | Queries list_console_messages({ types: ['issue'] }) for the Issues panel namespace, which is entirely separate from console.error; catches CSP violations and deprecated API usage (verified) — additional Chrome-surfaced types (CORS blocks, mixed content, cookie misconfiguration, low-contrast) are classified when present |
| Mobile CPU Throttling | Applies 4× CPU throttle (emulate_cpu({ throttlingRate: 4 })) during ≤768px responsive breakpoints — finds layout reflow and animation jank that only manifests under realistic mobile CPU pressure |
| Origin-Tagged Network Findings | All network error and timing findings carry origin: 'first-party' / 'third-party' so operators can triage critical first-party failures without digging through third-party CDN noise |
| Historical Baselines | Saves finding keys after each run; subsequent runs only alert on new issues; trend summary in Slack digest |
| Flakiness Detection | Crawls each route twice per run; findings in both runs are confirmed (original severity); findings in only one run are marked flaky (severity: info, :zap: _flaky_ label) |
| User Flow Assertions | Named multi-step flows (navigate/fill/click/press_key/drag/upload_file/waitFor/sleep/handle_dialog/assert) with baseline-sliced no_console_errors, no_network_errors, element_visible, url_contains, no_js_errors asserts — runs end-to-end user journeys without writing Playwright specs · Use typing: true on a fill step to dispatch real keyboard events via mcp.type_text (triggers input-event validation) · Use drag step to fire dragstart→dragover→drop sequences · Use upload_file step to deliver a local file to a file input via CDP ({ action: 'upload_file', selector: 'input[type=file]', filePath: '/path/to/file' }) |
| API Contract Validation | Define apiContracts[] in targets.js with inline schema or schemaFile; validates captured response bodies against JSON Schema (type, required, properties, items) — emits api_contract_violation warnings when shapes diverge from spec |
| Severity Policy Overrides | Define severityOverrides in targets.js ({ finding_type: 'info' / 'warning' / 'critical' / 'suppress' }); applied before Slack routing — remap or silence specific detections without touching analyzer code |
| Auth Token Refresh | refreshSession() is called before each route; re-runs the login flow when the saved session has less than sessionRefreshWindowMs (default 5 min) remaining — prevents long crawls from failing mid-run when the auth cookie expires |
| Slack-optional mode | When SLACK_BOT_TOKEN is not configured, Argus skips Slack entirely and auto-generates a local report.html (all findings + inline screenshots) and opens it in the default browser — zero setup required to start using Argus |
| Codebase Cross-Reference | Points ARGUS_SOURCE_DIR at your app source to detect: missing env vars (process.env.X used in code but absent from .env), feature flag leakage (conditional env var that is falsy/unset), console error stack traces resolved to file:line, and internal links that return 404 — all without opening a browser |
| GitHub PR Integration | Posts a structured Markdown findings table as a PR comment (updates in-place — one comment per PR, no spam); sets an argus-qa commit status check (failure when new criticals exist, success otherwise) — blocks merge via branch protection when regressions are introduced. Requires GITHUB_TOKEN + GITHUB_REPOSITORY env vars |
| Auto Route Discovery | Augments manual routes[] with paths from three sources: fetches /sitemap.xml (follows one sitemap-index level, 10s timeout), scans Next.js pages/ (Next 12) and app/ (Next 13+) directories stripping route groups (auth), and greps JS/TS source for React Router <Route path> declarations. Dynamic [param] segments are skipped — no concrete URL to crawl. Manual route config (critical, waitFor) always takes precedence. |
| argus init Setup Wizard | npm run init (or npx argus init) guides first-time setup: collects target URLs, detects the app framework (Next.js / React Router / unknown) from the source directory's package.json, runs C3 route discovery against the dev URL, prompts for optional Slack tokens and GitHub credentials, then writes a populated .env and a pre-filled src/config/targets.js — zero manual config editing required. |
| Watch Mode | npm run watch attaches to whatever Chrome tab is open and polls list_console_messages + list_network_requests every 1 s (configurable via ARGUS_WATCH_INTERVAL_MS). Reports new console errors, network failures (4xx/5xx), CORS blocks, and auth failures in real time — without navigating. On Ctrl+C, generates a final reports/report.html. No route config needed. |
| Full Lighthouse Suite | All 4 Lighthouse categories (performance, SEO, best-practices, accessibility) with per-audit items |
| Performance Budgets | Enforces LCP < 2500ms, CLS < 0.1, FID < 100ms, TTFB < 800ms per route |
| Slack Notifications | Rich Block Kit reports with inline screenshots routed to #bugs-critical, #bugs-warnings, #bugs-digest |
| Slash Command | /argus-retest <url> triggers an on-demand test from any Slack channel |
| CI Integration | GitHub Actions workflow runs daily at 6 AM UTC and on every push to main |
| MCP Server (AI-callable Argus) | Register Argus as an MCP server via .mcp.json; Claude (or any MCP client) can call argus_audit, argus_audit_full, argus_compare, argus_last_report, argus_watch_snapshot, and argus_get_context directly from a conversation — no CLI, no terminal required. Published to npm as argusqa-os — add via { "command": "npx", "args": ["-y", "argusqa-os"] } in .mcp.json |
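
As an illustration of the Severity Policy Overrides row, the remap-or-suppress step could look roughly like the sketch below. The function name is hypothetical and the finding shape (`type`, `severity`) is assumed; only the override values ('info', 'warning', 'critical', 'suppress') come from this README.

```javascript
// Applies a { finding_type: override } map before routing.
// 'suppress' drops the finding; any other value replaces its severity.
function applySeverityOverrides(findings, severityOverrides = {}) {
  return findings
    .filter((f) => !(Object.hasOwn(severityOverrides, f.type) && severityOverrides[f.type] === 'suppress'))
    .map((f) =>
      Object.hasOwn(severityOverrides, f.type)
        ? { ...f, severity: severityOverrides[f.type] }
        : f,
    );
}
```

With `{ css_unused_rules: 'suppress', slow_api: 'critical' }` (example type names), unused-rule findings would disappear and every slow_api finding would be routed as critical.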

Works with React + SCSS, CSS Modules, CSS-in-JS (styled-components / emotion), and plain HTML/CSS apps.


How It Works

Three components run against the same Chrome instance:

Claude Code (Terminal / VS Code)
  ├── MCP Protocol → Chrome DevTools MCP Server → Chrome
  └── Writes → Orchestration Layer → Slack Bot API
  • Chrome DevTools MCP Server — programmatic access to Chrome: network traffic, console, screenshots, DOM, performance traces
  • Claude Code — orchestration hub: reads codebase, drives the MCP tools, classifies findings, posts to Slack
  • Slack Bot (BugBot) — receives reports, exposes /argus-retest slash command, handles Acknowledge / Retest button actions

In interactive mode (running from Claude Code), MCP tools are called natively. In CI mode (GitHub Actions), src/utils/mcp-client.js spawns chrome-devtools-mcp as a child process and communicates via JSON-RPC over stdio.
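
For readers unfamiliar with the stdio transport, the sketch below shows what a single tool call looks like on the wire: one JSON-RPC 2.0 object per line, using MCP's `tools/call` method. `encodeToolCall` is a hypothetical helper, not the API of src/utils/mcp-client.js, and a real client also has to complete the MCP `initialize` handshake before calling tools.

```javascript
// Serializes one MCP tool call as a newline-delimited JSON-RPC message,
// ready to be written to the child process's stdin.
function encodeToolCall(id, toolName, args = {}) {
  const message = {
    jsonrpc: '2.0',
    id,
    method: 'tools/call',
    params: { name: toolName, arguments: args },
  };
  // JSON.stringify without indentation emits no raw newlines, so the
  // trailing "\n" is the only line break in the frame.
  return JSON.stringify(message) + '\n';
}
```

For example, `encodeToolCall(1, 'list_console_messages')` yields one line that could be written to the stdin of a spawned `npx -y chrome-devtools-mcp@latest` process; the response arrives on stdout as a JSON-RPC object carrying the same `id`.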


Prerequisites

| Requirement | Version | Notes |
| --- | --- | --- |
| Node.js | v20.19+ | Required by Chrome DevTools MCP |
| Chrome | Stable (current) | Must be installed |
| Claude Code | Latest | npm install -g @anthropic-ai/claude-code |
| Slack workspace | | Optional — only needed if you want Slack reports. Without it, Argus generates a local report.html instead |

One-Time Setup

Option A — MCP Server (Claude Code / any MCP client)

No local install required. npx auto-downloads argusqa-os on first use.

1. Register both MCP servers

Add to .mcp.json in your project root:

{
  "mcpServers": {
    "chrome-devtools": {
      "command": "npx",
      "args": ["-y", "chrome-devtools-mcp@latest"]
    },
    "argus": {
      "command": "npx",
      "args": ["-y", "argusqa-os"]
    }
  }
}

Or via Claude Code CLI:

claude mcp add chrome-devtools -- npx -y chrome-devtools-mcp@latest
claude mcp add argus -- npx -y argusqa-os

2. Environment variables

Create a .env file in your project root:

TARGET_DEV_URL=http://localhost:3000
TARGET_STAGING_URL=https://staging.yourapp.com   # optional — enables argus_compare

3. Start Chrome with remote debugging

# macOS
open -a "Google Chrome" --args --remote-debugging-port=9222 --headless=new

# Windows
"C:\Program Files\Google\Chrome\Application\chrome.exe" --remote-debugging-port=9222 --headless=new --no-sandbox --disable-gpu

# Linux
google-chrome --remote-debugging-port=9222 --headless=new --no-sandbox

4. Slack notifications (optional)

Skip to use local report.html mode — Argus generates a self-contained HTML report when Slack is not configured.

  1. api.slack.com/apps → Create New App → name it BugBot
  2. OAuth & Permissions → Bot Token Scopes: chat:write, files:write, files:read
  3. Install to workspace → copy Bot User OAuth Token (xoxb-...) to .env as SLACK_BOT_TOKEN
  4. Create #bugs-critical, #bugs-warnings, #bugs-digest and run /invite @BugBot in each

Add the bot token and the channel IDs to .env:

SLACK_BOT_TOKEN=xoxb-...
SLACK_CHANNEL_CRITICAL=C0000000000
SLACK_CHANNEL_WARNINGS=C0000000001
SLACK_CHANNEL_DIGEST=C0000000002

Option B — npm Package (dev dependency / CI/CD)

1. Install

npm install --save-dev argusqa-os

2. Environment variables

Run the interactive wizard to auto-generate .env and src/config/targets.js:

npx argus init

The wizard detects your framework (Next.js / React Router), discovers routes from sitemap.xml and your file structure, and optionally collects Slack and GitHub credentials.

Alternative — manual setup: Create a .env with TARGET_DEV_URL and optionally TARGET_STAGING_URL.

3. Start Chrome with remote debugging

Same as Option A — see above.

4. Slack notifications (optional)

Same as Option A — see above.


Option C — Clone the Repository (full source / contributors)

1. Clone and install

git clone https://github.com/ironclawdevs27/Argus.git
cd Argus
npm install
npm run setup   # creates reports/ directory

2. Environment variables

Recommended — use the interactive setup wizard:

npm run init

Alternative — manual setup:

cp .env.example .env

Open .env and fill in:

TARGET_DEV_URL=http://localhost:3000
TARGET_STAGING_URL=https://staging.yourapp.com   # leave blank → CSS-only analysis mode

# Slack — OPTIONAL. Omit to get a local report.html instead.
# SLACK_BOT_TOKEN=xoxb-...
# SLACK_SIGNING_SECRET=...
# SLACK_CHANNEL_CRITICAL=C0000000000
# SLACK_CHANNEL_WARNINGS=C0000000001
# SLACK_CHANNEL_DIGEST=C0000000002

3. Configure routes

If you ran npm run init — skip this step.

Otherwise, edit src/config/targets.js:

export const routes = [
  { path: '/',          name: 'Home',      critical: true,  waitFor: 'main' },
  { path: '/login',     name: 'Login',     critical: true,  waitFor: 'form' },
  { path: '/dashboard', name: 'Dashboard', critical: true,  waitFor: '[data-testid="dashboard"]' },
  { path: '/settings',  name: 'Settings',  critical: false, waitFor: null },
];
  • critical: true — errors on this route go to #bugs-critical
  • waitFor — CSS selector Argus waits for before capturing (signals the page is ready)

4. Connect Chrome DevTools MCP to Claude Code

claude mcp add chrome-devtools -- npx -y chrome-devtools-mcp@latest

Verify — ask Claude: "List all open Chrome pages" — you should see your tabs.

5. Start Chrome with remote debugging

Same as Option A — see above.

6. Slack notifications (optional)

Same as Option A — see above.


Running Argus

Option A — Via MCP (Claude Code / any MCP client)

Ask Claude directly — no terminal needed.

Available tools:

| Tool | What it does |
| --- | --- |
| argus_audit | Fast QA pass — JS errors, network failures, accessibility, SEO, security, CSS, content |
| argus_audit_full | Deep QA pass — adds Lighthouse, responsive layout checks across 4 viewports, memory leak detection, hover-state bug detection, and accessibility tree snapshot |
| argus_compare | Diff dev vs staging — screenshots, findings delta, environment regressions |
| argus_last_report | Return the last saved JSON report without re-running a scan |
| argus_watch_snapshot | Snapshot the currently open Chrome tab without navigating — raw console + network capture |
| argus_get_context | Capture everything broken on the open tab, formatted as a diagnostic context for Claude to diagnose and suggest fixes |

argus_audit — fast audit of any URL:

Run argus_audit on http://localhost:3000/checkout
Run argus_audit on http://localhost:3000/login with critical: true

argus_audit_full — deep audit with Lighthouse + memory + responsive checks:

Run argus_audit_full on http://localhost:3000/dashboard

argus_compare — dev vs staging diff (reads TARGET_DEV_URL and TARGET_STAGING_URL from .env):

Run argus_compare

argus_last_report — retrieve last audit without re-running Chrome:

Run argus_last_report

argus_watch_snapshot — snapshot the currently open tab without navigating. Useful when the page is in an authenticated or post-interaction state that navigation would reset:

Run argus_watch_snapshot
Run argus_watch_snapshot with url: http://localhost:3000

argus_get_context — when your app is stuck or throwing errors, run this to capture everything that's broken and feed it to Claude for diagnosis:

Run argus_get_context

Then follow with: "Here's the context — what's causing these errors and how do I fix them?"


Option B & C — Via CLI / npm scripts

Available commands:

| Command | What it does |
| --- | --- |
| npm run crawl | Multi-page batch audit of all routes in targets.js |
| npm run compare | Dev vs staging diff (or CSS analysis if no TARGET_STAGING_URL) |
| npm run watch | Passive monitor — polls the open Chrome tab every 1s, no navigation |
| npm run report:html | Generate reports/report.html from the latest JSON audit |
| npm run server | Start the Slack slash command + interaction server (port 3001) |
| npm run init | Interactive setup wizard — generates .env + targets.js |
| npm run test:unit | Run 61 unit tests (no Chrome required) |
| npm run test:harness | Run 82-block correctness harness (requires Chrome) |

npm run crawl — full audit of all configured routes:

npm run crawl

Reports are saved to reports/ as JSON files. Run npm run report:html after any crawl for a portable reports/report.html with all screenshots inlined — useful for sharing with designers or reviewing offline.

npm run compare — dev vs staging diff:

npm run compare

When TARGET_STAGING_URL is not set, automatically switches to CSS analysis mode — cascade overrides, component style leaks, unused rules, and React inline style conflicts on the dev environment only.

npm run watch — passive monitoring (polls every 1s, no navigation):

Attaches to whatever Chrome tab is open and reports new issues in real time without navigating anywhere. Use this while developing.

Requires 2 terminals:
  Terminal 1 — your app (npm start / npm run dev)
  Terminal 2 — npm run watch

Steps:

  1. Open Chrome and navigate to your app
  2. Terminal 1: start your application
  3. Terminal 2: npm run watch — Argus begins polling
  4. Develop normally — console errors, network failures (4xx/5xx), CORS blocks, and auth failures print in real time
  5. Ctrl+C — stops the monitor and writes reports/report.html

# Attribute findings to a specific URL:
npm run watch http://localhost:4000

| Variable | Default | Description |
| --- | --- | --- |
| ARGUS_WATCH_INTERVAL_MS | 1000 | Poll interval in milliseconds |
| TARGET_DEV_URL | http://localhost:3000 | URL attributed to findings when none passed |

npm run report:html — generate HTML dashboard from last audit:

npm run report:html
# → reports/report.html (all findings + inline screenshots, portable, no server needed)

Option D — From Slack (on-demand)

/argus-retest https://staging.yourapp.com/checkout

BugBot responds immediately, runs the test, and posts results back. Detailed bug reports go to #bugs-critical. See Slack Slash Command Setup for configuration.


CSS Analysis Mode

When TARGET_STAGING_URL is not set in .env, npm run compare automatically switches to CSS analysis mode instead of comparing two environments.
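
The mode switch described above can be pictured as a small decision on the environment. This is a minimal sketch under stated assumptions: selectCompareMode and the mode labels are hypothetical names, and the real logic in env-comparison.js may differ.

```javascript
// Hypothetical sketch: choose the compare mode from environment variables.
function selectCompareMode(env) {
  const staging = (env.TARGET_STAGING_URL ?? '').trim();
  // A blank or missing staging URL falls back to CSS analysis on dev only.
  return staging === ''
    ? { mode: 'css-analysis', urls: [env.TARGET_DEV_URL] }
    : { mode: 'env-comparison', urls: [env.TARGET_DEV_URL, staging] };
}

const cssOnly = selectCompareMode({ TARGET_DEV_URL: 'http://localhost:3000' });
const twoEnvs = selectCompareMode({
  TARGET_DEV_URL: 'http://localhost:3000',
  TARGET_STAGING_URL: 'https://staging.yourapp.com',
});
console.log(cssOnly.mode, twoEnvs.mode);
```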

What it analyzes on your dev environment:

| Check | What it catches |
| --- | --- |
| Cascade overrides | Same CSS property declared multiple times on an element; !important flagged as warning |
| Component style leaks | BEM selector (.card__title) found in a stylesheet that doesn't belong to that component |
| Unused rules | CSS selectors that match no element on the current page |
| CSS Modules | Detects hashed class names; extracts readable component names (Button, Card, etc.) |
| React inline style conflicts | style="" attribute overriding a stylesheet declaration on the same element |
| SCSS source maps | Traces compiled CSS back to original .scss files where source maps are available |
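
To make the component style leak check concrete, here is one way such a check could work. It is a sketch only: findBemLeaks is a hypothetical function, and the ownership rule (a stylesheet belongs to a component when its file name starts with the BEM block name) is an assumption, not Argus's documented heuristic.

```javascript
// Hypothetical sketch of a component style-leak check based on BEM naming.
// Assumption: a stylesheet "belongs" to a component when its file name
// starts with the BEM block name (card.css or card-styles.css for .card__title).
function findBemLeaks(stylesheetName, selectors) {
  // Drop the extension: "button-styles.css" -> "button-styles".
  const fileBase = stylesheetName.toLowerCase().replace(/\.[^.]+$/, '');
  const leaks = [];
  for (const selector of selectors) {
    // Capture the block part of .block__element or .block--modifier.
    const match = /^\.([a-z0-9]+(?:-[a-z0-9]+)*)(?:__|--)/i.exec(selector);
    if (!match) continue; // not a BEM element/modifier selector
    const block = match[1].toLowerCase();
    if (!fileBase.startsWith(block)) {
      leaks.push({ selector, block, stylesheet: stylesheetName });
    }
  }
  return leaks;
}

const leaks = findBemLeaks('button-styles.css', [
  '.button--primary',
  '.card__title',
  '.btn',
]);
console.log(leaks);
```

In this example only .card__title is flagged, because its block name (card) does not match the button stylesheet.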

API frequency analysis also runs automatically:

| Call count | Severity | Likely cause |
| --- | --- | --- |
| 2 calls | info | Possible prefetch + actual — verify intentional |
| 3–4 calls | warning | Double-fetch — check useEffect deps or component re-mounts |
| 5+ calls | critical | Runaway loop — missing cleanup, infinite re-render |
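
The severity tiers above translate directly into a small classifier. The sketch below assumes requests are grouped by method and URL; classifyCallCount and analyzeApiFrequency are hypothetical names, and the real grouping key may differ.

```javascript
// Map a per-endpoint call count to the severity tiers in the table above.
function classifyCallCount(count) {
  if (count >= 5) return 'critical'; // runaway loop
  if (count >= 3) return 'warning';  // double-fetch
  if (count === 2) return 'info';    // possible prefetch + actual
  return null;                       // a single call is not reported
}

// Assumption: requests are grouped by "METHOD url".
function analyzeApiFrequency(requests) {
  const counts = new Map();
  for (const { method, url } of requests) {
    const key = `${method} ${url}`;
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  const findings = [];
  for (const [endpoint, count] of counts) {
    const severity = classifyCallCount(count);
    if (severity) findings.push({ endpoint, count, severity });
  }
  return findings;
}

const sample = [
  { method: 'GET', url: '/api/user' },
  { method: 'GET', url: '/api/user' },
  { method: 'GET', url: '/api/user' },
  { method: 'GET', url: '/api/cart' },
];
console.log(analyzeApiFrequency(sample));
```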

Performance Budgets

Argus enforces these thresholds on every crawl:

| Metric | Threshold | Severity |
| --- | --- | --- |
| LCP (Largest Contentful Paint) | < 2500ms | warning |
| CLS (Cumulative Layout Shift) | < 0.1 | warning |
| FID / TBT (interaction latency) | < 100ms | warning |
| TTFB (Time to First Byte) | < 800ms | warning |

Violations are reported as individual warning bugs with the measured value.
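
A budget check along these lines could look like the following sketch. The metric keys, the checkBudgets name, and the finding shape are assumptions for illustration; the thresholds are the ones listed above, and a value equal to the threshold is treated as a violation because the budget is a strict "less than".

```javascript
// Thresholds from the table above. Metric keys are hypothetical names.
const BUDGETS = [
  { metric: 'LCP', limit: 2500, unit: 'ms' },
  { metric: 'CLS', limit: 0.1, unit: '' },
  { metric: 'TBT', limit: 100, unit: 'ms' },
  { metric: 'TTFB', limit: 800, unit: 'ms' },
];

// Return one warning finding per violated budget, including the measured value.
function checkBudgets(measured) {
  return BUDGETS
    // Skip metrics that were not measured on this run.
    .filter(({ metric }) => Number.isFinite(measured[metric]))
    // The budget is "value < limit", so reaching the limit is a violation.
    .filter(({ metric, limit }) => measured[metric] >= limit)
    .map(({ metric, limit, unit }) => ({
      severity: 'warning',
      metric,
      value: measured[metric],
      message: `${metric} measured ${measured[metric]}${unit}, budget is < ${limit}${unit}`,
    }));
}

const violations = checkBudgets({ LCP: 3100, CLS: 0.05, TTFB: 800 });
console.log(violations.map((v) => v.message));
```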


Lighthouse Suite

Runs all four Lighthouse categories on every route:

  • Accessibility — score < 50 → critical; score < 90 → warning
  • Performance — score < 90 → warning
  • SEO — score < 90 → warning
  • Best Practices — score < 90 → warning

Individual failing audit items (e.g., missing alt text, low contrast, render-blocking resources) are surfaced as separate findings alongside the category score.
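
The category thresholds above amount to a two-rule classifier. This sketch assumes scores on a 0 to 100 scale and lower-case category keys; classifyLighthouseScore is a hypothetical name.

```javascript
// Map a Lighthouse category score (assumed 0-100) to a severity, or null if it passes.
function classifyLighthouseScore(category, score) {
  // Only accessibility has a critical tier.
  if (category === 'accessibility' && score < 50) return 'critical';
  if (score < 90) return 'warning';
  return null;
}

const scores = { accessibility: 42, performance: 71, seo: 95, 'best-practices': 89 };
for (const [category, score] of Object.entries(scores)) {
  console.log(category, score, classifyLighthouseScore(category, score));
}
```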


Slack Channel Routing

Slack is optional. When SLACK_BOT_TOKEN is not set, Argus skips Slack entirely and auto-generates a local report.html (all findings + inline screenshots) and opens it in the default browser. No Slack setup needed to start using Argus.

When Slack is configured, findings are routed by severity:

| Severity | Channel | When |
| --- | --- | --- |
| critical | #bugs-critical | JS exceptions, HTTP 5xx, blank page, auth failure, API called 5+ times, Lighthouse accessibility < 50, auth token in storage/URL, responsive overflow, slow API > 3s, payload > 2MB, > 100 detached DOM nodes, CORS policy violations, debugger; statements in production code, blocked mixed content (HTTP resource on HTTPS page) |
| warning | #bugs-warnings | Visual regression > 0.5%, HTTP 4xx, CSS overrides with !important, API called 3–4×, Lighthouse scores < 90, missing SEO/OG tags, missing security headers, placeholder content, touch targets too small, slow API > 1s, payload > 500KB, > 10 detached DOM nodes, redirect chains > 2 hops, broken links, sync XHR, document.write, long tasks > 50ms, SW registration failures, duplicate id attributes, passive mixed content (images/audio on HTTPS page) |
| info | #bugs-digest | Console warnings, unused CSS rules, API summaries, CSS Modules detection, empty data lists, responsive screenshot grid, missing cache headers on static assets |
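
The routing in the table above could be sketched as a lookup from severity to the channel variables in .env. resolveChannel is a hypothetical name, and returning null when Slack is not configured is an assumption standing in for the report.html fallback.

```javascript
// Severity -> environment variable that holds the Slack channel ID.
const CHANNEL_ENV_BY_SEVERITY = {
  critical: 'SLACK_CHANNEL_CRITICAL',
  warning: 'SLACK_CHANNEL_WARNINGS',
  info: 'SLACK_CHANNEL_DIGEST',
};

// Hypothetical sketch: pick the channel for a finding, or null when Slack is off.
function resolveChannel(severity, env) {
  // No bot token means Slack is skipped; the caller would fall back to report.html.
  if (!env.SLACK_BOT_TOKEN) return null;
  const key = CHANNEL_ENV_BY_SEVERITY[severity];
  if (!key) throw new Error(`Unknown severity: ${severity}`);
  return env[key] ?? null;
}

const slackEnv = {
  SLACK_BOT_TOKEN: 'xoxb-example',
  SLACK_CHANNEL_CRITICAL: 'C0000000000',
  SLACK_CHANNEL_WARNINGS: 'C0000000001',
  SLACK_CHANNEL_DIGEST: 'C0000000002',
};
console.log(resolveChannel('critical', slackEnv), resolveChannel('info', slackEnv));
```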

Each message includes:

  • Severity badge + affected URL + timestamp
  • AI-generated description
  • Inline screenshot (uploaded directly to Slack — no external hosting)
  • View Page, Acknowledge, and Retest action buttons

Slack Slash Command Setup

To use /argus-retest from Slack, you need to expose the Argus server publicly.

Step 1 — Start the server

npm run server

Server runs on port 3001.

Step 2 — Expose with Cloudflare Tunnel

Download cloudflared (free, no account needed), then:

cloudflared tunnel --url http://localhost:3001

Alternatively, with no install at all (SSH tunnel):

ssh -R 80:localhost:3001 [email protected]

Copy the public HTTPS URL that appears.

Step 3 — Configure Slack App

  1. api.slack.com/apps → BugBot → Slash Commands → Create New Command:

    • Command: /argus-retest
    • Request URL: https://your-public-url/slack/commands
    • Description: Run Argus regression test on a URL
    • Usage hint: <url>
  2. Interactivity & Shortcuts → Enable → Request URL: https://your-public-url/slack/interactions

  3. OAuth & Permissions → Reinstall to Workspace

Step 4 — Test

/argus-retest http://localhost:3000

BugBot should reply within 3 seconds with a "running" acknowledgement, then post results.


GitHub Actions CI Setup

Add secrets to your repository

Go to GitHub repo → Settings → Secrets and variables → Actions → add:

| Secret name | Required | Value |
| --- | --- | --- |
| SLACK_BOT_TOKEN | No | Your xoxb-... token. Omit entirely to use Slack-optional mode — Argus generates report.html instead |
| SLACK_SIGNING_SECRET | No* | From Slack App → Basic Information (only needed for /argus-retest slash command) |
| SLACK_CHANNEL_CRITICAL | No* | Channel ID (required when Slack is configured) |
| SLACK_CHANNEL_WARNINGS | No* | Channel ID (required when Slack is configured) |
| SLACK_CHANNEL_DIGEST | No* | Channel ID (required when Slack is configured) |
| TARGET_STAGING_URL | Yes | Your staging base URL |
| GITHUB_TOKEN | No | For C2 PR integration — auto-injected by GitHub Actions as secrets.GITHUB_TOKEN |
| GITHUB_REPOSITORY | No | For C2 PR integration — owner/repo format (e.g., acme/my-app) |

C2 PR integration: when GITHUB_TOKEN and GITHUB_REPOSITORY are set, Argus posts a PR comment and commit status check for every crawl. GITHUB_PR_NUMBER is injected automatically by the workflow from github.event.pull_request.number. The included workflow does not wire these up by default — add them to the env: block in .github/workflows/argus.yml if you want PR-level comments.

The workflow at .github/workflows/argus.yml runs:

  • On every push to main / master
  • Daily at 6 AM UTC (before the team starts work)
  • Manually via Actions → Run workflow (with optional URL override)

If critical issues are found, the pipeline fails, so regressions cannot slip through unnoticed.
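
The fail-on-critical behaviour can be pictured as deriving a process exit code from the report. This is a sketch under stated assumptions: the report shape (a findings array with a severity field) and the exitCodeForReport name are hypothetical.

```javascript
// Hypothetical sketch: a non-zero exit code fails the CI job.
// Assumption: the report exposes a findings array with a severity field.
function exitCodeForReport(report) {
  const criticalCount = (report.findings ?? [])
    .filter((finding) => finding.severity === 'critical').length;
  return criticalCount > 0 ? 1 : 0;
}

const cleanRun = { findings: [{ severity: 'warning' }, { severity: 'info' }] };
const brokenRun = { findings: [{ severity: 'critical' }, { severity: 'info' }] };
console.log(exitCodeForReport(cleanRun), exitCodeForReport(brokenRun));
```

A CI entry point could then set process.exitCode from this value, which lets pending report writes finish before the process ends.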


Project Structure

argus/
├── .env                              # Your secrets (never commit this)
├── .env.example                      # Template — copy to .env
├── .gitignore
├── package.json
├── README.md
├── .claude/
│   └── settings.json                 # Claude Code permission config (auto-approve node/npm/reports)
├── .github/
│   └── workflows/
│       └── argus.yml                 # CI pipeline
├── .vscode/
│   └── mcp.json                      # Chrome DevTools MCP config for VS Code
├── .mcp.json                         # Argus MCP server registration — exposes argus_audit/argus_audit_full/argus_compare/argus_last_report to Claude
├── src/
│   ├── argus.js                      # Single-page audit entry point
│   ├── batch-runner.js               # Multi-page batch audit
│   ├── mcp-server.js                 # Argus MCP server — argus_audit / argus_audit_full / argus_compare / argus_last_report
│   ├── adapters/
│   │   └── browser.js                # CdpBrowserAdapter — facade over all chrome-devtools-mcp calls
│   ├── domain/
│   │   └── finding.js                # createFinding() factory — canonical finding shape
│   ├── registry.js                   # Analyzer plugin registry — registerExpensive/getCheap/getExpensive
│   ├── config/
│   │   ├── targets.js                # Routes to test, thresholds, config
│   │   └── schema.js                 # Zod validation schema; validateConfig() called inside runCrawl()
│   ├── orchestration/
│   │   ├── crawl-and-report.js       # Backward-compat re-export shell → orchestrator + report-processor + dispatcher
│   │   ├── orchestrator.js           # Crawl loop, route/flow crawl, runCrawl()
│   │   ├── report-processor.js       # Dedup → severity overrides → baseline → JSON write
│   │   ├── dispatcher.js             # Slack / GitHub / HTML dispatch
│   │   ├── env-comparison.js         # Dev vs staging diff + CSS analysis mode
│   │   ├── watch-mode.js             # Passive browser monitoring (WatchSession + runWatchMode)
│   │   └── slack-notifier.js         # Slack Block Kit dispatcher
│   ├── server/
│   │   ├── index.js                  # Express server (port 3001)
│   │   ├── slash-command-handler.js  # /argus-retest handler
│   │   └── interaction-handler.js    # Acknowledge + Retest button handler
│   ├── utils/
│   │   ├── css-analyzer.js           # CSS analysis script injected into the browser
│   │   ├── seo-analyzer.js           # SEO checks: meta, OG tags, h1, canonical, viewport
│   │   ├── security-analyzer.js      # Security: localStorage tokens, eval(), headers, cookies
│   │   ├── content-analyzer.js       # Content quality: null text, placeholders, broken images
│   │   ├── responsive-analyzer.js    # Responsive: overflow + touch targets at 4 breakpoints
│   │   ├── memory-analyzer.js        # Memory leaks: V8 heap snapshot + heap growth
│   │   ├── logger.js                 # Pino structured logger — childLogger(module)
│   │   ├── retry.js                  # withRetry() exponential backoff — navigate/fill only; Number.isFinite guard
│   │   ├── telemetry.js              # OTel tracing + metrics — startSpan() / recordFinding() / recordFlaky() / recordNewFindings(); no-op default
│   │   ├── session-manager.js        # Auth: backward-compat re-export barrel
│   │   ├── session-persistence.js    # Auth: saveSession (mkdirSync+atomic write), restoreSession, hasSession, clearSession
│   │   ├── login-orchestrator.js     # Auth: runLoginFlow, refreshSession + lock file
│   │   ├── baseline-manager.js       # Baselines: loadBaseline, saveBaseline, applyBaseline, appendTrend
│   │   ├── flakiness-detector.js     # Flakiness: mergeRunResults — confirmed vs flaky per double-crawl
│   │   ├── flow-runner.js            # User flow assertions: runFlow / runAllFlows — assert DSL
│   │   ├── html-reporter.js          # HTML dashboard: generateHtmlReport() + npm run report:html (D7.1 / D7.7)
│   │   ├── parallel-crawler.js       # chunkArray sharding utility (ARGUS_CONCURRENCY=N parallel crawl)
│   │   ├── contract-validator.js     # API contract validation: validateSchema, matchesContract (D7.4)
│   │   ├── severity-overrides.js     # Severity policy overrides: applyOverrides (D7.5)
│   │   ├── slack-guard.js            # Slack-optional guard: isSlackConfigured() (D7.7)
│   │   ├── hover-analyzer.js         # Hover-state bug detection — aria-haspopup + data-tooltip (D8.1)
│   │   ├── snapshot-analyzer.js      # Accessibility tree snapshot — missing names, labels, landmarks, heading hierarchy, ARIA state (D8.2 + v6)
│   │   ├── issues-analyzer.js        # Chrome DevTools Issues panel — CSP/deprecated/cookie issues
│   │   ├── network-timing-analyzer.js # HAR timing analysis — slow third-party detection
│   │   ├── keyboard-analyzer.js      # Keyboard Tab-walk — focus_visible_missing, focus_lost
│   │   ├── codebase-analyzer.js      # Codebase cross-reference — env vars, feature flags, dead routes (C1)
│   │   ├── github-reporter.js        # GitHub PR comment + commit status integration (C2)
│   │   ├── route-discoverer.js       # Auto route discovery — sitemap + Next.js + React Router (C3)
│   │   ├── diff.js                   # pixelmatch screenshot + DOM/network diff utilities
│   │   ├── mcp-parsers.js            # Text-format parsers for list_console_messages + list_network_requests (v9)
│   │   └── mcp-client.js             # Headless JSON-RPC MCP client for CI mode
│   └── cli/
│       └── init.js                   # argus init setup wizard — detect framework, discover routes, write .env + targets.js (C4)
├── test/
│   └── unit/                         # Vitest unit tests — no Chrome required
│       ├── finding.test.js           # createFinding() — fields, throws, frozen, extra fields (8 tests)
│       ├── config-schema.test.js     # validateConfig() + ConfigSchema.safeParse (8 tests)
│       ├── report-processor.test.js  # deduplicateFindings + rebuildSummary (11 tests)
│       ├── flakiness-detector.test.js # findingKey normalization + mergeRunResults (13 tests)
│       ├── baseline-manager.test.js  # loadBaseline/saveBaseline/applyBaseline (9 tests)
│       └── flow-runner.test.js       # normalizeArray (pure) + runFlow mock browser (11 tests)
├── landing/                          # Product landing page (React 18 + Vite + Tailwind + Framer Motion)
│   ├── src/
│   │   ├── App.jsx                   # Single-page app — hero, features, comparison, waitlist + enterprise modals
│   │   └── supabase.js               # Supabase client factory (null-safe when env vars missing)
│   ├── public/
│   │   ├── favicon.svg               # SVG favicon — purple ring + dot
│   │   ├── argus-poster.png          # Video poster fallback (1918×1078)
│   │   ├── og-image-v2.jpg           # OG social card — 1200×630 JPEG, branded overlay, black-outlined stat numbers
│   │   ├── robots.txt                # Allows all crawlers; Sitemap reference
│   │   └── sitemap.xml               # Canonical URL for argus-qa.com/
│   ├── index.html                    # Vite entry; OG/Twitter/JSON-LD SEO tags; canonical; favicon
│   ├── package.json
│   ├── .env.example                  # VITE_SUPABASE_URL + VITE_SUPABASE_ANON_KEY template
│   └── README.md                     # Setup guide, Supabase SQL schema, env vars, deployment
├── scripts/
│   └── dispatch-report.js            # Standalone Slack re-dispatch script (re-posts last report.json to Slack)
├── test-harness/                     # Fixture server + test runner (82 blocks, 348 hard assertions, 54 fixture pages)
│   ├── README.md
│   ├── server.js                     # Express fixture server (ports 3100 dev / 3101 staging)
│   ├── harness-config.js             # Route definitions + expected findings
│   ├── validate.js                   # Test runner — 82 numbered blocks ([80] MCP server, [81] createFinding, [82] withRetry)
│   ├── pages/                        # 54 fixture pages (one per detection category)
│   ├── nextjs-fixture/               # Next.js app structure for C3 discovery tests (10 files)
│   ├── source-fixture/               # Minimal app.js for C1 codebase-analyzer tests (env var audit)
│   └── static/
│       └── button-styles.css         # BEM card selectors in button file → component leak
└── reports/                          # Output: JSON reports + screenshots (gitignored)
    ├── baselines/
    │   ├── <branch>.json             # Per-route finding keys — per git branch (D7.2)
    │   └── <branch>-trends.json      # Append-only run history per branch (D7.2)
    └── .gitkeep

Key Technical Decisions

| Decision | Choice | Reason |
| --- | --- | --- |
| Screenshot comparison | pixelmatch + AI classification | pixelmatch is fast and deterministic; Claude removes false positives from anti-aliasing and dynamic content |
| Slack API | Bot API, not Incoming Webhooks | Bot API supports file uploads, message updates, interactive buttons, and threads |
| File uploads | files.getUploadURLExternal + PUT + files.completeUploadExternal | files.upload is deprecated; pre-signed URL requires PUT — POST silently produces broken files |
| CSS analysis | Script injected via evaluate_script | Runs in page context so it sees the live computed styles, CSS Modules hashes, and React fiber properties |
| Responsive viewport | emulate (not resize_page) | resize_page only resizes the browser window and does not update CSS viewport width — emulate is the correct API |
| Viewport width measurement | document.documentElement.clientWidth | After emulate with mobile flag, window.innerWidth returns the legacy layout viewport (~952px), not the device width |
| V8 heap snapshot | take_memory_snapshot({ filePath }) → read from disk | The MCP tool writes JSON to disk (not inline); parse with JSON.parse(fs.readFileSync(filePath)) then delete the temp file |
| Detached DOM detection | Walk flat nodes array for "Detached " prefix in strings table | Chrome serializes detached elements as "Detached HTMLDivElement" etc.; secondary check on detachedness === 2 (Chrome 90+) |
| Baseline finding key | type::message[:100]::status | Excludes timestamps and dynamic URL path IDs; message truncated to 100 chars to handle slight wording variations; ::status suffix only added when non-null |
| Baseline alert filter | isNew === true (strict) | Only findings explicitly marked new by applyBaseline are dispatched to Slack — prevents stale re-dispatch if baseline-manager is not called (fails silently rather than spamming) |
| Flakiness routing | severity: 'info' for flaky findings | Downgrading severity means existing dispatchToSlack routing sends them to the info digest with zero routing changes — only the :zap: _flaky_ label needed |
| Private findingKey per module | Each of baseline-manager.js and flakiness-detector.js has its own copy | Avoids coupling two independently-useful modules via a shared export for a trivial 3-line function |
| Runtime anti-pattern injection | addScriptToEvaluateOnNewDocument via MCP | Scripts registered this way run in the new page context before any page script — intercepts XMLHttpRequest.open, document.write, and navigator.serviceWorker.register before the page can call them |
| CORS error detection | list_console_messages + text match, not in-page intercept | CORS errors are generated by the browser itself, not by page JS — console.error patcher misses them; the MCP console log captures them |
| Long task detection | PerformanceObserver({ entryTypes: ['longtask'] }) injected before load | Only the duration is included in the finding message (not startTime) — ensures identical tasks on two crawl runs produce the same dedup key |
| CI MCP client | JSON-RPC over stdio | In CI there's no Claude Code agent — the headless client replaces it with the same API surface |
| Node.js | v20.19+ | Minimum required by Chrome DevTools MCP |
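
The baseline finding key and the strict isNew filter from the table above can be sketched as follows. The function names and the finding shape are assumptions. The sketch covers only the type::message[:100]::status format and does not attempt the timestamp or URL-ID normalization mentioned in the table.

```javascript
// Sketch of the key format type::message[:100]::status.
function findingKey(finding) {
  // Truncate to 100 characters so small wording changes at the tail do not change the key.
  const message = String(finding.message ?? '').slice(0, 100);
  const base = `${finding.type}::${message}`;
  // The ::status suffix is only appended when a status is present.
  return finding.status == null ? base : `${base}::${finding.status}`;
}

// Strict filter: only findings explicitly marked new are kept for dispatch.
function selectNewFindings(findings) {
  return findings.filter((finding) => finding.isNew === true);
}

const withStatus = findingKey({ type: 'network_error', message: 'GET /api failed', status: 500 });
const withoutStatus = findingKey({ type: 'js_error', message: 'x is not defined', status: null });
console.log(withStatus, withoutStatus);
```

With the strict comparison, a finding whose isNew field is missing (for example because the baseline step never ran) is dropped rather than dispatched.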

Known MCP Tool Limitations

The Chrome DevTools MCP behavioral constraints below cause 3 permanent test failures in the harness (345/348 pass). These are MCP-layer restrictions — they cannot be fixed in Argus code.

type_text clarification: type_text does fire DOM input events when the element is properly focused first with mcp.click({ uid }). Always use uid-based focus — passing { selector } to mcp.click silently does nothing.

| Tool | Constraint | Impact |
| --- | --- | --- |
| drag | Uses mouse simulation, not HTML5 DnD API | dragstart/dragover/drop events never fire |
| list_console_messages({ types: ['issue'] }) | Issues panel returns empty even when violations exist | CSP and deprecated-API detection is unreliable |

These constraints are documented with workarounds in SKILL.md §10.


Environment Variables Reference

| Variable | Required | Description |
| --- | --- | --- |
| SLACK_BOT_TOKEN | No | xoxb-... Bot User OAuth Token. Omit to enable Slack-optional mode — Argus generates report.html and opens it in the browser instead |
| SLACK_SIGNING_SECRET | No* | Verifies slash command / interaction requests from Slack (required only when using /argus-retest) |
| SLACK_CHANNEL_CRITICAL | No* | Channel ID for critical bugs (required when Slack is configured) |
| SLACK_CHANNEL_WARNINGS | No* | Channel ID for warnings (required when Slack is configured) |
| SLACK_CHANNEL_DIGEST | No* | Channel ID for info / daily digest (required when Slack is configured) |
| TARGET_DEV_URL | Yes | Base URL of your dev environment |
| TARGET_STAGING_URL | No | Base URL of staging. If blank → CSS analysis mode |
| SCREENSHOT_DIFF_THRESHOLD | No | Pixel diff % to flag (default: 0.5) |
| REPORT_OUTPUT_DIR | No | Where to write reports (default: ./reports) |
| ARGUS_CONCURRENCY | No | Number of parallel MCP clients for route crawling (default: 1 = sequential) |
| PORT | No | Server port (default: 3001) |
| ARGUS_LOG_LEVEL | No | Pino log level — trace, debug, info, warn, error, fatal (default: info) |
| ARGUS_LOG_PRETTY | No | Set to 1 for human-readable log output instead of JSON (dev mode) |
| ARGUS_RETRY_ATTEMPTS | No | Max retry attempts for navigate/fill MCP calls (default: 3) |
| OTEL_EXPORTER_OTLP_ENDPOINT | No | OTLP collector endpoint — enables span/metric export to Jaeger, Grafana Tempo, Datadog, etc. |
| ARGUS_OTEL_CONSOLE | No | Set to 1 to print OTel spans to stdout without an OTLP endpoint (dev tracing) |
| ARGUS_WATCH_INTERVAL_MS | No | Watch mode poll interval in milliseconds (default: 1000) |
| ARGUS_SOURCE_DIR | No | Path to your app's source directory — enables codebase cross-reference (env var detection, feature flag leakage, dead routes) |
| ARGUS_ENV_FILE | No | Path to your app's .env file — C1 cross-references env vars used in source code against this file to detect missing declarations |
| GITHUB_TOKEN | No | GitHub personal access token — required for PR comment + commit status integration |
| GITHUB_REPOSITORY | No | Repository in owner/repo format — required for GitHub PR integration |
| GITHUB_SHA | No | Commit SHA for the commit status check — injected automatically by GitHub Actions (${{ github.sha }}) |
| GITHUB_PR_NUMBER | No | PR number for comment targeting — set via ${{ github.event.pull_request.number }} in your workflow |
| ARGUS_REPORT_URL | No | Full URL to the hosted HTML report — linked from the GitHub commit status check |
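
As an illustration of how the documented defaults could be applied, the sketch below reads a subset of these variables from an environment object. loadConfig, numberOr, and the returned field names are hypothetical; only the variable names and default values come from the table above.

```javascript
// Parse a numeric variable, falling back when it is unset, blank, or not a finite number.
function numberOr(raw, fallback) {
  if (raw === undefined || raw === '') return fallback;
  const parsed = Number(raw);
  return Number.isFinite(parsed) ? parsed : fallback;
}

// Hypothetical sketch: apply the documented defaults to a subset of variables.
function loadConfig(env) {
  return {
    devUrl: env.TARGET_DEV_URL,
    stagingUrl: env.TARGET_STAGING_URL || null, // blank means CSS analysis mode
    screenshotDiffThreshold: numberOr(env.SCREENSHOT_DIFF_THRESHOLD, 0.5),
    reportOutputDir: env.REPORT_OUTPUT_DIR || './reports',
    concurrency: numberOr(env.ARGUS_CONCURRENCY, 1),
    port: numberOr(env.PORT, 3001),
    logLevel: env.ARGUS_LOG_LEVEL || 'info',
    retryAttempts: numberOr(env.ARGUS_RETRY_ATTEMPTS, 3),
    watchIntervalMs: numberOr(env.ARGUS_WATCH_INTERVAL_MS, 1000),
  };
}

const config = loadConfig({
  TARGET_DEV_URL: 'http://localhost:3000',
  ARGUS_CONCURRENCY: '4',
  ARGUS_WATCH_INTERVAL_MS: 'fast', // invalid, so the default applies
});
console.log(config);
```

In a real process the argument would typically be process.env; passing a plain object keeps the sketch easy to test.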

Troubleshooting

Chrome DevTools MCP not connecting

claude mcp add chrome-devtools -- npx -y chrome-devtools-mcp@latest
# Then restart Claude Code

Slack messages not posting

  • Confirm SLACK_BOT_TOKEN starts with xoxb- (not xoxp-, xoxe-, or xapp-)
  • Verify BugBot is invited to each channel: /invite @BugBot
  • Check token scopes: chat:write, files:write, files:read

Screenshots not appearing in Slack messages

  • The upload uses a pre-signed URL that requires PUT, not POST — if you see a broken image, check that the Slack token has files:write scope and the channel is correct

Slash command returns "dispatch_failed"

  • Your tunnel URL has changed (Cloudflare Tunnel / localhost.run URLs change on restart)
  • Update the Request URL in Slack App → Slash Commands and reinstall

CSS analysis returns empty results

  • Page may be behind auth — make sure you're logged in on the Chrome instance Argus is controlling
  • Cross-origin stylesheets (CDN fonts, third-party widgets) can't be read due to browser security restrictions — this is expected

Screenshots are blank

  • Page hasn't finished loading — increase pageSettleMs in src/config/targets.js
  • Add a waitFor selector for that route

CI pipeline fails immediately

  • Chrome may not be starting fast enough — increase the sleep 3 after Chrome launch to sleep 5 in .github/workflows/argus.yml

How Argus Differs From Playwright / Cypress

Argus is not a replacement for unit or E2E tests. It's a complementary layer:

|  | Playwright / Cypress | Argus |
| --- | --- | --- |
| Tests | Your logic and API contracts | What the user actually sees |
| Catches | Regression in behaviour | CSS drift, visual regressions, API redundancy, console noise, perf budgets |
| Runs | In your test suite | Continuously, on the live running app |
| Setup | Write test files | Configure routes in targets.js |
| Output | Pass / fail | Structured Slack reports with screenshots and action buttons |

They complement each other — Argus catches what test suites miss.

Related Servers