Argus
AI-powered QA harness that catches JS errors, accessibility failures, visual regressions, and security issues via Chrome DevTools MCP — no test scripts required.
Argus — AI-Powered Dev Testing Tool
Argus Panoptes — the all-seeing giant of Greek mythology with a hundred eyes who never slept.
Automated browser testing pipeline that catches bugs, compares environments, and sends rich reports to Slack (or generates a self-contained HTML dashboard when Slack is not configured) — powered by Chrome DevTools MCP and Claude Code.
MCP Quick Start
Add both servers to your .mcp.json:
{
"mcpServers": {
"chrome-devtools": {
"command": "npx",
"args": ["-y", "chrome-devtools-mcp@latest"]
},
"argus": {
"command": "npx",
"args": ["-y", "argusqa-os"]
}
}
}
Or register via the Claude Code CLI:
claude mcp add chrome-devtools -- npx -y chrome-devtools-mcp@latest
claude mcp add argus -- npx -y argusqa-os
Set your target URL and start Chrome with remote debugging:
# .env
TARGET_DEV_URL=http://localhost:3000
# Start Chrome (required — Argus drives this instance via CDP)
# macOS: open -a "Google Chrome" --args --remote-debugging-port=9222 --headless=new
# Windows: "C:\Program Files\Google\Chrome\Application\chrome.exe" --remote-debugging-port=9222 --headless=new
# Linux: google-chrome --remote-debugging-port=9222 --headless=new --no-sandbox
Then ask Claude (or any MCP client):
Run argus_audit on http://localhost:3000
Six tools are exposed:
| Tool | What it does |
|---|---|
| argus_audit | Fast QA pass: JS errors, network failures, accessibility, SEO, security, CSS, content |
| argus_audit_full | Deep QA pass: adds Lighthouse scoring, responsive layout checks across 4 viewports, memory leak detection, hover-state bug detection, and accessibility tree snapshot |
| argus_compare | Diff dev vs staging side-by-side: screenshots, findings delta, environment regressions |
| argus_last_report | Return the last saved JSON report without re-running a scan |
| argus_watch_snapshot | Snapshot the currently open Chrome tab without navigating: raw console + network capture |
| argus_get_context | Capture everything broken on the open tab, formatted as a diagnostic context for Claude to diagnose and suggest fixes |
Requires: Node.js ≥ 20.19, Chrome (desktop or headless), and the chrome-devtools-mcp server registered alongside Argus (shown above).
The landing/ directory contains the product landing page (React + Vite + Tailwind + Framer Motion) with Supabase-backed waitlist and enterprise contact forms. Live at argus-qa.com (deployed via Cloudflare Pages; background video served from Cloudflare R2). See landing/README.md for setup.
| 🔴 Critical / 🟡 Warning / 🔵 Info | ⚙️ | 🧪 | 📋 |
|---|---|---|---|
| 114 distinct issue types detected | 24 analysis engines | 348 test assertions | 82 test blocks |
What Argus Catches
Argus runs 24 analysis engines per run and detects 114 distinct issue types across JavaScript runtime, network, CSS, performance, accessibility, SEO, security, content quality, responsive layout, memory, runtime anti-patterns, hover-state interactions, accessibility tree snapshots, keyboard focus, and Chrome DevTools issues panel — plus flakiness detection, historical baselines, user flow assertions, and environment comparison as cross-cutting layers. Every finding is classified by severity (critical / warning / info) and routed to the right Slack channel — or rendered as a local report.html when Slack is not configured.
JavaScript Runtime
| Severity | Bug / Issue | Detection Method |
|---|---|---|
| 🔴 Critical | Uncaught exceptions — TypeError, ReferenceError, etc. | window.onerror listener injected before page load |
| 🔴 Critical | Unhandled Promise rejections | unhandledrejection event listener injected into the page |
| 🟡 Warning | console.error calls (on non-critical routes) | Chrome DevTools list_console_messages |
| 🔴 Critical | console.error calls (on critical routes) | Chrome DevTools list_console_messages |
| 🔵 Info | console.warn deprecation notices and warnings | Chrome DevTools list_console_messages |
Network & API
| Severity | Bug / Issue | Detection Method |
|---|---|---|
| 🔴 Critical | HTTP 5xx server errors on any request | list_network_requests → status ≥ 500 |
| 🔴 Critical | 401 / 403 auth failures — user is being kicked out | list_network_requests → status 401 or 403 |
| 🔴 Critical | API endpoint called 5+ times in one page load — likely an infinite loop | Network frequency grouping by normalized URL + method |
| 🟡 Warning | HTTP 4xx client errors (404, 422, 429, etc.) | list_network_requests → status 400–499 (non-auth) |
| 🟡 Warning | API endpoint called 3–4 times — likely a double-fetch bug | Frequency grouping → 3 ≤ count ≤ 4 (check useEffect deps) |
| 🔵 Info | API endpoint called twice — may be intentional prefetch | Frequency grouping → count = 2 |
| 🔵 Info | API call summary per page load (total calls, unique endpoints, duplicates) | Aggregated network analysis |
| 🟡 Warning | Redirect chain longer than 2 hops — extra round-trips inflate load time | Navigation Timing redirectCount read after page settle |
| 🟡 Warning | Broken internal link — <a href> target returns HTTP 404 | <a> elements harvested via evaluate_script, each verified against list_network_requests |
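The call-frequency thresholds in the table above can be illustrated with a short sketch. It assumes each captured request is an object with `method` and `url` fields; `normalizeUrl` and `classifyCallFrequency` are hypothetical names, and the normalization rule (dropping the query string and hash) is an assumption rather than confirmed Argus behavior.

```javascript
// Hypothetical helper: strip query string and hash so repeated calls to the
// same endpoint with different parameters are grouped together (assumption).
function normalizeUrl(rawUrl) {
  const u = new URL(rawUrl);
  return `${u.origin}${u.pathname}`;
}

// Group requests by "METHOD normalized-url" and map each count to the
// severities listed in the table: 5+ critical, 3-4 warning, 2 info.
function classifyCallFrequency(requests) {
  const counts = new Map();
  for (const { method, url } of requests) {
    const key = `${method.toUpperCase()} ${normalizeUrl(url)}`;
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  const findings = [];
  for (const [key, count] of counts) {
    if (count >= 5) findings.push({ key, count, severity: 'critical' });
    else if (count >= 3) findings.push({ key, count, severity: 'warning' });
    else if (count === 2) findings.push({ key, count, severity: 'info' });
  }
  return findings;
}
```

Endpoints called exactly once produce no finding, which keeps the output focused on likely double-fetch or loop bugs.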
Page Health
| Severity | Bug / Issue | Detection Method |
|---|---|---|
| 🔴 Critical | Blank or near-empty page — less than 50 characters of body text | document.body.innerText length check after navigation |
| 🟡 Warning | Expected element never appeared — page may have crashed mid-load | waitFor selector timeout after 10 seconds |
CSS & Styling
| Severity | Bug / Issue | Detection Method |
|---|---|---|
| 🟡 Warning | !important cascade conflict — forced override fighting another rule | CSS rule walk: property declared with !important on same element |
| 🟡 Warning | Component style leak — BEM selector found in the wrong stylesheet | .block__element selector in a file whose name doesn't match block |
| 🟡 Warning | React inline style overriding a stylesheet declaration on the same element | style="" attribute vs. matching CSS rule, __reactFiber presence confirmed |
| 🔵 Info | CSS property declared by multiple rules on the same element (cascade override) | Computed style walk across all matched rules per key element |
| 🔵 Info | Unused CSS rules — selectors matching no element on the page (> 10 flagged) | querySelectorAll(selector).length === 0 for every rule |
| 🔵 Info | CSS Modules detected — hashed class names found on DOM elements | Pattern _ComponentName_class_hash matched on live DOM |
| 🔵 Info | SCSS source map found — compiled CSS traced back to .scss origin file | sourceMappingURL comment in <style> tags |
Performance
| Severity | Bug / Issue | Detection Method |
|---|---|---|
| 🟡 Warning | LCP > 2500ms — largest element took too long to paint | Chrome performance trace → performance_analyze_insight |
| 🟡 Warning | CLS > 0.1 — layout shifted significantly after initial render | Chrome performance trace |
| 🟡 Warning | FID / TBT > 100ms — main thread was blocked during interaction | Chrome performance trace |
| 🟡 Warning | TTFB > 800ms — server took too long to send the first byte | Chrome performance trace |
Accessibility
| Severity | Bug / Issue | Detection Method |
|---|---|---|
| 🔴 Critical | Lighthouse accessibility score below 50 / 100 | Lighthouse audit via lighthouse_audit |
| 🟡 Warning | Lighthouse accessibility score 50–89 / 100 | Lighthouse audit |
| 🟡 Warning | Missing alt text on images | Individual Lighthouse audit check |
| 🟡 Warning | Insufficient color contrast ratio | Individual Lighthouse audit check |
| 🟡 Warning | Missing ARIA labels on interactive elements | Individual Lighthouse audit check |
| 🟡 Warning | Keyboard navigation broken or unreachable elements | Individual Lighthouse audit check |
SEO
| Severity | Bug / Issue | Detection Method |
|---|---|---|
| 🟡 Warning | Missing <meta name="description"> | DOM inspection via evaluate_script |
| 🟡 Warning | Missing Open Graph tags (og:title, og:description, og:image) | DOM inspection via evaluate_script |
| 🟡 Warning | og:image URL is relative — Open Graph requires an absolute URL | DOM inspection + URL prefix check (http:// / https://) |
| 🟡 Warning | Multiple <h1> tags on one page | DOM inspection — querySelectorAll('h1').length > 1 |
| 🟡 Warning | Zero <h1> tags — page has no primary heading | DOM inspection — querySelectorAll('h1').length === 0 |
| 🟡 Warning | Generic page title (less than 10 characters, or default placeholder) | DOM inspection + length check |
| 🟡 Warning | Missing <link rel="canonical"> | DOM inspection via evaluate_script |
| 🟡 Warning | Missing <meta name="viewport"> | DOM inspection via evaluate_script |
Security
| Severity | Bug / Issue | Detection Method |
|---|---|---|
| 🔴 Critical | Auth token found in localStorage or sessionStorage | evaluate_script walks storage keys for token patterns |
| 🔴 Critical | Sensitive token in the page URL (query param or hash) | URL pattern match against current window.location.href |
| 🔴 Critical | eval() call detected in page scripts | evaluate_script AST-style text scan of inline <script> tags |
| 🔴 Critical | CSP violation — inline script or external resource blocked by Content-Security-Policy | Chrome DevTools Issues panel (list_console_messages({ types: ['issue'] })) |
| 🟡 Warning | Sensitive data (password, token, secret) logged to the console | list_console_messages + keyword match |
| 🟡 Warning | Missing Content-Security-Policy response header | fetch(location.href) inside the page → response headers check |
| 🟡 Warning | Missing X-Frame-Options response header | Same headers fetch |
| 🟡 Warning | Cross-origin <iframe> without sandbox attribute — enables form submission, parent navigation, cookie access | evaluate_script checks iframe[src] elements for missing sandbox attribute |
| 🟡 Warning | Page served over plain HTTP with no HTTPS upgrade redirect | URL protocol check (http:// + non-localhost) |
| 🔵 Info | Cookie present without HttpOnly flag (limited detection — JS-visible cookies only) | document.cookie inspection |
| 🔵 Info | Deprecated browser API usage (e.g. document.domain, DOMSubtreeModified) | Chrome DevTools Issues panel |
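As a rough illustration of the storage check in the first row, the sketch below scans key/value pairs for token-like entries. The two regular expressions and the `findStoredTokens` name are hypothetical; the patterns Argus actually uses are not documented here, and a simple key match like this one can produce false positives.

```javascript
// Hypothetical patterns (assumptions, not Argus's real rules):
// key names that suggest a credential, and values shaped like a JWT.
const SENSITIVE_KEY = /token|secret|jwt/i;
const JWT_LIKE_VALUE = /^eyJ[\w-]+\.[\w-]+\.[\w-]*$/;

// entries: array of [key, value] pairs, e.g. Object.entries(localStorage)
// when evaluated inside the page.
function findStoredTokens(entries, storageName) {
  return entries
    .filter(([key, value]) => SENSITIVE_KEY.test(key) || JWT_LIKE_VALUE.test(String(value)))
    .map(([key]) => ({ storage: storageName, key, severity: 'critical' }));
}
```

Only the key name is reported, not the value, so the finding itself does not leak the credential into a report.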
Content Quality
| Severity | Bug / Issue | Detection Method |
|---|---|---|
| 🟡 Warning | null or undefined rendered as visible text | DOM text scan for literal "null" / "undefined" strings |
| 🟡 Warning | Lorem ipsum / placeholder copy still in production | DOM text scan for "lorem ipsum" and common placeholder strings |
| 🟡 Warning | Broken image (404 or failed to load) | evaluate_script checks img.naturalWidth === 0 on all images |
| 🔵 Info | Empty data list — <ul>, <ol>, or <select> with no children | DOM structure check |
Responsive / Mobile
| Severity | Bug / Issue | Detection Method |
|---|---|---|
| 🔴 Critical | Horizontal overflow at mobile / tablet viewport (≤ 768px) | emulate at 375px and 768px → document.documentElement.scrollWidth > clientWidth |
| 🟡 Warning | Touch target smaller than 44×44 px at mobile or tablet viewport | CSS computed size check on interactive elements at 375px and 768px |
| 🔵 Info | Responsive screenshot grid — snapshots at 375 / 768 / 1024 / 1440px | emulate at 4 breakpoints, screenshots dispatched to Slack |
Network Performance
| Severity | Bug / Issue | Detection Method |
|---|---|---|
| 🔴 Critical | API response time > 3000ms | PerformanceObserver entries for fetch / XHR calls |
| 🟡 Warning | API response time > 1000ms | Same observer, lower threshold |
| 🔴 Critical | API response payload > 2 MB | list_network_requests → response body size |
| 🟡 Warning | API response payload > 500 KB | Same, lower threshold |
| 🟡 Warning | Cross-origin (third-party) script TTFB > 2000ms — blocking render or late interactivity | HAR timing.wait field from list_network_requests HAR data; cross-origin requests only |
Network Request Origin Tagging
All network findings carry an origin field ('first-party' / 'third-party') so operators can triage critical first-party failures separately from third-party noise.
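One plausible way to derive the origin field is an exact origin comparison, sketched below. `tagOrigin` is a hypothetical name, and treating "same scheme, host, and port" as first-party is an assumption; Argus may apply a looser rule such as matching the registrable domain.

```javascript
// Hypothetical helper: tag a request by comparing its origin with the origin
// of the page under test. Relative URLs are resolved against the page URL.
function tagOrigin(requestUrl, pageUrl) {
  const requestOrigin = new URL(requestUrl, pageUrl).origin;
  const pageOrigin = new URL(pageUrl).origin;
  return requestOrigin === pageOrigin ? 'first-party' : 'third-party';
}
```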
Lighthouse Audits
| Severity | Bug / Issue | Detection Method |
|---|---|---|
| 🔴 Critical | Lighthouse accessibility score < 50 / 100 | lighthouse_audit (accessibility category) |
| 🟡 Warning | Lighthouse accessibility score 50–89 / 100 | lighthouse_audit |
| 🟡 Warning | Lighthouse performance score < 90 / 100 | lighthouse_audit (performance category) |
| 🟡 Warning | Lighthouse SEO score < 90 / 100 | lighthouse_audit (seo category) |
| 🟡 Warning | Lighthouse best-practices score < 90 / 100 | lighthouse_audit (best-practices category) |
| 🟡 Warning | Individual failing Lighthouse audit items | Surfaced per-audit from the full Lighthouse report |
Memory Leaks
| Severity | Bug / Issue | Detection Method |
|---|---|---|
| 🔴 Critical | > 100 detached DOM nodes in V8 heap — severe leak | take_memory_snapshot → parse flat nodes array for "Detached Xxx" names |
| 🟡 Warning | > 10 detached DOM nodes in V8 heap — probable leak | Same snapshot parse, lower threshold |
| 🟡 Warning | Heap grew > 2 MB after navigate-away + navigate-back — probable per-load leak | performance.memory.usedJSHeapSize delta across round-trip (soft — GC-dependent) |
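The detached-node count can be sketched as a walk over the flat nodes array. This assumes the standard `.heapsnapshot` JSON layout (`snapshot.meta.node_fields`, `nodes`, `strings`); the function names are hypothetical and the real parser may differ.

```javascript
// Count nodes whose name starts with "Detached " in a parsed heap snapshot.
// Each node occupies node_fields.length consecutive slots in the flat array,
// and the "name" slot holds an index into the strings table.
function countDetachedNodes(heap) {
  const fields = heap.snapshot.meta.node_fields;
  const nameOffset = fields.indexOf('name');
  const stride = fields.length;
  let count = 0;
  for (let i = 0; i < heap.nodes.length; i += stride) {
    const name = heap.strings[heap.nodes[i + nameOffset]];
    if (typeof name === 'string' && name.startsWith('Detached ')) count++;
  }
  return count;
}

// Map the count to the thresholds in the table above; null means no finding.
function classifyDetachedCount(count) {
  if (count > 100) return 'critical';
  if (count > 10) return 'warning';
  return null;
}
```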
Runtime Anti-Patterns
| Severity | Bug / Issue | Detection Method |
|---|---|---|
| 🟡 Warning | Synchronous XMLHttpRequest — blocks the main thread until the server responds | XMLHttpRequest.open patched via addScriptToEvaluateOnNewDocument; async === false calls recorded |
| 🟡 Warning | document.write / document.writeln called — can erase the page or block parsing | document.write and document.writeln patched before page load; calls recorded with method + content |
| 🟡 Warning | Long task > 50ms on the main thread — blocks user interaction | PerformanceObserver with entryTypes: ['longtask'] injected before page load |
| 🔴 Critical | CORS policy violation — cross-origin fetch blocked by the browser | list_console_messages + pattern match for "has been blocked by CORS policy" |
| 🟡 Warning | Service worker registration failure — SW script returns 4xx or is invalid | navigator.serviceWorker.register patched before page load; .catch() records failing script URL |
| 🔵 Info | Same-origin static asset (.js, .css, .png, .woff2, etc.) served without Cache-Control or ETag — browsers cannot cache it efficiently | evaluate_script reads performance.getEntriesByType('resource'), HEAD-fetches each unique same-origin asset, checks response headers |
Historical Baselines & Trends
| Severity | Bug / Issue | Detection Method |
|---|---|---|
| 🔴 Critical | New critical finding not present in the saved baseline — regression introduced since last run | applyBaseline compares finding keys (type::message[:100]::status) against reports/baselines/<branch>.json (D7.2 per-branch) |
| 🟡 Warning | New warning finding not present in the baseline | Same key comparison, warning severity |
| 🔵 Info | Pre-existing finding still present — no change since last run | Suppressed from real-time alerts; included in info digest only |
| 🔵 Info | Run trend summary — new vs resolved counts, saved per run | Appended to reports/baselines/<branch>-trends.json; surfaced as a trend line in Slack digest |
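The key scheme in the table (type::message[:100]::status) lends itself to a simple set comparison. The sketch below is not the actual `applyBaseline` code; `findingKey` and `compareWithBaseline` are hypothetical names, and rendering a missing status as an empty string is an assumption.

```javascript
// Build a finding key: type, first 100 characters of the message, status.
function findingKey(finding) {
  const message = String(finding.message ?? '').slice(0, 100);
  return `${finding.type}::${message}::${finding.status ?? ''}`;
}

// Split current findings into new vs pre-existing, and list baseline keys
// that no longer appear in this run (resolved).
function compareWithBaseline(findings, baselineKeys) {
  const baseline = new Set(baselineKeys);
  const currentKeys = new Set(findings.map(findingKey));
  return {
    newFindings: findings.filter((f) => !baseline.has(findingKey(f))),
    preExisting: findings.filter((f) => baseline.has(findingKey(f))),
    resolvedKeys: [...baseline].filter((k) => !currentKeys.has(k)),
  };
}
```

Truncating the message keeps keys stable when a long stack trace or payload differs only in its tail between runs.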
Hover-State Bugs
| Severity | Bug / Issue | Detection Method |
|---|---|---|
| 🟡 Warning / 🔴 Critical | [aria-haspopup] element whose controlled popup does not become visible after hover — aria-expanded stays false and popup remains display:none / visibility:hidden / opacity:0 | hover dispatches mousemove; evaluate_script checks aria-expanded + getComputedStyle on the controlled element; critical on routes marked critical: true |
| 🟡 Warning | [data-tooltip] element whose [role="tooltip"] is not visible in the DOM after hover — not found or opacity ≤ 0.05 | Same hover + evaluate_script checks tooltip opacity, display, visibility, and offsetHeight |
Accessibility Snapshot Analysis
| Severity | Bug / Issue | Detection Method |
|---|---|---|
| 🟡 Warning | Interactive element (<button>, <a>, [role="button"], [role="link"]) with no accessible name — no text content, aria-label, aria-labelledby, title, or alt | take_snapshot captures DOM/AX state; evaluate_script queries each visible interactive element for accessible name sources |
| 🟡 Warning | Form control (<input>, <select>, <textarea>) with no associated label — no <label for="...">, aria-label, or aria-labelledby (placeholder is intentionally excluded — not a valid accessible name per WCAG 2.1 §3.3.2) | evaluate_script checks label[for], ancestor <label>, aria-label, and aria-labelledby for each visible control |
| 🟡 Warning | Landmark role appearing more than once without distinct aria-label / aria-labelledby — screen readers cannot differentiate them | evaluate_script counts [role=X] instances and checks for unique label values across: main, banner, contentinfo, navigation, search, complementary, form, region |
| 🟡 Warning | Heading level skip — h1→h3 or h4→h6 jumps more than one level, breaking WCAG 1.3.1 document outline | DOM walk of h1–h6 elements; detects gaps > 1 between consecutive heading levels |
| 🟡 Warning | aria-expanded button/control has no aria-controls attribute or references a non-existent element | evaluate_script checks [aria-expanded] elements for missing or broken aria-controls pointer |
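The heading-skip rule reduces to comparing consecutive heading levels. A minimal sketch, assuming the levels have already been collected in document order; `findHeadingSkips` is a hypothetical name.

```javascript
// levels: heading levels in document order, e.g. [1, 2, 4] for h1, h2, h4.
// In a page these could be collected with
// Array.from(document.querySelectorAll('h1,h2,h3,h4,h5,h6'), (h) => Number(h.tagName[1])).
function findHeadingSkips(levels) {
  const skips = [];
  for (let i = 1; i < levels.length; i++) {
    // Only a jump deeper by more than one level counts as a skip;
    // moving back up (h4 to h2) is a normal outline transition.
    if (levels[i] - levels[i - 1] > 1) {
      skips.push({ index: i, from: levels[i - 1], to: levels[i] });
    }
  }
  return skips;
}
```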
Keyboard Accessibility
| Severity | Bug / Issue | Detection Method |
|---|---|---|
| 🟡 Warning | Button or focusable element has outline:0 with no box-shadow fallback — no visible focus ring | press_key({ key: 'Tab' }) walk + evaluate_script reads document.activeElement computed style for outline/box-shadow |
Flakiness Detection
| Severity | Bug / Issue | Detection Method |
|---|---|---|
| original | Confirmed finding — present in both crawl runs | mergeRunResults finds the key in both run1 and run2 (type::message[:100]::status scheme); original severity kept |
| 🔵 Info | Flaky finding — appeared in only one of two crawl runs | Present in run1 or run2 but not both; downgraded to severity: 'info', labelled :zap: _flaky_ in Slack digest |
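The confirmed-versus-flaky rule can be sketched as a merge of two finding lists. This is an illustration, not the real `mergeRunResults`; `mergeRunsSketch`, `keyOf`, and the `flaky` field are hypothetical, and the key format follows the scheme named in the table with a missing status rendered as an empty string (an assumption).

```javascript
// Key scheme from the table: type::message[:100]::status.
function keyOf(finding) {
  return `${finding.type}::${String(finding.message ?? '').slice(0, 100)}::${finding.status ?? ''}`;
}

// Findings present in both runs keep their severity; findings present in
// only one run are downgraded to 'info' and marked flaky.
function mergeRunsSketch(run1, run2) {
  const keys1 = new Set(run1.map(keyOf));
  const keys2 = new Set(run2.map(keyOf));
  const seen = new Set();
  const merged = [];
  for (const finding of [...run1, ...run2]) {
    const key = keyOf(finding);
    if (seen.has(key)) continue; // emit each key once
    seen.add(key);
    const confirmed = keys1.has(key) && keys2.has(key);
    merged.push(
      confirmed
        ? { ...finding, flaky: false }
        : { ...finding, severity: 'info', flaky: true }
    );
  }
  return merged;
}
```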
User Flow Assertions
| Severity | Bug / Issue | Detection Method |
|---|---|---|
| 🔴 Critical | Flow step failed — navigate/fill/click/waitFor threw mid-flow (page state unknown) | flow-runner.js wraps every step; any throw emits flow_step_failed and halts the flow |
| 🔴 Critical | element_visible assert — expected selector absent within timeout | Polled via evaluate_script + document.querySelector (MCP wait_for doesn't reliably throw on timeout) |
| 🟡 Warning | no_console_errors assert — console errors recorded during this flow (baseline-sliced, not session-wide) | Baseline snapshot of list_console_messages at flow start; only messages after that offset count |
| 🟡 Warning | no_network_errors assert — 4xx/5xx request during this flow (baseline-sliced) | Baseline snapshot of list_network_requests at flow start; status ≥ 400 after offset |
| 🟡 Warning | url_contains assert — URL does not include expected substring after flow completes | evaluate_script reads window.location.href |
| 🟡 Warning | element_not_visible assert — selector unexpectedly present in DOM | evaluate_script → !document.querySelector(...) |
| 🔴 Critical | no_js_errors assert — uncaught exceptions captured in window.__argusErrors during flow | Script parses the injected error buffer |
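The baseline-sliced asserts above can be pictured as an offset into the session-wide message list. A minimal sketch, assuming console messages are objects with a `type` field and that the list only grows during a session; both function names are hypothetical.

```javascript
// Remember how many messages existed when the flow started, then inspect
// only the messages recorded after that offset.
function sliceSinceBaseline(allMessages, baselineCount) {
  return allMessages.slice(baselineCount);
}

// Hypothetical no_console_errors check over the sliced messages.
function assertNoConsoleErrors(allMessages, baselineCount) {
  const errors = sliceSinceBaseline(allMessages, baselineCount)
    .filter((m) => m.type === 'error');
  return { passed: errors.length === 0, errors };
}
```

Slicing this way means an error logged before the flow began does not fail the flow's assertion.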
Environment Regressions (dev vs staging)
| Severity | Bug / Issue | Detection Method |
|---|---|---|
| 🔴 Critical | API status regressed — request that returned 2xx in dev now returns 5xx in staging | Network diff between both environments |
| 🟡 Warning | Visual change > 0.5% pixels different between dev and staging screenshots | pixelmatch pixel-level comparison + diff overlay image |
| 🟡 Warning | New console error in staging that doesn't exist in dev | Console message diff |
| 🟡 Warning | New network request in staging — unexpected endpoint appeared | Network request URL diff |
| 🟡 Warning | Request present in dev is missing in staging — endpoint removed or broken | Network request URL diff |
| 🟡 Warning | API status changed between environments (any non-5xx change) | Network status diff |
| 🔵 Info | DOM structural change — element count differs between dev and staging | HTML tag count comparison across snapshots |
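The network rows of the table above can be sketched as a diff keyed by method and path. Matching requests across environments by pathname (ignoring host and query string) is an assumption, and the finding type names below are hypothetical.

```javascript
// Key a request by method + pathname so dev and staging hosts can differ.
function pathKey(request) {
  return `${request.method.toUpperCase()} ${new URL(request.url).pathname}`;
}

function diffEnvironments(devRequests, stagingRequests) {
  const dev = new Map(devRequests.map((r) => [pathKey(r), r.status]));
  const staging = new Map(stagingRequests.map((r) => [pathKey(r), r.status]));
  const findings = [];
  for (const [key, stagingStatus] of staging) {
    if (!dev.has(key)) {
      findings.push({ key, type: 'new_request', severity: 'warning' });
      continue;
    }
    const devStatus = dev.get(key);
    if (devStatus === stagingStatus) continue;
    // 2xx in dev and 5xx in staging is the critical regression case;
    // every other status change is reported as a warning in this sketch.
    const regressed = devStatus >= 200 && devStatus < 300 && stagingStatus >= 500;
    findings.push({
      key,
      type: 'status_changed',
      severity: regressed ? 'critical' : 'warning',
      devStatus,
      stagingStatus,
    });
  }
  for (const key of dev.keys()) {
    if (!staging.has(key)) findings.push({ key, type: 'missing_request', severity: 'warning' });
  }
  return findings;
}
```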
What It Does
Argus watches your running application and automatically surfaces issues that test suites miss: visual regressions, API loops, CSS drift, console noise, and accessibility failures — all with screenshots delivered directly to Slack.
| Feature | Description |
|---|---|
| Error Detection | Crawls your app's routes; captures JS exceptions, console errors, failed API calls, redirect chains, and broken internal links |
| Environment Comparison | Diffs dev vs staging: screenshots, DOM structure, network requests, console errors |
| CSS Analysis | Detects cascade overrides, component style leaks, unused rules, React inline style conflicts |
| API Frequency Analysis | Flags endpoints called more than once per page load (double-fetch, missing useEffect deps, infinite loops) |
| Network Performance | slow_api > 1s/3s and large_payload > 500KB/2MB per API call |
| SEO Checks | Missing meta description, OG tags, canonical, viewport, h1 — DOM-inspected on every route |
| Security Checks | localStorage tokens, token-in-URL, eval(), sensitive console output, missing CSP/X-Frame-Options |
| Content Quality | null/undefined rendered text, lorem ipsum, broken images, empty data lists |
| Responsive Analysis | Overflow + touch target checks at 375/768px; screenshot grid at 4 breakpoints dispatched to Slack |
| Memory Leak Detection | V8 heap snapshot → detached DOM node count; heap growth across navigate-away + navigate-back |
| Runtime Anti-Patterns | Synchronous XHR, document.write, long tasks > 50ms, CORS violations, service worker registration failures, and missing cache headers on static assets — detected via script injection and post-load HEAD checks |
| Hover-State Bug Detection | Fires hover on every [aria-haspopup] and [data-tooltip] element; detects broken dropdowns and invisible tooltips that CSS :hover was supposed to reveal |
| Accessibility Snapshot Analysis | Calls take_snapshot then evaluate_script; flags interactive elements missing accessible names, unlabelled form controls, duplicate landmark regions, heading level skips, and aria-expanded buttons with missing/broken aria-controls |
| Keyboard Focus Analysis | Tab-walks every focusable element (up to 20 steps); detects focus_visible_missing (button/link with outline:0 and no box-shadow fallback — keyboard users cannot see where focus is) |
| Chrome DevTools Issues Panel | Queries list_console_messages({ types: ['issue'] }) for the Issues panel namespace, which is entirely separate from console.error; catches CSP violations and deprecated API usage (verified) — additional Chrome-surfaced types (CORS blocks, mixed content, cookie misconfiguration, low-contrast) are classified when present |
| Mobile CPU Throttling | Applies 4× CPU throttle (emulate_cpu({ throttlingRate: 4 })) during ≤768px responsive breakpoints — finds layout reflow and animation jank that only manifests under realistic mobile CPU pressure |
| Origin-Tagged Network Findings | All network error and timing findings carry origin: 'first-party' | 'third-party' so operators can triage critical first-party failures without digging through third-party CDN noise |
| Historical Baselines | Saves finding keys after each run; subsequent runs only alert on new issues; trend summary in Slack digest |
| Flakiness Detection | Crawls each route twice per run; findings in both runs are confirmed (original severity); findings in only one run are marked flaky (severity: info, :zap: _flaky_ label) |
| User Flow Assertions | Named multi-step flows (navigate/fill/click/press_key/drag/upload_file/waitFor/sleep/handle_dialog/assert) with baseline-sliced no_console_errors, no_network_errors, element_visible, url_contains, no_js_errors asserts — runs end-to-end user journeys without writing Playwright specs · Use typing: true on a fill step to dispatch real keyboard events via mcp.type_text (triggers input-event validation) · Use drag step to fire dragstart→dragover→drop sequences · Use upload_file step to deliver a local file to a file input via CDP ({ action: 'upload_file', selector: 'input[type=file]', filePath: '/path/to/file' }) |
| API Contract Validation | Define apiContracts[] in targets.js with inline schema or schemaFile; validates captured response bodies against JSON Schema (type, required, properties, items) — emits api_contract_violation warnings when shapes diverge from spec |
| Severity Policy Overrides | Define severityOverrides in targets.js ({ finding_type: 'info' | 'warning' | 'critical' | 'suppress' }); applied before Slack routing — remap or silence specific detections without touching analyzer code |
| Auth Token Refresh | refreshSession() is called before each route; re-runs the login flow when the saved session has less than sessionRefreshWindowMs (default 5 min) remaining — prevents long crawls from failing mid-run when the auth cookie expires |
| Slack-optional mode | When SLACK_BOT_TOKEN is not configured, Argus skips Slack entirely and auto-generates a local report.html (all findings + inline screenshots) and opens it in the default browser — zero setup required to start using Argus |
| Codebase Cross-Reference | Point ARGUS_SOURCE_DIR at your app source to detect: missing env vars (process.env.X used in code but absent from .env), feature flag leakage (conditional env var that is falsy/unset), console error stack traces resolved to file:line, and internal links that return 404, all without opening a browser |
| GitHub PR Integration | Posts a structured Markdown findings table as a PR comment (updates in-place — one comment per PR, no spam); sets an argus-qa commit status check (failure when new criticals exist, success otherwise) — blocks merge via branch protection when regressions are introduced. Requires GITHUB_TOKEN + GITHUB_REPOSITORY env vars |
| Auto Route Discovery | Augments manual routes[] with paths from three sources: fetches /sitemap.xml (follows one sitemap-index level, 10s timeout), scans Next.js pages/ (Next 12) and app/ (Next 13+) directories stripping route groups (auth), and greps JS/TS source for React Router <Route path> declarations. Dynamic [param] segments are skipped — no concrete URL to crawl. Manual route config (critical, waitFor) always takes precedence. |
| argus init Setup Wizard | npm run init (or npx argus init) guides first-time setup: collects target URLs, detects the app framework (Next.js / React Router / unknown) from the source directory's package.json, runs C3 route discovery against the dev URL, prompts for optional Slack tokens and GitHub credentials, then writes a populated .env and a pre-filled src/config/targets.js, so no manual config editing is required. |
| Watch Mode | npm run watch attaches to whatever Chrome tab is open and polls list_console_messages + list_network_requests every 1 s (configurable via ARGUS_WATCH_INTERVAL_MS). Reports new console errors, network failures (4xx/5xx), CORS blocks, and auth failures in real time — without navigating. On Ctrl+C, generates a final reports/report.html. No route config needed. |
| Full Lighthouse Suite | All 4 Lighthouse categories (performance, SEO, best-practices, accessibility) with per-audit items |
| Performance Budgets | Enforces LCP < 2500ms, CLS < 0.1, FID < 100ms, TTFB < 800ms per route |
| Slack Notifications | Rich Block Kit reports with inline screenshots routed to #bugs-critical, #bugs-warnings, #bugs-digest |
| Slash Command | /argus-retest <url> triggers an on-demand test from any Slack channel |
| CI Integration | GitHub Actions workflow runs daily at 6 AM UTC and on every push to main |
| MCP Server (AI-callable Argus) | Register Argus as an MCP server via .mcp.json; Claude (or any MCP client) can call argus_audit, argus_audit_full, argus_compare, argus_last_report, argus_watch_snapshot, and argus_get_context directly from a conversation — no CLI, no terminal required. Published to npm as argusqa-os — add via { "command": "npx", "args": ["-y", "argusqa-os"] } in .mcp.json |
Works with React + SCSS, CSS Modules, CSS-in-JS (styled-components / emotion), and plain HTML/CSS apps.
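The Severity Policy Overrides row in the table above describes a map from finding type to a new severity or 'suppress'. A minimal sketch of how such a map might be applied before routing; `applySeverityOverrides` is a hypothetical name, and the finding shape (`type`, `severity`) is an assumption.

```javascript
// severityOverrides: { finding_type: 'info' | 'warning' | 'critical' | 'suppress' }
function applySeverityOverrides(findings, severityOverrides = {}) {
  const result = [];
  for (const finding of findings) {
    const override = severityOverrides[finding.type];
    if (override === 'suppress') continue; // drop silenced detections
    result.push(override ? { ...finding, severity: override } : finding);
  }
  return result;
}

// Usage: escalate slow_api and silence focus_visible_missing.
const routed = applySeverityOverrides(
  [
    { type: 'slow_api', severity: 'warning' },
    { type: 'focus_visible_missing', severity: 'warning' },
    { type: 'api_contract_violation', severity: 'warning' },
  ],
  { slow_api: 'critical', focus_visible_missing: 'suppress' }
);
```

Findings without an entry in the map pass through unchanged, so the override file only needs to list exceptions.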
How It Works
Three components run against the same Chrome instance:
Claude Code (Terminal / VS Code)
├── MCP Protocol → Chrome DevTools MCP Server → Chrome
└── Writes → Orchestration Layer → Slack Bot API
- Chrome DevTools MCP Server: programmatic access to Chrome (network traffic, console, screenshots, DOM, performance traces)
- Claude Code: orchestration hub that reads the codebase, drives the MCP tools, classifies findings, and posts to Slack
- Slack Bot (BugBot): receives reports, exposes the /argus-retest slash command, and handles Acknowledge / Retest button actions
In interactive mode (running from Claude Code), MCP tools are called natively. In CI mode (GitHub Actions), src/utils/mcp-client.js spawns chrome-devtools-mcp as a child process and communicates via JSON-RPC over stdio.
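The CI-mode transport can be pictured as newline-delimited JSON-RPC 2.0 messages written to the child's stdin and read back from its stdout. The sketch below shows only the framing; it is not the contents of src/utils/mcp-client.js, and `encodeToolCall` and `createLineParser` are hypothetical names.

```javascript
let nextId = 1;

// Encode an MCP tools/call request as a single line of JSON.
function encodeToolCall(toolName, args) {
  const request = {
    jsonrpc: '2.0',
    id: nextId++,
    method: 'tools/call',
    params: { name: toolName, arguments: args },
  };
  return { id: request.id, line: JSON.stringify(request) + '\n' };
}

// Accumulate stdout chunks and emit one parsed message per complete line,
// since a chunk boundary can fall in the middle of a message.
function createLineParser(onMessage) {
  let buffer = '';
  return (chunk) => {
    buffer += chunk;
    let newlineIndex;
    while ((newlineIndex = buffer.indexOf('\n')) !== -1) {
      const line = buffer.slice(0, newlineIndex).trim();
      buffer = buffer.slice(newlineIndex + 1);
      if (line) onMessage(JSON.parse(line));
    }
  };
}
```

With Node's `child_process.spawn`, the encoded line would be written to the child's stdin and each stdout `data` chunk passed to the parser. A real MCP session also begins with an `initialize` handshake before any tool call, which this sketch omits.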
Prerequisites
| Requirement | Version | Notes |
|---|---|---|
| Node.js | v20.19+ | Required by Chrome DevTools MCP |
| Chrome | Stable (current) | Must be installed |
| Claude Code | Latest | npm install -g @anthropic-ai/claude-code |
| Slack workspace | — | Optional — only needed if you want Slack reports. Without it, Argus generates a local report.html instead |
One-Time Setup
Option A — MCP Server (Claude Code / any MCP client)
No local install required. npx auto-downloads argusqa-os on first use.
1. Register both MCP servers
Add to .mcp.json in your project root:
{
"mcpServers": {
"chrome-devtools": {
"command": "npx",
"args": ["-y", "chrome-devtools-mcp@latest"]
},
"argus": {
"command": "npx",
"args": ["-y", "argusqa-os"]
}
}
}
Or via Claude Code CLI:
claude mcp add chrome-devtools -- npx -y chrome-devtools-mcp@latest
claude mcp add argus -- npx -y argusqa-os
2. Environment variables
Create a .env file in your project root:
TARGET_DEV_URL=http://localhost:3000
TARGET_STAGING_URL=https://staging.yourapp.com # optional — enables argus_compare
3. Start Chrome with remote debugging
# macOS
open -a "Google Chrome" --args --remote-debugging-port=9222 --headless=new
# Windows
"C:\Program Files\Google\Chrome\Application\chrome.exe" --remote-debugging-port=9222 --headless=new --no-sandbox --disable-gpu
# Linux
google-chrome --remote-debugging-port=9222 --headless=new --no-sandbox
4. Slack notifications (optional)
Skip this step to use local report.html mode: Argus generates a self-contained HTML report when Slack is not configured.
- api.slack.com/apps → Create New App → name it BugBot
- OAuth & Permissions → Bot Token Scopes: chat:write, files:write, files:read
- Install to workspace → copy Bot User OAuth Token (xoxb-...) to .env as SLACK_BOT_TOKEN
- Create #bugs-critical, #bugs-warnings, #bugs-digest and /invite @BugBot in each
SLACK_BOT_TOKEN=xoxb-...
SLACK_CHANNEL_CRITICAL=C0000000000
SLACK_CHANNEL_WARNINGS=C0000000001
SLACK_CHANNEL_DIGEST=C0000000002
Option B — npm Package (dev dependency / CI/CD)
1. Install
npm install --save-dev argusqa-os
2. Environment variables
Run the interactive wizard to auto-generate .env and src/config/targets.js:
npx argus init
The wizard detects your framework (Next.js / React Router), discovers routes from sitemap.xml and your file structure, and optionally collects Slack and GitHub credentials.
Alternative — manual setup: Create a .env with TARGET_DEV_URL and optionally TARGET_STAGING_URL.
3. Start Chrome with remote debugging
Same as Option A — see above.
4. Slack notifications (optional)
Same as Option A — see above.
Option C — Clone the Repository (full source / contributors)
1. Clone and install
git clone https://github.com/ironclawdevs27/Argus.git
cd Argus
npm install
npm run setup # creates reports/ directory
2. Environment variables
Recommended — use the interactive setup wizard:
npm run init
Alternative — manual setup:
cp .env.example .env
Open .env and fill in:
TARGET_DEV_URL=http://localhost:3000
TARGET_STAGING_URL=https://staging.yourapp.com # leave blank → CSS-only analysis mode
# Slack — OPTIONAL. Omit to get a local report.html instead.
# SLACK_BOT_TOKEN=xoxb-...
# SLACK_SIGNING_SECRET=...
# SLACK_CHANNEL_CRITICAL=C0000000000
# SLACK_CHANNEL_WARNINGS=C0000000001
# SLACK_CHANNEL_DIGEST=C0000000002
3. Configure routes
If you ran npm run init — skip this step.
Otherwise, edit src/config/targets.js:
export const routes = [
{ path: '/', name: 'Home', critical: true, waitFor: 'main' },
{ path: '/login', name: 'Login', critical: true, waitFor: 'form' },
{ path: '/dashboard', name: 'Dashboard', critical: true, waitFor: '[data-testid="dashboard"]' },
{ path: '/settings', name: 'Settings', critical: false, waitFor: null },
];
- critical: true — errors on this route go to #bugs-critical
- waitFor — CSS selector Argus waits for before capturing (signals the page is ready)
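Before running a crawl, it can help to sanity-check hand-edited route objects. Argus validates its config with a Zod schema (`src/config/schema.js`); the plain-JavaScript check below is only a sketch of the shape shown above, and `validateRoute` is a hypothetical helper, not part of Argus.

```javascript
// Hypothetical validator for the route objects in src/config/targets.js.
// It mirrors the four fields shown in the README and nothing else.
function validateRoute(route) {
  const errors = [];
  if (typeof route.path !== 'string' || !route.path.startsWith('/')) {
    errors.push('path must be a string starting with "/"');
  }
  if (typeof route.name !== 'string' || route.name.length === 0) {
    errors.push('name must be a non-empty string');
  }
  if (typeof route.critical !== 'boolean') {
    errors.push('critical must be true or false');
  }
  // waitFor is either a CSS selector string or null (no readiness wait).
  if (route.waitFor !== null && typeof route.waitFor !== 'string') {
    errors.push('waitFor must be a CSS selector string or null');
  }
  return errors;
}

const sampleRoutes = [
  { path: '/', name: 'Home', critical: true, waitFor: 'main' },
  { path: 'settings', name: 'Settings', critical: 'no', waitFor: null },
];

for (const route of sampleRoutes) {
  const errors = validateRoute(route);
  console.log(route.name, errors.length === 0 ? 'ok' : errors);
}
```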
4. Connect Chrome DevTools MCP to Claude Code
claude mcp add chrome-devtools -- npx chrome-devtools-mcp@latest
Verify — ask Claude: "List all open Chrome pages" — you should see your tabs.
5. Start Chrome with remote debugging
Same as Option A — see above.
6. Slack notifications (optional)
Same as Option A — see above.
Running Argus
Option A — Via MCP (Claude Code / any MCP client)
Ask Claude directly — no terminal needed.
Available tools:
| Tool | What it does |
|---|---|
| argus_audit | Fast QA pass — JS errors, network failures, accessibility, SEO, security, CSS, content |
| argus_audit_full | Deep QA pass — adds Lighthouse, responsive layout checks across 4 viewports, memory leak detection, hover-state bug detection, and accessibility tree snapshot |
| argus_compare | Diff dev vs staging — screenshots, findings delta, environment regressions |
| argus_last_report | Return the last saved JSON report without re-running a scan |
| argus_watch_snapshot | Snapshot the currently open Chrome tab without navigating — raw console + network capture |
| argus_get_context | Capture everything broken on the open tab, formatted as a diagnostic context for Claude to diagnose and suggest fixes |
argus_audit — fast audit of any URL:
Run argus_audit on http://localhost:3000/checkout
Run argus_audit on http://localhost:3000/login with critical: true
argus_audit_full — deep audit with Lighthouse + memory + responsive checks:
Run argus_audit_full on http://localhost:3000/dashboard
argus_compare — dev vs staging diff (reads TARGET_DEV_URL and TARGET_STAGING_URL from .env):
Run argus_compare
argus_last_report — retrieve last audit without re-running Chrome:
Run argus_last_report
argus_watch_snapshot — snapshot the currently open tab without navigating. Useful when the page is in an authenticated or post-interaction state that navigation would reset:
Run argus_watch_snapshot
Run argus_watch_snapshot with url: http://localhost:3000
argus_get_context — when your app is stuck or throwing errors, run this to capture everything that's broken and feed it to Claude for diagnosis:
Run argus_get_context
Then follow with: "Here's the context — what's causing these errors and how do I fix them?"
Option B & C — Via CLI / npm scripts
Available commands:
| Command | What it does |
|---|---|
| npm run crawl | Multi-page batch audit of all routes in targets.js |
| npm run compare | Dev vs staging diff (or CSS analysis if no TARGET_STAGING_URL) |
| npm run watch | Passive monitor — polls the open Chrome tab every 1s, no navigation |
| npm run report:html | Generate reports/report.html from the latest JSON audit |
| npm run server | Start the Slack slash command + interaction server (port 3001) |
| npm run init | Interactive setup wizard — generates .env + targets.js |
| npm run test:unit | Run 61 unit tests (no Chrome required) |
| npm run test:harness | Run 82-block correctness harness (requires Chrome) |
npm run crawl — full audit of all configured routes:
npm run crawl
Reports are saved to reports/ as JSON files. Run npm run report:html after any crawl for a portable reports/report.html with all screenshots inlined — useful for sharing with designers or reviewing offline.
npm run compare — dev vs staging diff:
npm run compare
When TARGET_STAGING_URL is not set, automatically switches to CSS analysis mode — cascade overrides, component style leaks, unused rules, and React inline style conflicts on the dev environment only.
npm run watch — passive monitoring (polls every 1s, no navigation):
Attaches to whatever Chrome tab is open and reports new issues in real time without navigating anywhere. Use this while developing.
Requires 2 terminals:
Terminal 1 — your app (npm start / npm run dev)
Terminal 2 — npm run watch
Steps:
- Open Chrome and navigate to your app
- Terminal 1: start your application
- Terminal 2: npm run watch — Argus begins polling
- Develop normally — console errors, network failures (4xx/5xx), CORS blocks, and auth failures print in real time
- Ctrl+C — stops the monitor and writes reports/report.html
# Attribute findings to a specific URL:
npm run watch http://localhost:4000
| Variable | Default | Description |
|---|---|---|
| ARGUS_WATCH_INTERVAL_MS | 1000 | Poll interval in milliseconds |
| TARGET_DEV_URL | http://localhost:3000 | URL attributed to findings when none passed |
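A minimal sketch of how the two defaults in this table could be resolved. The fallback policy for invalid values is an assumption, as are the helper names `resolveWatchIntervalMs` and `resolveWatchUrl`.

```javascript
// Hypothetical helpers: resolve watch-mode settings from the environment.
// Defaults come from the table above; the handling of bad input is assumed.
function resolveWatchIntervalMs(env) {
  const DEFAULT_MS = 1000;
  const parsed = Number(env.ARGUS_WATCH_INTERVAL_MS);
  // Missing, non-numeric, zero, or negative values fall back to the default.
  return Number.isFinite(parsed) && parsed > 0 ? parsed : DEFAULT_MS;
}

function resolveWatchUrl(cliArg, env) {
  // A URL passed on the command line wins over TARGET_DEV_URL.
  return cliArg || env.TARGET_DEV_URL || 'http://localhost:3000';
}

console.log(resolveWatchIntervalMs({ ARGUS_WATCH_INTERVAL_MS: '250' }));
console.log(resolveWatchUrl('http://localhost:4000', {}));
```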
npm run report:html — generate HTML dashboard from last audit:
npm run report:html
# → reports/report.html (all findings + inline screenshots, portable, no server needed)
Option D — From Slack (on-demand)
/argus-retest https://staging.yourapp.com/checkout
BugBot responds immediately, runs the test, and posts results back. Detailed bug reports go to #bugs-critical. See Slack Slash Command Setup for configuration.
CSS Analysis Mode
When TARGET_STAGING_URL is not set in .env, npm run compare automatically switches to CSS analysis mode instead of comparing two environments.
What it analyzes on your dev environment:
| Check | What it catches |
|---|---|
| Cascade overrides | Same CSS property declared multiple times on an element; !important flagged as warning |
| Component style leaks | BEM selector (.card__title) found in a stylesheet that doesn't belong to that component |
| Unused rules | CSS selectors that match no element on the current page |
| CSS Modules | Detects hashed class names; extracts readable component names (Button, Card, etc.) |
| React inline style conflicts | style="" attribute overriding a stylesheet declaration on the same element |
| SCSS source maps | Traces compiled CSS back to original .scss files where source maps are available |
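To make the "component style leaks" row concrete, here is a sketch of one way such a check could work. It assumes a stylesheet's file name starts with the component (BEM block) it owns, which is an illustrative heuristic and not necessarily the rule Argus applies; `bemBlock` and `findStyleLeaks` are hypothetical names.

```javascript
// Hypothetical sketch of BEM-based style leak detection.
function bemBlock(selector) {
  // Matches ".block__element" or ".block--modifier" and captures "block".
  const match = /^\.([a-z0-9]+(?:-[a-z0-9]+)*)(?:__|--)/i.exec(selector);
  return match ? match[1].toLowerCase() : null;
}

function findStyleLeaks(stylesheetName, selectors) {
  // Assumption: "button-styles.css" owns every block whose name it starts with.
  const baseName = stylesheetName.toLowerCase().replace(/\.s?css$/, '');
  return selectors.filter((selector) => {
    const block = bemBlock(selector);
    return block !== null && !baseName.startsWith(block);
  });
}

// A card selector inside a button stylesheet is flagged; plain classes are ignored.
console.log(findStyleLeaks('button-styles.css', [
  '.button--primary',
  '.card__title',
  '.card',
]));
```

Selectors without a BEM element or modifier part are skipped, since they carry no ownership signal under this heuristic.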
API frequency analysis also runs automatically:
| Call count | Severity | Likely cause |
|---|---|---|
| 2 calls | info | Possible prefetch + actual — verify intentional |
| 3–4 calls | warning | Double-fetch — check useEffect deps or component re-mounts |
| 5+ calls | critical | Runaway loop — missing cleanup, infinite re-render |
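The severity bands in this table translate directly into a small classifier. The sketch below groups requests by exact URL, which is an assumption (a real implementation might also key on HTTP method); `classifyCallCount` and `analyzeApiFrequency` are hypothetical names.

```javascript
// Hypothetical sketch of the API frequency bands from the table above.
function classifyCallCount(count) {
  if (count >= 5) return 'critical';
  if (count >= 3) return 'warning';
  if (count === 2) return 'info';
  return null; // 0 or 1 calls: nothing to report
}

function analyzeApiFrequency(requestUrls) {
  // Count how many times each URL was requested during the page load.
  const counts = new Map();
  for (const url of requestUrls) {
    counts.set(url, (counts.get(url) || 0) + 1);
  }
  const findings = [];
  for (const [url, count] of counts) {
    const severity = classifyCallCount(count);
    if (severity) findings.push({ url, count, severity });
  }
  return findings;
}

console.log(analyzeApiFrequency([
  '/api/user', '/api/user',
  '/api/cart',
  '/api/feed', '/api/feed', '/api/feed', '/api/feed', '/api/feed',
]));
```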
Performance Budgets
Argus enforces these thresholds on every crawl:
| Metric | Threshold | Severity |
|---|---|---|
| LCP (Largest Contentful Paint) | < 2500ms | warning |
| CLS (Cumulative Layout Shift) | < 0.1 | warning |
| FID / TBT (interaction latency) | < 100ms | warning |
| TTFB (Time to First Byte) | < 800ms | warning |
Violations are reported as individual warning bugs with the measured value.
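As an illustration of the budget check, the sketch below compares measured values against the limits in the table. The metric keys (`lcp`, `cls`, `tbt`, `ttfb`) and the finding shape are assumptions made for the example.

```javascript
// Hypothetical sketch: compare measured metrics against the budgets above.
const BUDGETS = {
  lcp: { limit: 2500, unit: 'ms' },
  cls: { limit: 0.1, unit: '' },
  tbt: { limit: 100, unit: 'ms' },
  ttfb: { limit: 800, unit: 'ms' },
};

function checkBudgets(metrics) {
  const findings = [];
  for (const [metric, { limit, unit }] of Object.entries(BUDGETS)) {
    const value = metrics[metric];
    if (typeof value !== 'number') continue; // metric was not measured
    // The budget is "< limit", so a value equal to the limit is a violation.
    if (value >= limit) {
      findings.push({
        metric,
        severity: 'warning',
        message: `${metric.toUpperCase()} measured ${value}${unit}, budget is < ${limit}${unit}`,
      });
    }
  }
  return findings;
}

console.log(checkBudgets({ lcp: 3100, cls: 0.05, tbt: 80, ttfb: 900 }));
```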
Lighthouse Suite
Runs all four Lighthouse categories on every route:
- Accessibility — score < 50 → critical; score < 90 → warning
- Performance — score < 90 → warning
- SEO — score < 90 → warning
- Best Practices — score < 90 → warning
Individual failing audit items (e.g., missing alt text, low contrast, render-blocking resources) are surfaced as separate findings alongside the category score.
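The category thresholds above reduce to a short mapping. This sketch assumes scores on a 0 to 100 scale (Lighthouse's raw JSON uses 0 to 1, so a conversion may be needed) and uses hypothetical category keys.

```javascript
// Hypothetical sketch of the Lighthouse severity thresholds listed above.
// Assumes scores are already scaled to 0..100.
function classifyLighthouseScore(category, score) {
  // Only accessibility has a critical tier.
  if (category === 'accessibility' && score < 50) return 'critical';
  if (score < 90) return 'warning';
  return null; // 90 and above: no finding
}

const scores = { accessibility: 42, performance: 88, seo: 95, 'best-practices': 90 };
for (const [category, score] of Object.entries(scores)) {
  console.log(category, score, classifyLighthouseScore(category, score));
}
```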
Slack Channel Routing
Slack is optional. When SLACK_BOT_TOKEN is not set, Argus skips Slack entirely and auto-generates a local report.html (all findings + inline screenshots) and opens it in the default browser. No Slack setup needed to start using Argus.
When Slack is configured, findings are routed by severity:
| Severity | Channel | When |
|---|---|---|
| critical | #bugs-critical | JS exceptions, HTTP 5xx, blank page, auth failure, API called 5+ times, Lighthouse accessibility < 50, auth token in storage/URL, responsive overflow, slow API > 3s, payload > 2MB, > 100 detached DOM nodes, CORS policy violations, debugger; statements in production code, blocked mixed content (HTTP resource on HTTPS page) |
| warning | #bugs-warnings | Visual regression > 0.5%, HTTP 4xx, CSS overrides with !important, API called 3–4×, Lighthouse scores < 90, missing SEO/OG tags, missing security headers, placeholder content, touch targets too small, slow API > 1s, payload > 500KB, > 10 detached DOM nodes, redirect chains > 2 hops, broken links, sync XHR, document.write, long tasks > 50ms, SW registration failures, duplicate id attributes, passive mixed content (images/audio on HTTPS page) |
| info | #bugs-digest | Console warnings, unused CSS rules, API summaries, CSS Modules detection, empty data lists, responsive screenshot grid, missing cache headers on static assets |
Each message includes:
- Severity badge + affected URL + timestamp
- AI-generated description
- Inline screenshot (uploaded directly to Slack — no external hosting)
- View Page, Acknowledge, and Retest action buttons
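The routing table can be sketched as a lookup from severity to the channel variable that holds its ID. `groupByChannel` and the finding shape are hypothetical; only the severity names and environment variable names come from this README.

```javascript
// Hypothetical sketch: route findings to Slack channel IDs by severity.
const CHANNEL_ENV_BY_SEVERITY = {
  critical: 'SLACK_CHANNEL_CRITICAL',
  warning: 'SLACK_CHANNEL_WARNINGS',
  info: 'SLACK_CHANNEL_DIGEST',
};

function groupByChannel(findings, env) {
  const byChannel = new Map();
  for (const finding of findings) {
    const envName = CHANNEL_ENV_BY_SEVERITY[finding.severity];
    if (!envName) throw new Error(`Unknown severity: ${finding.severity}`);
    const channelId = env[envName];
    if (!channelId) throw new Error(`${envName} is not set`);
    if (!byChannel.has(channelId)) byChannel.set(channelId, []);
    byChannel.get(channelId).push(finding);
  }
  return byChannel;
}

const exampleEnv = {
  SLACK_CHANNEL_CRITICAL: 'C0000000000',
  SLACK_CHANNEL_WARNINGS: 'C0000000001',
  SLACK_CHANNEL_DIGEST: 'C0000000002',
};
console.log(groupByChannel([
  { severity: 'critical', message: 'HTTP 500 on /api/cart' },
  { severity: 'info', message: 'Unused CSS rule .legacy' },
], exampleEnv));
```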
Slack Slash Command Setup
To use /argus-retest from Slack, you need to expose the Argus server publicly.
Step 1 — Start the server
npm run server
Server runs on port 3001.
Step 2 — Expose with Cloudflare Tunnel
Download cloudflared (free, no account needed), then:
cloudflared tunnel --url http://localhost:3001
Alternatively, with no install at all (SSH tunnel via localhost.run):
ssh -R 80:localhost:3001 [email protected]
Copy the public HTTPS URL that appears.
Step 3 — Configure Slack App
- api.slack.com/apps → BugBot → Slash Commands → Create New Command:
  - Command: /argus-retest
  - Request URL: https://your-public-url/slack/commands
  - Description: Run Argus regression test on a URL
  - Usage hint: <url>
- Interactivity & Shortcuts → Enable → Request URL: https://your-public-url/slack/interactions
- OAuth & Permissions → Reinstall to Workspace
Step 4 — Test
/argus-retest http://localhost:3000
BugBot should reply within 3 seconds with a "running" acknowledgement, then post results.
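Because the server is publicly reachable, incoming requests should be checked against SLACK_SIGNING_SECRET. The sketch below follows Slack's documented v0 signing scheme (HMAC-SHA256 over `v0:<timestamp>:<raw body>`, compared with the `X-Slack-Signature` header). How Argus's own handlers perform this check is not shown in this README, so the function names and parameter object here are hypothetical.

```javascript
// Sketch of Slack request verification using only Node's built-in crypto module.
const nodeCrypto = require('node:crypto');

function signSlackRequest(signingSecret, timestamp, rawBody) {
  const base = `v0:${timestamp}:${rawBody}`;
  return 'v0=' + nodeCrypto.createHmac('sha256', signingSecret).update(base).digest('hex');
}

function verifySlackRequest({ signingSecret, timestamp, rawBody, signature, nowSeconds }) {
  const sentAt = Number(timestamp);
  // Reject malformed timestamps and requests older than five minutes (replay protection).
  if (!Number.isFinite(sentAt) || Math.abs(nowSeconds - sentAt) > 60 * 5) return false;
  const expected = Buffer.from(signSlackRequest(signingSecret, timestamp, rawBody));
  const received = Buffer.from(String(signature));
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return expected.length === received.length && nodeCrypto.timingSafeEqual(expected, received);
}

// Example with made-up values: sign a body, then verify it ten seconds later.
const exampleBody = 'command=%2Fargus-retest&text=http%3A%2F%2Flocalhost%3A3000';
const exampleSignature = signSlackRequest('example-secret', '1700000000', exampleBody);
console.log(verifySlackRequest({
  signingSecret: 'example-secret',
  timestamp: '1700000000',
  rawBody: exampleBody,
  signature: exampleSignature,
  nowSeconds: 1700000010,
}));
```

The check must run against the raw request body, before any form or JSON parsing, or the computed signature will not match.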
GitHub Actions CI Setup
Add secrets to your repository
Go to GitHub repo → Settings → Secrets and variables → Actions → add:
| Secret name | Required | Value |
|---|---|---|
| SLACK_BOT_TOKEN | No | Your xoxb-... token. Omit entirely to use Slack-optional mode — Argus generates report.html instead |
| SLACK_SIGNING_SECRET | No* | From Slack App → Basic Information (only needed for /argus-retest slash command) |
| SLACK_CHANNEL_CRITICAL | No* | Channel ID (required when Slack is configured) |
| SLACK_CHANNEL_WARNINGS | No* | Channel ID (required when Slack is configured) |
| SLACK_CHANNEL_DIGEST | No* | Channel ID (required when Slack is configured) |
| TARGET_STAGING_URL | Yes | Your staging base URL |
| GITHUB_TOKEN | No | For C2 PR integration — auto-injected by GitHub Actions as secrets.GITHUB_TOKEN |
| GITHUB_REPOSITORY | No | For C2 PR integration — owner/repo format (e.g., acme/my-app) |
C2 PR integration: when GITHUB_TOKEN and GITHUB_REPOSITORY are set, Argus posts a PR comment and commit status check for every crawl. GITHUB_PR_NUMBER is injected automatically by the workflow from github.event.pull_request.number. The included workflow does not wire these up by default — add them to the env: block in .github/workflows/argus.yml if you want PR-level comments.
The workflow at .github/workflows/argus.yml runs:
- On every push to main/master
- Daily at 6 AM UTC (before the team starts work)
- Manually via Actions → Run workflow (with optional URL override)
If critical issues are found, the pipeline fails — preventing silent regressions from being missed.
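One way to picture the fail-on-critical rule is a function that maps a report to a process exit code. The report shape (`{ findings: [{ severity }] }`) is an assumption made for the example; the JSON that Argus actually writes may be structured differently.

```javascript
// Hypothetical sketch: turn an audit report into a CI exit code.
function exitCodeForReport(report) {
  const findings = Array.isArray(report.findings) ? report.findings : [];
  const criticalCount = findings.filter((f) => f.severity === 'critical').length;
  // Any critical finding fails the job; warnings and info do not.
  return criticalCount > 0 ? 1 : 0;
}

const exampleReport = {
  findings: [
    { severity: 'warning', message: 'Missing OG tags' },
    { severity: 'critical', message: 'Uncaught TypeError on /checkout' },
  ],
};
console.log('exit code:', exitCodeForReport(exampleReport));
```

In a CI script, the result would typically be assigned to `process.exitCode` so that pending output is flushed before the job ends.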
Project Structure
argus/
├── .env # Your secrets (never commit this)
├── .env.example # Template — copy to .env
├── .gitignore
├── package.json
├── README.md
├── .claude/
│ └── settings.json # Claude Code permission config (auto-approve node/npm/reports)
├── .github/
│ └── workflows/
│ └── argus.yml # CI pipeline
├── .vscode/
│ └── mcp.json # Chrome DevTools MCP config for VS Code
├── .mcp.json # Argus MCP server registration — exposes argus_audit/argus_audit_full/argus_compare/argus_last_report to Claude
├── src/
│ ├── argus.js # Single-page audit entry point
│ ├── batch-runner.js # Multi-page batch audit
│ ├── mcp-server.js # Argus MCP server — argus_audit / argus_audit_full / argus_compare / argus_last_report
│ ├── adapters/
│ │ └── browser.js # CdpBrowserAdapter — facade over all chrome-devtools-mcp calls
│ ├── domain/
│ │ └── finding.js # createFinding() factory — canonical finding shape
│ ├── registry.js # Analyzer plugin registry — registerExpensive/getCheap/getExpensive
│ ├── config/
│ │ ├── targets.js # Routes to test, thresholds, config
│ │ └── schema.js # Zod validation schema; validateConfig() called inside runCrawl()
│ ├── orchestration/
│ │ ├── crawl-and-report.js # Backward-compat re-export shell → orchestrator + report-processor + dispatcher
│ │ ├── orchestrator.js # Crawl loop, route/flow crawl, runCrawl()
│ │ ├── report-processor.js # Dedup → severity overrides → baseline → JSON write
│ │ ├── dispatcher.js # Slack / GitHub / HTML dispatch
│ │ ├── env-comparison.js # Dev vs staging diff + CSS analysis mode
│ │ ├── watch-mode.js # Passive browser monitoring (WatchSession + runWatchMode)
│ │ └── slack-notifier.js # Slack Block Kit dispatcher
│ ├── server/
│ │ ├── index.js # Express server (port 3001)
│ │ ├── slash-command-handler.js # /argus-retest handler
│ │ └── interaction-handler.js # Acknowledge + Retest button handler
│ ├── utils/
│ │ ├── css-analyzer.js # CSS analysis script injected into the browser
│ │ ├── seo-analyzer.js # SEO checks: meta, OG tags, h1, canonical, viewport
│ │ ├── security-analyzer.js # Security: localStorage tokens, eval(), headers, cookies
│ │ ├── content-analyzer.js # Content quality: null text, placeholders, broken images
│ │ ├── responsive-analyzer.js # Responsive: overflow + touch targets at 4 breakpoints
│ │ ├── memory-analyzer.js # Memory leaks: V8 heap snapshot + heap growth
│ │ ├── logger.js # Pino structured logger — childLogger(module)
│ │ ├── retry.js # withRetry() exponential backoff — navigate/fill only; Number.isFinite guard
│ │ ├── telemetry.js # OTel tracing + metrics — startSpan() / recordFinding() / recordFlaky() / recordNewFindings(); no-op default
│ │ ├── session-manager.js # Auth: backward-compat re-export barrel
│ │ ├── session-persistence.js # Auth: saveSession (mkdirSync+atomic write), restoreSession, hasSession, clearSession
│ │ ├── login-orchestrator.js # Auth: runLoginFlow, refreshSession + lock file
│ │ ├── baseline-manager.js # Baselines: loadBaseline, saveBaseline, applyBaseline, appendTrend
│ │ ├── flakiness-detector.js # Flakiness: mergeRunResults — confirmed vs flaky per double-crawl
│ │ ├── flow-runner.js # User flow assertions: runFlow / runAllFlows — assert DSL
│ │ ├── html-reporter.js # HTML dashboard: generateHtmlReport() + npm run report:html (D7.1 / D7.7)
│ │ ├── parallel-crawler.js # chunkArray sharding utility (ARGUS_CONCURRENCY=N parallel crawl)
│ │ ├── contract-validator.js # API contract validation: validateSchema, matchesContract (D7.4)
│ │ ├── severity-overrides.js # Severity policy overrides: applyOverrides (D7.5)
│ │ ├── slack-guard.js # Slack-optional guard: isSlackConfigured() (D7.7)
│ │ ├── hover-analyzer.js # Hover-state bug detection — aria-haspopup + data-tooltip (D8.1)
│ │ ├── snapshot-analyzer.js # Accessibility tree snapshot — missing names, labels, landmarks, heading hierarchy, ARIA state (D8.2 + v6)
│ │ ├── issues-analyzer.js # Chrome DevTools Issues panel — CSP/deprecated/cookie issues
│ │ ├── network-timing-analyzer.js # HAR timing analysis — slow third-party detection
│ │ ├── keyboard-analyzer.js # Keyboard Tab-walk — focus_visible_missing, focus_lost
│ │ ├── codebase-analyzer.js # Codebase cross-reference — env vars, feature flags, dead routes (C1)
│ │ ├── github-reporter.js # GitHub PR comment + commit status integration (C2)
│ │ ├── route-discoverer.js # Auto route discovery — sitemap + Next.js + React Router (C3)
│ │ ├── diff.js # pixelmatch screenshot + DOM/network diff utilities
│ │ ├── mcp-parsers.js # Text-format parsers for list_console_messages + list_network_requests (v9)
│ │ └── mcp-client.js # Headless JSON-RPC MCP client for CI mode
│ └── cli/
│ └── init.js # argus init setup wizard — detect framework, discover routes, write .env + targets.js (C4)
├── test/
│ └── unit/ # Vitest unit tests — no Chrome required
│ ├── finding.test.js # createFinding() — fields, throws, frozen, extra fields (8 tests)
│ ├── config-schema.test.js # validateConfig() + ConfigSchema.safeParse (8 tests)
│ ├── report-processor.test.js # deduplicateFindings + rebuildSummary (11 tests)
│ ├── flakiness-detector.test.js # findingKey normalization + mergeRunResults (13 tests)
│ ├── baseline-manager.test.js # loadBaseline/saveBaseline/applyBaseline (9 tests)
│ └── flow-runner.test.js # normalizeArray (pure) + runFlow mock browser (11 tests)
├── landing/ # Product landing page (React 18 + Vite + Tailwind + Framer Motion)
│ ├── src/
│ │ ├── App.jsx # Single-page app — hero, features, comparison, waitlist + enterprise modals
│ │ └── supabase.js # Supabase client factory (null-safe when env vars missing)
│ ├── public/
│ │ ├── favicon.svg # SVG favicon — purple ring + dot
│ │ ├── argus-poster.png # Video poster fallback (1918×1078)
│ │ ├── og-image-v2.jpg # OG social card — 1200×630 JPEG, branded overlay, black-outlined stat numbers
│ │ ├── robots.txt # Allows all crawlers; Sitemap reference
│ │ └── sitemap.xml # Canonical URL for argus-qa.com/
│ ├── index.html # Vite entry; OG/Twitter/JSON-LD SEO tags; canonical; favicon
│ ├── package.json
│ ├── .env.example # VITE_SUPABASE_URL + VITE_SUPABASE_ANON_KEY template
│ └── README.md # Setup guide, Supabase SQL schema, env vars, deployment
├── scripts/
│ └── dispatch-report.js # Standalone Slack re-dispatch script (re-posts last report.json to Slack)
├── test-harness/ # Fixture server + test runner (82 blocks, 348 hard assertions, 54 fixture pages)
│ ├── README.md
│ ├── server.js # Express fixture server (ports 3100 dev / 3101 staging)
│ ├── harness-config.js # Route definitions + expected findings
│ ├── validate.js # Test runner — 82 numbered blocks ([80] MCP server, [81] createFinding, [82] withRetry)
│ ├── pages/ # 54 fixture pages (one per detection category)
│ ├── nextjs-fixture/ # Next.js app structure for C3 discovery tests (10 files)
│ ├── source-fixture/ # Minimal app.js for C1 codebase-analyzer tests (env var audit)
│ └── static/
│ └── button-styles.css # BEM card selectors in button file → component leak
└── reports/ # Output: JSON reports + screenshots (gitignored)
├── baselines/
│ ├── <branch>.json # Per-route finding keys — per git branch (D7.2)
│ └── <branch>-trends.json # Append-only run history per branch (D7.2)
└── .gitkeep
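The tree lists `parallel-crawler.js` as a chunking utility for ARGUS_CONCURRENCY. As a rough illustration of that idea, the sketch below splits a route list into at most N contiguous shards. The function names, signatures, and the contiguous split are assumptions, not the module's actual API.

```javascript
// Hypothetical sketch of route sharding for parallel crawling.
function chunkArraySketch(items, size) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer');
  }
  const chunks = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

function shardRoutes(routes, concurrency) {
  // Invalid or missing concurrency falls back to 1 (sequential), matching the documented default.
  const workers = Number.isInteger(concurrency) && concurrency > 0 ? concurrency : 1;
  // Chunk size is rounded up so the number of shards never exceeds the worker count.
  const size = Math.max(1, Math.ceil(routes.length / workers));
  return chunkArraySketch(routes, size);
}

console.log(shardRoutes(['/', '/login', '/dashboard', '/settings', '/checkout'], 2));
```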
Key Technical Decisions
| Decision | Choice | Reason |
|---|---|---|
| Screenshot comparison | pixelmatch + AI classification | pixelmatch is fast and deterministic; Claude removes false positives from anti-aliasing and dynamic content |
| Slack API | Bot API, not Incoming Webhooks | Bot API supports file uploads, message updates, interactive buttons, and threads |
| File uploads | files.getUploadURLExternal + PUT + files.completeUploadExternal | files.upload is deprecated; pre-signed URL requires PUT — POST silently produces broken files |
| CSS analysis | Script injected via evaluate_script | Runs in page context so it sees the live computed styles, CSS Modules hashes, and React fiber properties |
| Responsive viewport | emulate (not resize_page) | resize_page only resizes the browser window and does not update CSS viewport width — emulate is the correct API |
| Viewport width measurement | document.documentElement.clientWidth | After emulate with mobile flag, window.innerWidth returns the legacy layout viewport (~952px), not the device width |
| V8 heap snapshot | take_memory_snapshot({ filePath }) → read from disk | The MCP tool writes JSON to disk (not inline); parse with JSON.parse(fs.readFileSync(filePath)) then delete the temp file |
| Detached DOM detection | Walk flat nodes array for "Detached " prefix in strings table | Chrome serializes detached elements as "Detached HTMLDivElement" etc.; secondary check on detachedness === 2 (Chrome 90+) |
| Baseline finding key | type::message[:100]::status | Excludes timestamps and dynamic URL path IDs; message truncated to 100 chars to handle slight wording variations; ::status suffix only added when non-null |
| Baseline alert filter | isNew === true (strict) | Only findings explicitly marked new by applyBaseline are dispatched to Slack — prevents stale re-dispatch if baseline-manager is not called (fails silently rather than spamming) |
| Flakiness routing | severity: 'info' for flaky findings | Downgrading severity means existing dispatchToSlack routing sends them to the info digest with zero routing changes — only the :zap: _flaky_ label needed |
| Private findingKey per module | Each of baseline-manager.js and flakiness-detector.js has its own copy | Avoids coupling two independently-useful modules via a shared export for a trivial 3-line function |
| Runtime anti-pattern injection | addScriptToEvaluateOnNewDocument via MCP | Scripts registered this way run in the new page context before any page script — intercepts XMLHttpRequest.open, document.write, and navigator.serviceWorker.register before the page can call them |
| CORS error detection | list_console_messages + text match, not in-page intercept | CORS errors are generated by the browser itself, not by page JS — console.error patcher misses them; the MCP console log captures them |
| Long task detection | PerformanceObserver({ entryTypes: ['longtask'] }) injected before load | Only the duration is included in the finding message (not startTime) — ensures identical tasks on two crawl runs produce the same dedup key |
| CI MCP client | JSON-RPC over stdio | In CI there's no Claude Code agent — the headless client replaces it with the same API surface |
| Node.js | v20.19+ | Minimum required by Chrome DevTools MCP |
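Two rows in this table, the baseline finding key and flakiness routing, are easier to follow with a small example. The sketch below builds a key in the `type::message[:100]::status` form and then merges two crawl runs, downgrading findings seen in only one run to `info`. The function names, the `flaky` field, and the finding shape are assumptions for illustration.

```javascript
// Hypothetical sketch of the finding key and double-crawl merge described above.
function sketchFindingKey(finding) {
  const message = String(finding.message || '').slice(0, 100);
  const base = `${finding.type}::${message}`;
  // The ::status suffix is only appended when a status is present.
  return finding.status == null ? base : `${base}::${finding.status}`;
}

function mergeRunsSketch(firstRun, secondRun) {
  const firstKeys = new Set(firstRun.map(sketchFindingKey));
  const secondKeys = new Set(secondRun.map(sketchFindingKey));
  const merged = [];
  for (const finding of firstRun) {
    // Seen in both runs: confirmed. Seen once: flaky, downgraded to info.
    merged.push(secondKeys.has(sketchFindingKey(finding))
      ? { ...finding, flaky: false }
      : { ...finding, flaky: true, severity: 'info' });
  }
  for (const finding of secondRun) {
    if (!firstKeys.has(sketchFindingKey(finding))) {
      merged.push({ ...finding, flaky: true, severity: 'info' });
    }
  }
  return merged;
}

const runA = [
  { type: 'network_error', message: 'GET /api/cart failed', status: 500, severity: 'critical' },
  { type: 'console_error', message: 'ResizeObserver loop limit exceeded', status: null, severity: 'warning' },
];
const runB = [
  { type: 'network_error', message: 'GET /api/cart failed', status: 500, severity: 'critical' },
];
console.log(mergeRunsSketch(runA, runB));
```

Keying on type, truncated message, and status keeps the key stable across runs, which is what lets the two crawls be compared at all.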
Known MCP Tool Limitations
The Chrome DevTools MCP behavioral constraints below cause 3 permanent test failures in the harness (345/348 pass). These are MCP-layer restrictions — they cannot be fixed in Argus code.
type_text clarification: type_text does fire DOM input events when the element is properly focused first with mcp.click({ uid }). Always use uid-based focus — passing { selector } to mcp.click silently does nothing.
| Tool | Constraint | Impact |
|---|---|---|
| drag | Uses mouse simulation, not HTML5 DnD API | dragstart/dragover/drop events never fire |
| list_console_messages({ types: ['issue'] }) | Issues panel returns empty even when violations exist | CSP and deprecated-API detection is unreliable |
These constraints are documented with workarounds in SKILL.md §10.
Environment Variables Reference
| Variable | Required | Description |
|---|---|---|
| SLACK_BOT_TOKEN | No | xoxb-... Bot User OAuth Token. Omit to enable Slack-optional mode — Argus generates report.html and opens it in the browser instead |
| SLACK_SIGNING_SECRET | No* | Verifies slash command / interaction requests from Slack (required only when using /argus-retest) |
| SLACK_CHANNEL_CRITICAL | No* | Channel ID for critical bugs (required when Slack is configured) |
| SLACK_CHANNEL_WARNINGS | No* | Channel ID for warnings (required when Slack is configured) |
| SLACK_CHANNEL_DIGEST | No* | Channel ID for info / daily digest (required when Slack is configured) |
| TARGET_DEV_URL | Yes | Base URL of your dev environment |
| TARGET_STAGING_URL | No | Base URL of staging. If blank → CSS analysis mode |
| SCREENSHOT_DIFF_THRESHOLD | No | Pixel diff % to flag (default: 0.5) |
| REPORT_OUTPUT_DIR | No | Where to write reports (default: ./reports) |
| ARGUS_CONCURRENCY | No | Number of parallel MCP clients for route crawling (default: 1 = sequential) |
| PORT | No | Server port (default: 3001) |
| ARGUS_LOG_LEVEL | No | Pino log level — trace, debug, info, warn, error, fatal (default: info) |
| ARGUS_LOG_PRETTY | No | Set to 1 for human-readable log output instead of JSON (dev mode) |
| ARGUS_RETRY_ATTEMPTS | No | Max retry attempts for navigate/fill MCP calls (default: 3) |
| OTEL_EXPORTER_OTLP_ENDPOINT | No | OTLP collector endpoint — enables span/metric export to Jaeger, Grafana Tempo, Datadog, etc. |
| ARGUS_OTEL_CONSOLE | No | Set to 1 to print OTel spans to stdout without an OTLP endpoint (dev tracing) |
| ARGUS_WATCH_INTERVAL_MS | No | Watch mode poll interval in milliseconds (default: 1000) |
| ARGUS_SOURCE_DIR | No | Path to your app's source directory — enables codebase cross-reference (env var detection, feature flag leakage, dead routes) |
| ARGUS_ENV_FILE | No | Path to your app's .env file — C1 cross-references env vars used in source code against this file to detect missing declarations |
| GITHUB_TOKEN | No | GitHub personal access token — required for PR comment + commit status integration |
| GITHUB_REPOSITORY | No | Repository in owner/repo format — required for GitHub PR integration |
| GITHUB_SHA | No | Commit SHA for the commit status check — injected automatically by GitHub Actions (${{ github.sha }}) |
| GITHUB_PR_NUMBER | No | PR number for comment targeting — set via ${{ github.event.pull_request.number }} in your workflow |
| ARGUS_REPORT_URL | No | Full URL to the hosted HTML report — linked from the GitHub commit status check |
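ARGUS_RETRY_ATTEMPTS controls a retry loop with exponential backoff. The sketch below shows the general technique under stated assumptions: the 200 ms base delay, the 5 s cap, and all function names are illustrative and are not taken from Argus's `retry.js`.

```javascript
// Hypothetical sketch of retry with exponential backoff.
function resolveRetryAttempts(env) {
  const parsed = Number(env.ARGUS_RETRY_ATTEMPTS);
  // Guard against NaN, Infinity, and values below 1; the documented default is 3.
  return Number.isFinite(parsed) && parsed >= 1 ? Math.floor(parsed) : 3;
}

function backoffDelayMs(attempt, baseMs = 200, maxMs = 5000) {
  // attempt is 1-based: 200, 400, 800, ... capped at maxMs.
  return Math.min(maxMs, baseMs * 2 ** (attempt - 1));
}

async function withRetrySketch(fn, options = {}) {
  const attempts = options.attempts || 3;
  const sleep = options.sleep || ((ms) => new Promise((resolve) => setTimeout(resolve, ms)));
  let lastError;
  for (let attempt = 1; attempt <= attempts; attempt += 1) {
    try {
      return await fn(attempt);
    } catch (error) {
      lastError = error;
      // Wait before the next try, but not after the final failure.
      if (attempt < attempts) await sleep(backoffDelayMs(attempt));
    }
  }
  throw lastError;
}

// Example: an operation that fails twice, then succeeds on the third attempt.
withRetrySketch(async (attempt) => {
  if (attempt < 3) throw new Error(`attempt ${attempt} failed`);
  return `succeeded on attempt ${attempt}`;
}, { attempts: resolveRetryAttempts({}), sleep: async () => {} }).then(console.log);
```

Passing `sleep` as an option keeps the example fast and makes the loop easy to test without real timers.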
Troubleshooting
Chrome DevTools MCP not connecting
claude mcp add chrome-devtools -- npx chrome-devtools-mcp@latest
# Then restart Claude Code
Slack messages not posting
- Confirm SLACK_BOT_TOKEN starts with xoxb- (not xoxp-, xoxe-, or xapp-)
- Verify BugBot is invited to each channel: /invite @BugBot
- Check token scopes: chat:write, files:write, files:read
Screenshots not appearing in Slack messages
- The upload uses a pre-signed URL that requires PUT, not POST — if you see a broken image, check that the Slack token has files:write scope and the channel is correct
Slash command returns "dispatch_failed"
- Your tunnel URL has changed (Cloudflare Tunnel / localhost.run URLs change on restart)
- Update the Request URL in Slack App → Slash Commands and reinstall
CSS analysis returns empty results
- Page may be behind auth — make sure you're logged in on the Chrome instance Argus is controlling
- Cross-origin stylesheets (CDN fonts, third-party widgets) can't be read due to browser security restrictions — this is expected
Screenshots are blank
- Page hasn't finished loading — increase pageSettleMs in src/config/targets.js
- Add a waitFor selector for that route
CI pipeline fails immediately
- Chrome may not be starting fast enough — increase the sleep 3 after Chrome launch to sleep 5 in .github/workflows/argus.yml
How Argus Differs From Playwright / Cypress
Argus is not a replacement for unit or E2E tests. It's a complementary layer:
| | Playwright / Cypress | Argus |
|---|---|---|
| Tests | Your logic and API contracts | What the user actually sees |
| Catches | Regression in behaviour | CSS drift, visual regressions, API redundancy, console noise, perf budgets |
| Runs | In your test suite | Continuously, on the live running app |
| Setup | Write test files | Configure routes in targets.js |
| Output | Pass / fail | Structured Slack reports with screenshots and action buttons |
They complement each other — Argus catches what test suites miss.