QA Radar

An MCP server that tells your agent which files to test first, based on git changes, coverage gaps, and a test map, with a risk score for each file.

Documentation

QA Radar

Give your AI coding agent the quality brain it doesn't have to grow from scratch.

QA Radar analyzes your codebase and produces a structured quality health report — combining git churn, test coverage, and test-to-source mapping into risk-scored modules. It works as an MCP server for AI coding agents (Claude Code, Cursor, Windsurf) and as a standalone CLI for humans and CI pipelines.

Built for developers who want their AI agent to write targeted tests, not generic ones.

Quick Start

Claude Code, two commands:

/plugin marketplace add Muratkus/qaradar
/plugin install qaradar@qaradar-marketplace

Then ask your agent: "What should I test first?"

Or run directly without installing:

uvx qaradar serve

Full install options ↓

What It Does

QA Radar answers the question every new team member (and every AI agent) asks: "What should I test first?"

It scans three signals and combines them into a per-file risk score:

| Signal | What It Measures | Why It Matters |
|---|---|---|
| Git Churn | Commit frequency, lines changed, recency | High-churn files are regression magnets |
| Coverage Gaps | Line & branch coverage from existing reports | Low coverage = blind spots |
| Test Mapping | Which source files have corresponding tests | No tests = no safety net at all |

The output is a ranked list of modules by risk level (critical → low), with human-readable reasons for each rating.
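
To make the idea of a combined per-file score concrete, here is a minimal sketch of a weighted sum over the three signals. The weights, thresholds, and names (`WEIGHTS`, `LEVELS`, `FileSignals`) are assumptions made up for illustration. They are not QA Radar's actual formula or API.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical weights and level thresholds, chosen only for illustration.
WEIGHTS = {"churn": 0.4, "coverage_gap": 0.4, "untested": 0.2}
LEVELS = [(0.75, "critical"), (0.5, "high"), (0.25, "medium"), (0.0, "low")]


@dataclass
class FileSignals:
    churn: float               # 0..1, where 1 is the most-changed file in the repo
    coverage: Optional[float]  # line coverage 0..1, or None if no report covers the file
    has_tests: bool


def risk_score(s: FileSignals) -> float:
    """Weighted sum of the three signals, each scaled to 0..1."""
    # Assumption: a file with no coverage data counts as fully uncovered.
    gap = 1.0 - (s.coverage if s.coverage is not None else 0.0)
    untested = 0.0 if s.has_tests else 1.0
    score = (WEIGHTS["churn"] * s.churn
             + WEIGHTS["coverage_gap"] * gap
             + WEIGHTS["untested"] * untested)
    return round(score, 2)


def risk_level(score: float) -> str:
    """Return the first level whose threshold the score meets."""
    for threshold, name in LEVELS:
        if score >= threshold:
            return name
    return "low"
```

With these made-up weights, a file at maximum churn with no coverage and no tests scores 1.0 and is rated critical.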

Why Not Just Let the Agent Do It?

A capable agent with bash access could run git log --numstat, parse coverage.xml, and glob for test files. So why an MCP server?

| Concern | What QA Radar does instead |
|---|---|
| Token cost | git log over 90 days on a medium repo is hundreds of KB. QA Radar returns ~5 KB of structured JSON. |
| Determinism | A weighted risk score computed ad-hoc in-context is unreliable. Code is reproducible. |
| Speed | One tool call vs. 4–6 sequential bash calls + reasoning between each. |
| Format normalization | LCOV / Cobertura / coverage.py JSON / Go cover profiles all parse differently. QA Radar normalizes across formats so the agent doesn't have to. |
| Convention encoding | test_x.py for Python, x.test.ts for JS/TS, x_test.go for Go, FooTest.java for Java; encoded once, not re-derived each session. |
| Portability | The same MCP tools work across Claude Code, Cursor, and Windsurf without re-prompting. |
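
As an illustration of "convention encoding", the four naming rules listed in the table can be written down once as a small lookup. This is a sketch, not QA Radar's code: the function name `candidate_test_names` is hypothetical, and it covers only the four conventions named above.

```python
from pathlib import PurePosixPath


def candidate_test_names(source_path: str) -> list[str]:
    """Return the test file names that conventionally pair with a source file."""
    p = PurePosixPath(source_path)
    stem, ext = p.stem, p.suffix
    if ext == ".py":
        return [f"test_{stem}.py"]        # test_x.py
    if ext in {".ts", ".tsx", ".js", ".jsx"}:
        return [f"{stem}.test{ext}"]      # x.test.ts
    if ext == ".go":
        return [f"{stem}_test.go"]        # x_test.go
    if ext == ".java":
        return [f"{stem}Test.java"]       # FooTest.java
    return []                             # unknown language: no convention encoded
```

A test-mapping pass could then look for any of these names among the repo's files to decide whether a source file has a corresponding test.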

Install as Claude Code Plugin (Recommended)

The fastest path: two short commands wire up the MCP server and install 5 slash commands. No manual config editing.

Step 0 — install uv (if you don't have it):

curl -LsSf https://astral.sh/uv/install.sh | sh
# or: pip install uv

uv launches qaradar on demand from PyPI — you don't need to pip install qaradar separately.

Step 1 — add the marketplace:

/plugin marketplace add Muratkus/qaradar

Step 2 — install:

/plugin install qaradar@qaradar-marketplace

What you get: 7 MCP tools auto-configured + 5 slash commands:

| Command | What it does |
|---|---|
| /qaradar:qa-check | Full health report: risk, coverage, untested files |
| /qaradar:qa-risky | Ranked list of riskiest files with reasons |
| /qaradar:qa-untested | Source files with no detected tests + scaffold suggestions |
| /qaradar:qa-plan | Prioritized test plan (chains 3 tools) |
| /qaradar:qa-pr-risk | Which changed files in this PR are riskiest |

Example: after merging a big feature branch, run /qaradar:qa-check to see what regressed. Before opening a PR, run /qaradar:qa-pr-risk to see what you need to test first.

MCP Server (for AI Coding Agents)

Setup

Alternative: manual MCP config (if you prefer not to use the plugin):

Add to your Claude Code MCP config (~/.claude/mcp.json for user-level, or .mcp.json in the project root for project-level):

{
  "mcpServers": {
    "qaradar": {
      "command": "uvx",
      "args": ["qaradar", "serve"]
    }
  }
}

Or start it manually:

uvx qaradar serve

Example Prompts

Once connected, ask your agent:

  • "What should I test first in this repo?"
  • "Which files are the riskiest right now?"
  • "Show me the highest-churn files from the last month."
  • "Which source files have no tests at all?"
  • "Which of my changed files are risky?" ← diff-aware

Available MCP Tools

| Tool | When the Agent Uses It |
|---|---|
| qaradar_healthcheck | Full quality overview of a repository |
| qaradar_risky_modules | What to test first; which files are riskiest |
| qaradar_churn | Hotspot detection; where regressions tend to occur |
| qaradar_coverage_gaps | Files with low coverage; where the blind spots are |
| qaradar_untested_files | Source files with no corresponding test files |
| qaradar_pr_risk | Which changed files in this PR need attention |
| qaradar_should_run | After finishing work: should QA Radar re-analyze, and over the diff or the whole repo? |

Diff-aware: what's risky in this PR?

qaradar_pr_risk scores only the files changed between a base ref and HEAD — not the whole repo. It keeps risk scores calibrated by using full-repo normalization, so a file with 2 commits in a PR isn't falsely flagged CRITICAL just because it's the only changed file the agent knows about.
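
The effect of full-repo normalization can be shown with a few made-up commit counts. This is a simplified sketch that assumes churn is scaled by the busiest file's commit count; the function name `normalize_churn` and the data are hypothetical, and QA Radar's real normalization may differ.

```python
def normalize_churn(commits_by_file: dict[str, int], files: list[str]) -> dict[str, float]:
    """Scale commit counts for `files` against the busiest file in `commits_by_file`."""
    busiest = max(commits_by_file.values(), default=0)
    if busiest == 0:
        return {f: 0.0 for f in files}
    return {f: commits_by_file.get(f, 0) / busiest for f in files}


# Made-up history: one hot file, one warm file, one quiet file touched by the PR.
repo_commits = {"src/payments/core.py": 34, "src/auth/tokens.py": 12, "docs/util.py": 2}
changed = ["docs/util.py"]

# Normalized against the whole repo, the quiet file stays near the bottom of the scale.
full_repo = normalize_churn(repo_commits, changed)

# Normalized against the diff alone, the same file would look like the hottest one.
diff_only = normalize_churn({f: repo_commits[f] for f in changed}, changed)
```

Here `full_repo` puts `docs/util.py` at 2/34 (about 0.06), while `diff_only` would put it at 1.0. That second result is the false CRITICAL that full-repo normalization avoids.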

Ask your agent:

  • "Which of my changed files are risky?"
  • "Do any of the files I changed lack tests?"
  • "What should I review before opening this PR?"

Or from the CLI:

# Diff against main — shows only changed files
qaradar analyze . --base main

# Diff against a specific ref
qaradar analyze . --base origin/main --days 60

qaradar_pr_risk auto-detects the base branch from GITHUB_BASE_REF (set automatically in GitHub Actions) or falls back to main/master. Pass base_ref explicitly to override.
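
The resolution order described above (explicit base_ref, then GITHUB_BASE_REF, then main/master) might look roughly like this. The function `resolve_base_ref` and its parameters are hypothetical. In particular, passing the list of existing branches in as an argument is an assumption made to keep the sketch free of git calls.

```python
import os
from typing import Iterable, Mapping, Optional


def resolve_base_ref(
    existing_branches: Iterable[str],
    explicit: Optional[str] = None,
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Pick the base ref: explicit override, then CI variable, then main/master."""
    env = os.environ if env is None else env
    if explicit:
        return explicit                   # caller passed base_ref
    from_ci = env.get("GITHUB_BASE_REF")  # set by GitHub Actions on pull requests
    if from_ci:
        return from_ci
    existing = set(existing_branches)
    for name in ("main", "master"):       # conventional default branches
        if name in existing:
            return name
    return None                           # nothing to diff against
```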

CLI

# Full health check on current directory
qaradar analyze

# Analyze a specific repo with 180 days of history
qaradar analyze /path/to/repo --days 180

# Output as JSON (for piping to other tools)
qaradar analyze --json-output

# Show top 10 risky modules only
qaradar analyze --top 10

# Diff-aware: score only files changed since main
qaradar analyze . --base main

Install

pip install qaradar

Or run without installing:

uvx qaradar serve

From source (for development):

git clone https://github.com/Muratkus/qaradar.git
cd qaradar
pip install -e .

Language Support

All language support lives in one registry — qaradar/analyzers/languages.py — so adding a language is a single entry (extensions, test-name convention, test-function counter), consumed by both churn and test-mapping.

Tier 1 — First-class, tested

| Language | Test detection | Coverage |
|---|---|---|
| Python | test_x.py, x_test.py | coverage.py JSON + XML |
| JavaScript / TypeScript | x.test.*, x.spec.*, x-test.* (React Native) | LCOV, Jest/Istanbul JSON |
| Go | x_test.go | Go cover profile (cover.out) |
| Swift | XTests.swift (XCTest func test…) | Cobertura / LCOV |
| Kotlin | XTest.kt (@Test) | Cobertura / LCOV |
| Dart / Flutter | x_test.dart (test(, testWidgets() | LCOV (coverage/lcov.info) |
| Objective-C | XTests.m / .mm (XCTest - (void)test…) | Cobertura / LCOV |

Tier 2 — Best-effort, naming-based

Java, Ruby, Rust — test detection via naming conventions. Coverage via Cobertura XML or LCOV if emitted.

Coverage parsing is format-driven, so it spans more ecosystems than test-mapping detection, which is language-specific.

Monorepos: Istanbul/Jest reports are auto-discovered under packages/*/coverage and apps/*/coverage, and absolute/package-relative coverage paths are normalized to repo-relative so they join correctly against churn and test-mapping signals.
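
Path normalization of this kind can be sketched as follows. The helper `to_repo_relative` and its `package_dir` parameter are hypothetical, and the sketch assumes POSIX-style paths. It only shows why an absolute path and a package-relative path for the same file should end up as the same repo-relative key.

```python
from pathlib import PurePosixPath


def to_repo_relative(coverage_path: str, repo_root: str, package_dir: str = "") -> str:
    """Rewrite a path from a coverage report so it is relative to the repo root."""
    p = PurePosixPath(coverage_path)
    if p.is_absolute():
        try:
            return str(p.relative_to(PurePosixPath(repo_root)))
        except ValueError:
            return str(p)  # outside the repo: leave untouched
    # Package-relative path: prefix the package directory the report was found in.
    return str(PurePosixPath(package_dir) / p)
```

Both `/repo/packages/ui/src/Button.tsx` (absolute) and `src/Button.tsx` reported from `packages/ui` come out as `packages/ui/src/Button.tsx`, so they join against the same churn and test-mapping rows.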

Supported Coverage Formats

| Format | Tools |
|---|---|
| coverage.py JSON | Python coverage run + coverage json |
| Istanbul / Jest JSON | coverage-final.json, coverage-summary.json (Jest/Vitest/nyc) |
| Cobertura XML | Python, Java/Gradle, .NET (Coverlet) |
| LCOV | JS/TS, Flutter/Dart, C/C++, Rust (grcov) |
| Go cover profile | go test -coverprofile=cover.out |
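
To show what parsing one of these formats involves, here is a minimal LCOV reader that reduces a tracefile to per-file line coverage. It is a sketch rather than QA Radar's parser: the function name `parse_lcov` is hypothetical, it reads only the `SF:`, `DA:`, and `end_of_record` records, and it ignores branch and function data.

```python
def parse_lcov(text: str) -> dict[str, float]:
    """Return line coverage in percent per source file from LCOV tracefile text."""
    coverage: dict[str, float] = {}
    current = None   # path from the most recent SF: record
    found = hit = 0  # instrumented lines seen / lines executed at least once
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("SF:"):
            current, found, hit = line[3:], 0, 0
        elif line.startswith("DA:") and current is not None:
            # DA:<line number>,<execution count>[,<checksum>]
            count = int(line[3:].split(",")[1])
            found += 1
            if count > 0:
                hit += 1
        elif line == "end_of_record" and current is not None:
            if found:  # skip files with no instrumented lines
                coverage[current] = round(100.0 * hit / found, 1)
            current = None
    return coverage
```

Each of the other formats needs its own reader of roughly this size. That duplication is the cost format normalization is meant to take off the agent.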

Example Output

╭──────────────── QA Radar Health Report ─────────────────╮
│ Repository: /home/user/my-service                       │
│ Source files: 47  Test files: 23  Ratio: 0.49           │
│ Avg coverage: 62.3%  Tested: 31  Untested: 16           │
╰─────────────────────────────────────────────────────────╯

  CRITICAL risk modules: 3
  HIGH risk modules: 7

┌─────────────────────────────────────────────────────────┐
│ Risky Modules                                           │
├──────────────────────┬──────────┬───────┬───────────────┤
│ File                 │ Risk     │ Score │ Reasons       │
├──────────────────────┼──────────┼───────┼───────────────┤
│ src/payments/core.py │ CRITICAL │  0.87 │ High churn:   │
│                      │          │       │ 34 commits;   │
│                      │          │       │ No tests      │
│ src/auth/tokens.py   │ CRITICAL │  0.82 │ Low coverage: │
│                      │          │       │ 12.3%; Active │
│                      │          │       │ recently      │
└──────────────────────┴──────────┴───────┴───────────────┘

Tracking Runs Over Time

By default QA Radar is stateless. Opt in to persistence to track a repo across runs and drive incremental re-analysis (daily/weekly, or after N diffs, or after an agent finishes work).

qaradar analyze . --save      # record a snapshot to .qaradar/state.json (gitignore it)
qaradar should-run .          # exit 0 if a re-run is warranted, 1 if not — prints JSON
qaradar status .              # last run, commits/days since, current decision + risk delta

should-run is a gate, not a scheduler — wire it into whatever you already use:

# cron / CI / git hook: only do expensive work when criteria are met
qaradar should-run . && qaradar analyze . --save

It reports scope: "full" (interval elapsed) or scope: "diff" (enough files changed), so an agent calling the qaradar_should_run MCP tool knows whether to follow up with qaradar_healthcheck or qaradar_pr_risk. State is one .qaradar/state.json per repo, so a "collection of repos" is just a loop over repos in your own infra.

Tune the criteria in qaradar.toml:

[schedule]
interval_days = 7        # re-run the full healthcheck at least weekly
min_changed_files = 25   # ...or sooner, once this many files have changed
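
The gate logic these two settings control can be sketched as below. The names `Decision` and `decide` are hypothetical, and two behaviours are assumptions: the interval check runs before the changed-files check, and a repo with no saved run gets a full analysis. The defaults simply mirror the example values above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Decision:
    should_run: bool
    scope: Optional[str]  # "full", "diff", or None when no run is warranted


def decide(
    last_run: Optional[datetime],
    now: datetime,
    changed_files: int,
    interval_days: int = 7,
    min_changed_files: int = 25,
) -> Decision:
    """Decide whether to re-analyze and over which scope."""
    # Assumption: with no previous snapshot, a full run is always warranted.
    if last_run is None or now - last_run >= timedelta(days=interval_days):
        return Decision(True, "full")
    if changed_files >= min_changed_files:
        return Decision(True, "diff")
    return Decision(False, None)


def exit_code(decision: Decision) -> int:
    """Mirror the CLI contract: 0 if a re-run is warranted, 1 if not."""
    return 0 if decision.should_run else 1
```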

--save also reports a delta vs the previous run — which files newly became risky, which got worse, which improved or resolved.
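
A delta of this kind can be computed by comparing two score snapshots. The sketch below uses plain `{path: score}` dicts and a made-up risk threshold of 0.5. The layout of .qaradar/state.json is not documented here, so the function `risk_delta` and its inputs are assumptions for illustration.

```python
def risk_delta(
    previous: dict[str, float],
    current: dict[str, float],
    threshold: float = 0.5,  # hypothetical cut-off for "risky"
) -> dict[str, list[str]]:
    """Classify files by how their risk changed between two runs."""
    delta: dict[str, list[str]] = {
        "newly_risky": [], "worse": [], "improved": [], "resolved": [],
    }
    for path, score in sorted(current.items()):
        before = previous.get(path)
        was_risky = before is not None and before >= threshold
        is_risky = score >= threshold
        if is_risky and not was_risky:
            delta["newly_risky"].append(path)
        elif is_risky and score > before:
            delta["worse"].append(path)
        elif is_risky and score < before:
            delta["improved"].append(path)
        elif was_risky and not is_risky:
            delta["resolved"].append(path)
    # Assumption: a risky file that no longer exists counts as resolved.
    for path, before in sorted(previous.items()):
        if path not in current and before >= threshold:
            delta["resolved"].append(path)
    return delta
```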

Roadmap

  • v0.1.2 — Claude Code plugin + slash commands
  • v0.2.0 — Config file (qaradar.toml), Tier 2 language validation, hardening
  • v0.3.0 — Diff-aware mode: qaradar_pr_risk + --base CLI flag
  • v0.4.0 — Mobile/monorepo language coverage (Swift, Kotlin, Obj-C, Dart, React Native, Jest); run persistence + re-run criteria (should-run, --save, qaradar_should_run)
  • v0.5.0 — Flaky test detection from CI history (JUnit XML parsing)

Philosophy

QA Radar is built on three beliefs:

  1. The bottleneck has moved. AI makes writing tests easy. Knowing which tests matter is the hard part.
  2. Quality is a landscape, not a number. A single coverage percentage hides everything. Risk is per-module, per-signal, per-timeframe.
  3. Agents need context. An AI coding assistant that doesn't know your repo's fragile areas will write generic tests. Give it the quality landscape and it writes targeted ones.

License

MIT