QA Radar

MCP-сервер, который подсказывает вашему агенту, какие файлы тестировать в первую очередь — на основе git-изменений, пробелов в покрытии и карты тестов с оценкой риска для каждого файла.

GitHub

Документация

QA Radar

Give your AI coding agent the quality brain it doesn't have to grow from scratch.

QA Radar analyzes your codebase and produces a structured quality health report — combining git churn, test coverage, and test-to-source mapping into risk-scored modules. It works as an MCP server for AI coding agents (Claude Code, Cursor, Windsurf) and as a standalone CLI for humans and CI pipelines.

Built for developers who want their AI agent to write targeted tests, not generic ones.

Quick Start

Claude Code — one step:

/plugin marketplace add Muratkus/qaradar
/plugin install qaradar@qaradar-marketplace

Then ask your agent: "What should I test first?"

Or run directly without installing:

uvx qaradar serve

Full install options ↓

What It Does

QA Radar answers the question every new team member (and every AI agent) asks: "What should I test first?"

It scans three signals and combines them into a per-file risk score:

Signal	What It Measures	Why It Matters
Git Churn	Commit frequency, lines changed, recency	High-churn files are regression magnets
Coverage Gaps	Line & branch coverage from existing reports	Low coverage = blind spots
Test Mapping	Which source files have corresponding tests	No tests = no safety net at all

The output is a ranked list of modules by risk level (critical → low), with human-readable reasons for each rating.

Why Not Just Let the Agent Do It?

A capable agent with bash access could run git log --numstat, parse coverage.xml, and glob for test files. So why an MCP server?

Concern	What QA Radar does instead
Token cost	`git log` over 90 days on a medium repo is hundreds of KB. QA Radar returns ~5 KB of structured JSON.
Determinism	A weighted risk score computed ad-hoc in-context is unreliable. Code is reproducible.
Speed	One tool call vs. 4–6 sequential bash calls + reasoning between each.
Format normalization	LCOV / Cobertura / coverage.py JSON / Go cover profiles all parse differently. QA Radar normalizes across formats so the agent doesn't have to.
Convention encoding	`test_x.py` for Python, `x.test.ts` for JS/TS, `x_test.go` for Go, `FooTest.java` for Java — encoded once, not re-derived each session.
Portability	The same MCP tools work across Claude Code, Cursor, and Windsurf without re-prompting.

Install as Claude Code Plugin (Recommended)

The fastest path — one command wires up the MCP server and installs 4 slash commands. No manual config editing.

Step 0 — install uv (if you don't have it):

curl -LsSf https://astral.sh/uv/install.sh | sh
# or: pip install uv

uv launches qaradar on demand from PyPI — you don't need to pip install qaradar separately.

Step 1 — add the marketplace:

/plugin marketplace add Muratkus/qaradar

Step 2 — install:

/plugin install qaradar@qaradar-marketplace

What you get: 6 MCP tools auto-configured + 5 slash commands:

Command	What it does
`/qaradar:qa-check`	Full health report — risk, coverage, untested files
`/qaradar:qa-risky`	Ranked list of riskiest files with reasons
`/qaradar:qa-untested`	Source files with no detected tests + scaffold suggestions
`/qaradar:qa-plan`	Prioritized test plan (chains 3 tools)
`/qaradar:qa-pr-risk`	Which changed files in this PR are riskiest

Example: after merging a big feature branch, run /qaradar:qa-check to see what regressed. Before opening a PR, run /qaradar:qa-pr-risk to see what you need to test first.

MCP Server (for AI Coding Agents)

Setup

Alternative: manual MCP config (if you prefer not to use the plugin):

Add to your Claude Code MCP config (~/.claude/mcp.json for user-level, or .mcp.json in the project root for project-level):

{
  "mcpServers": {
    "qaradar": {
      "command": "uvx",
      "args": ["qaradar", "serve"]
    }
  }
}

Or start it manually:

uvx qaradar serve

Example Prompts

Once connected, ask your agent:

"What should I test first in this repo?" "Which files are the riskiest right now?" "Show me the highest-churn files from the last month." "Which source files have no tests at all?" "Which of my changed files are risky?" ← diff-aware

Available MCP Tools

Tool	When the Agent Uses It
`qaradar_healthcheck`	Full quality overview of a repository
`qaradar_risky_modules`	What to test first; which files are riskiest
`qaradar_churn`	Hotspot detection; where regressions tend to occur
`qaradar_coverage_gaps`	Files with low coverage; where the blind spots are
`qaradar_untested_files`	Source files with no corresponding test files
`qaradar_pr_risk`	Which changed files in this PR need attention
`qaradar_should_run`	After finishing work: should QA Radar re-analyze, and over the diff or the whole repo?

Diff-aware: what's risky in this PR?

qaradar_pr_risk scores only the files changed between a base ref and HEAD — not the whole repo. It keeps risk scores calibrated by using full-repo normalization, so a file with 2 commits in a PR isn't falsely flagged CRITICAL just because it's the only changed file the agent knows about.

Ask your agent:

"Which of my changed files are risky?" "Do any of the files I changed lack tests?" "What should I review before opening this PR?"

Or from the CLI:

# Diff against main — shows only changed files
qaradar analyze . --base main

# Diff against a specific ref
qaradar analyze . --base origin/main --days 60

qaradar_pr_risk auto-detects the base branch from GITHUB_BASE_REF (set automatically in GitHub Actions) or falls back to main/master. Pass base_ref explicitly to override.

CLI

# Full health check on current directory
qaradar analyze

# Analyze a specific repo with 180 days of history
qaradar analyze /path/to/repo --days 180

# Output as JSON (for piping to other tools)
qaradar analyze --json-output

# Show top 10 risky modules only
qaradar analyze --top 10

# Diff-aware: score only files changed since main
qaradar analyze . --base main

Install

pip install qaradar

Or run without installing:

uvx qaradar serve

From source (for development):

git clone https://github.com/Muratkus/qaradar.git
cd qaradar
pip install -e .

Language Support

All language support lives in one registry — qaradar/analyzers/languages.py — so adding a language is a single entry (extensions, test-name convention, test-function counter), consumed by both churn and test-mapping.

Tier 1 — First-class, tested

Language	Test detection	Coverage
Python	`test_x.py`, `x_test.py`	coverage.py JSON + XML
JavaScript / TypeScript	`x.test.`, `x.spec.`, `x-test.*` (React Native)	LCOV, Jest/Istanbul JSON
Go	`x_test.go`	Go cover profile (`cover.out`)
Swift	`XTests.swift` (XCTest `func test…`)	Cobertura / LCOV
Kotlin	`XTest.kt` (`@Test`)	Cobertura / LCOV
Dart / Flutter	`x_test.dart` (`test(`, `testWidgets(`)	LCOV (`coverage/lcov.info`)
Objective-C	`XTests.m` / `.mm` (XCTest `- (void)test…`)	Cobertura / LCOV

Tier 2 — Best-effort, naming-based

Java, Ruby, Rust — test detection via naming conventions. Coverage via Cobertura XML or LCOV if emitted.

Coverage parsing is format-driven, so it spans more ecosystems than test-mapping detection, which is language-specific.

Monorepos: Istanbul/Jest reports are auto-discovered under packages/*/coverage and apps/*/coverage, and absolute/package-relative coverage paths are normalized to repo-relative so they join correctly against churn and test-mapping signals.

Supported Coverage Formats

Format	Tools
coverage.py JSON	Python `coverage run` + `coverage json`
Istanbul / Jest JSON	`coverage-final.json`, `coverage-summary.json` (Jest/Vitest/nyc)
Cobertura XML	Python, Java/Gradle, .NET (Coverlet)
LCOV	JS/TS, Flutter/Dart, C/C++, Rust (grcov)
Go cover profile	`go test -coverprofile=cover.out`

Example Output

╭──────────────── QA Radar Health Report ─────────────────╮
│ Repository: /home/user/my-service                       │
│ Source files: 47  Test files: 23  Ratio: 0.49           │
│ Avg coverage: 62.3%  Tested: 31  Untested: 16          │
╰─────────────────────────────────────────────────────────╯

  CRITICAL risk modules: 3
  HIGH risk modules: 7

┌─────────────────────────────────────────────────────────┐
│ Risky Modules                                           │
├──────────────────────┬──────────┬───────┬───────────────┤
│ File                 │ Risk     │ Score │ Reasons       │
├──────────────────────┼──────────┼───────┼───────────────┤
│ src/payments/core.py │ CRITICAL │  0.87 │ High churn:   │
│                      │          │       │ 34 commits;   │
│                      │          │       │ No tests      │
│ src/auth/tokens.py   │ CRITICAL │  0.82 │ Low coverage: │
│                      │          │       │ 12.3%; Active │
│                      │          │       │ recently      │
└──────────────────────┴──────────┴───────┴───────────────┘

Tracking Runs Over Time

By default QA Radar is stateless. Opt in to persistence to track a repo across runs and drive incremental re-analysis (daily/weekly, or after N diffs, or after an agent finishes work).

qaradar analyze . --save      # record a snapshot to .qaradar/state.json (gitignore it)
qaradar should-run .          # exit 0 if a re-run is warranted, 1 if not — prints JSON
qaradar status .              # last run, commits/days since, current decision + risk delta

should-run is a gate, not a scheduler — wire it into whatever you already use:

# cron / CI / git hook: only do expensive work when criteria are met
qaradar should-run . && qaradar analyze . --save

It reports scope: "full" (interval elapsed) or scope: "diff" (enough files changed), so an agent calling the qaradar_should_run MCP tool knows whether to follow up with qaradar_healthcheck or qaradar_pr_risk. State is one .qaradar/state.json per repo, so a "collection of repos" is just a loop over repos in your own infra.

Tune the criteria in qaradar.toml:

[schedule]
interval_days = 7        # re-run the full healthcheck at least weekly
min_changed_files = 25   # ...or sooner, once this many files have changed

--save also reports a delta vs the previous run — which files newly became risky, which got worse, which improved or resolved.

Roadmap

v0.1.2 — Claude Code plugin + slash commands
v0.2.0 — Config file (qaradar.toml), Tier 2 language validation, hardening
v0.3.0 — Diff-aware mode: qaradar_pr_risk + --base CLI flag
v0.4.0 — Mobile/monorepo language coverage (Swift, Kotlin, Obj-C, Dart, React Native, Jest); run persistence + re-run criteria (should-run, --save, qaradar_should_run)
v0.5.0 — Flaky test detection from CI history (JUnit XML parsing)

Philosophy

QA Radar is built on three beliefs:

The bottleneck has moved. AI makes writing tests easy. Knowing which tests matter is the hard part.
Quality is a landscape, not a number. A single coverage percentage hides everything. Risk is per-module, per-signal, per-timeframe.
Agents need context. An AI coding assistant that doesn't know your repo's fragile areas will write generic tests. Give it the quality landscape and it writes targeted ones.

License

MIT