ConKurrence
AI evaluation toolkit — measure inter-rater agreement (Fleiss' κ, Kendall's W) across multiple LLM providers
One command. Find out if your AI agrees with itself.
ConKurrence is a statistically validated consensus measurement toolkit for AI evaluation pipelines. It uses multiple AI models as independent raters, measures inter-rater reliability with Fleiss' kappa and bootstrap confidence intervals, and routes contested items to human experts.
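To make the statistics concrete, here is a minimal Python sketch of Fleiss' kappa with a percentile bootstrap confidence interval. It is not ConKurrence's implementation: the function names (`fleiss_kappa`, `bootstrap_ci`), the choice to resample items with replacement, and the toy pass/fail ratings are all assumptions made for illustration.

```python
import random
from collections import Counter


def fleiss_kappa(counts):
    """Fleiss' kappa for counts[i][j] = number of raters who put item i in category j.

    Assumes every item was rated by the same number of raters (at least two).
    """
    n_items = len(counts)
    n_raters = sum(counts[0])
    n_cats = len(counts[0])
    # Share of all ratings that fall into each category.
    p_cat = [sum(row[j] for row in counts) / (n_items * n_raters) for j in range(n_cats)]
    # Per-item agreement: fraction of rater pairs that chose the same category.
    p_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_obs = sum(p_item) / n_items      # mean observed agreement
    p_exp = sum(p * p for p in p_cat)  # agreement expected by chance
    if p_exp == 1.0:
        # Every rating is in one category: kappa is undefined.
        return float("nan")
    return (p_obs - p_exp) / (1.0 - p_exp)


def bootstrap_ci(counts, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for kappa, resampling items with replacement."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n_boot):
        sample = [rng.choice(counts) for _ in counts]
        k = fleiss_kappa(sample)
        if k == k:  # skip undefined (NaN) resamples
            stats.append(k)
    stats.sort()
    lower = stats[int((alpha / 2) * len(stats))]
    upper = stats[max(int((1 - alpha / 2) * len(stats)) - 1, 0)]
    return lower, upper


# Hypothetical ratings: six items, each labeled by three model raters.
labels = ["pass", "fail"]
ratings = [
    ["pass", "pass", "pass"],
    ["fail", "fail", "fail"],
    ["pass", "pass", "fail"],
    ["fail", "fail", "fail"],
    ["pass", "fail", "fail"],
    ["pass", "pass", "pass"],
]
counts = [[Counter(r)[label] for label in labels] for r in ratings]
kappa = fleiss_kappa(counts)
low, high = bootstrap_ci(counts)
print(f"kappa={kappa:.3f}, 95% CI=({low:.3f}, {high:.3f})")
```

With only a handful of items the interval will be wide; that width is the reason to report a confidence interval alongside the point estimate.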
Install
npm install -g conkurrence
MCP Server
Use ConKurrence as an MCP server in Claude Desktop or any MCP-compatible client:
npx conkurrence mcp
Claude Desktop Configuration
Add to your claude_desktop_config.json:
{
  "mcpServers": {
    "conkurrence": {
      "command": "npx",
      "args": ["-y", "conkurrence", "mcp"]
    }
  }
}
Claude Code Plugin
/plugin marketplace add AlligatorC0der/conkurrence
Features
- Multi-model evaluation — Run your schema against Bedrock, OpenAI, and Gemini models simultaneously
- Statistical rigor — Fleiss' kappa with bootstrap confidence intervals, Kendall's W for validity
- Self-consistency mode — No API keys needed; uses the host model via MCP Sampling
- Schema suggestion — AI-powered schema design from your data
- Trend tracking — Compare runs over time, detect agreement degradation
- Cost estimation — Know the cost before running
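For readers unfamiliar with Kendall's W, the coefficient of concordance named in the feature list, the following is a minimal Python sketch of the textbook formula. How ConKurrence applies it is not described here; the function name `kendalls_w`, the no-ties assumption, and the sample rankings are illustrative only.

```python
def kendalls_w(rankings):
    """Kendall's W for rankings[r][i] = rank that rater r assigns to item i.

    Assumes each rater ranks all n items (n >= 2) from 1 to n with no ties,
    so no tie correction is applied.
    """
    m = len(rankings)     # number of raters
    n = len(rankings[0])  # number of items
    # Total rank each item received across raters.
    rank_sums = [sum(r[i] for r in rankings) for i in range(n)]
    mean_sum = m * (n + 1) / 2
    # Sum of squared deviations of the rank totals from their mean.
    s = sum((x - mean_sum) ** 2 for x in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))


# Hypothetical rankings: three raters ordering four items.
rankings = [
    [1, 2, 3, 4],
    [1, 3, 2, 4],
    [2, 1, 3, 4],
]
print(f"W={kendalls_w(rankings):.3f}")
```

W ranges from 0 (no agreement among the rankings) to 1 (all raters produce the same ranking).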
MCP Tools
| Tool | Description |
|---|---|
| conkurrence_run | Execute an evaluation across multiple AI raters |
| conkurrence_report | Generate a detailed markdown report |
| conkurrence_compare | Side-by-side comparison of two runs |
| conkurrence_trend | Track agreement over multiple runs |
| conkurrence_suggest | AI-powered schema suggestion from your data |
| conkurrence_validate_schema | Validate a schema before running |
| conkurrence_estimate | Estimate cost and token usage |
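MCP clients reach these tools through JSON-RPC 2.0 messages: `tools/list` to discover each tool's input schema and `tools/call` to invoke one. The sketch below only builds those two messages in Python. The helper names are hypothetical, and because this README does not document the tools' argument names, `arguments` is left empty as a placeholder; the real fields should be read from the `inputSchema` returned by `tools/list`.

```python
import json


def tools_list_request(request_id):
    """Build an MCP tools/list request (JSON-RPC 2.0)."""
    return {"jsonrpc": "2.0", "id": request_id, "method": "tools/list"}


def tools_call_request(request_id, tool_name, arguments):
    """Build an MCP tools/call request (JSON-RPC 2.0)."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Discover the tools and their input schemas first.
print(json.dumps(tools_list_request(1)))

# Placeholder call: the argument names are NOT documented in this README,
# so the arguments object is left empty here on purpose.
print(json.dumps(tools_call_request(2, "conkurrence_validate_schema", {})))
```

In practice an MCP client such as Claude Desktop performs the initialization handshake and sends these messages for you, so hand-written requests are mainly useful for debugging.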
Links
- Homepage: conkurrence.com
- npm: npmjs.com/package/conkurrence
- Terms of Service: app.conkurrence.com/terms
- Privacy Policy: app.conkurrence.com/privacy
License
BUSL-1.1 — Business Source License 1.1