systemprompt MCP Server

Môi trường chạy quản trị MCP tự lưu trữ bằng Rust — nhật ký kiểm toán, thực thi chính sách và kiểm soát chi phí cho các máy chủ MCP.

GitHub

Tài liệu

Own how your organization uses AI.

One self-hosted binary governs inference, auditing, evals, and every tool call across your AI fleet. Native integration with Claude Bridge. Works with any agent, any model, any provider.

This repository is the evaluation template: clone it, compile it, point Claude for Work, Claude Code, or any Anthropic-SDK client at http://localhost:8080, and every request lands on a host you operate — on your network, in your air-gap, under your audit table. Single Rust binary, one PostgreSQL, four commands from git clone to serving inference. Built for SOC 2, ISO 27001, HIPAA, and the OWASP Agentic Top 10.

systemprompt.io · Documentation · Guides · Enterprise factsheet (PDF) · Discord

An AI agent attempts to exfiltrate a GitHub PAT through a tool call. The secret-detection layer denies the call before the tool process spawns. One row is written to the audit table. The recording is a live capture of `./demo/governance/06-secret-breach.sh`.

_{Live capture of ./demo/governance/06-secret-breach.sh. Secret exfiltration attempt denied before spawn. One audit row written. No model touched the key.}

Quick start

git clone https://github.com/systempromptio/systemprompt-template
cd systemprompt-template
just setup-local            # prompts: pick a provider (Gemini/Anthropic/OpenAI), enter its key
just start                  # serves governance + agents + MCP + admin on :8080

Run with no key and setup-local asks which provider you have and prompts for its key; the provider you choose becomes the default (it is set as ai.default_provider, the other providers are disabled, and the gateway falls back to it). You can also pass keys non-interactively — the first one given becomes the default:

just setup-local <anthropic_key> [openai_key] [gemini_key]   # writes profile, starts Postgres, runs publish pipeline

Running a second clone side-by-side? just setup-local <keys> "" "" 8081 5433. Discover the CLI with systemprompt --help.

Prerequisites

Requirement	Purpose	Install
Docker	PostgreSQL runs in a container; `just setup-local` starts it	docker.com
Rust 1.75+	Compiles the workspace binary	rustup.rs
`just`	Task runner	just.systems
`jq`, `yq`	JSON and YAML processing in the scripts	`brew install jq yq` / `apt install jq yq`
AI API keys	At least one of Anthropic, OpenAI, or Gemini. `setup-local` writes the chosen provider's key into the profile and sets it as `ai.default_provider` in `services/ai/config.yaml`, disabling the providers you didn't supply. Pass several keys to enable several providers.	Provider dashboards
Ports 8080 + 5432	HTTP + PostgreSQL	Free on localhost

What a CISO gets

A single query answers every AI audit. Every request, scope decision, tool call, model output, and cost lands in one 18-column Postgres table. Six correlation columns (UserId, SessionId, TaskId, TraceId, ContextId, ClientId) bind identity at construction time, so a row without a trace is a programming error.
Credentials physically cannot enter the context window. The governance process is the parent of every MCP tool subprocess. Keys are decrypted from a ChaCha20-Poly1305 store and injected into the child's environment by Command::spawn(). The parent, which owns the LLM context, never writes the value. 35+ regex patterns deny any tool call that tries to pass a secret through arguments.
Self-hosted, air-gap capable, single artifact. One Rust binary. One PostgreSQL. No Redis, no Kafka, no Kubernetes, no SaaS handoff. The same binary runs on a laptop, a VM, and an air-gapped appliance without modification. Zero outbound telemetry by default.
Policy-as-code on PreToolUse hooks. Destructive operations, blocklists, department scoping, six-tier RBAC (Admin, User, Service, A2A, MCP, Anonymous). Rate limiting at 300 req/min per session with role multipliers. Every deny reason is structured and auditable.
Certifications-ready, not certification-marketing. Tiered log retention from debug (1 day) through error (90 days). 10 identity lifecycle event variants. SIEM-ready JSON events for Splunk, ELK, Datadog, Sumo. Built for SOC 2 Type II, ISO 27001, HIPAA, and the OWASP Agentic Top 10.

This repo is the evaluation template. Fork it, clone it, compile it. 43 scripted demos execute every claim above against the live binary on your own laptop.

What you'll see in the first five minutes

http://localhost:8080 — admin UI, live audit table, session viewer.
systemprompt analytics overview — conversations, tool calls, costs in microdollars, anomalies flagged above 2x/3x of rolling average.
systemprompt infra logs audit <request-id> --full — the full trace for any request: identity, scope, rule evaluations, tool call, model output, cost. One query, one row, one answer.
Point Claude Code, Claude Desktop, or any MCP client at it. Permissions follow the user, not the client. Try to exfiltrate a key through a tool argument and watch the secret-detection layer deny it before the tool process spawns.
./demo/governance/06-secret-breach.sh — the scripted version of that denial, recorded above.

The scripted demos

./demo/00-preflight.sh                    # acquire token, verify services, create admin
./demo/01-seed-data.sh                    # populate analytics + trace data

# Governance — the audit line
./demo/governance/01-happy-path.sh        # allowed tool call, full trace chain
./demo/governance/05-governance-denied.sh # scope check rejects out-of-role call
./demo/governance/06-secret-breach.sh     # secret-detection blocks exfiltration
./demo/governance/07-rate-limiting.sh     # 300 req/min per session enforced
./demo/governance/08-hooks.sh             # PreToolUse policy-as-code

# Observability — the audit table
./demo/analytics/01-overview.sh           # conversations, costs, anomalies
./demo/infrastructure/04-logs.sh          # structured JSON events, SIEM-ready

# Scale — the overhead budget
./demo/performance/02-load-test.sh        # 3,308 req/s burst, p99 22.7 ms

Full index: demo/README.md. 41 of 43 scripts are free; two cost ~$0.01 each (real model calls).

The governance pipeline

Every tool call passes five in-process checks, synchronously, before it reaches a tool process. Every decision lands in an 18-column audit row.

  LLM Agent
      │
      ▼
  Governance pipeline  (in-process, synchronous, <5 ms p99)
      │
      ├─ 1. JWT validation       (HS256, verified locally, offline-capable)
      ├─ 2. RBAC scope check     (Admin · User · Service · A2A · MCP · Anonymous)
      ├─ 3. Secret detection     (35+ regex: API keys, PATs, PEM, AWS prefixes)
      ├─ 4. Blocklist            (destructive operation categories)
      └─ 5. Rate limiting        (300 req/min per session, role multipliers)
      │
      ▼
  ALLOW or DENY   →  18-column audit row, always
      │
      ▼ (ALLOW)
  spawn_server()
      │
      ├─ decrypt secrets from ChaCha20-Poly1305 store
      └─ inject into subprocess env vars only (never parent)
      │
      ▼
  MCP tool process     credentials live here, never in the LLM context path

Governance pipeline — terminal recording

_{Run it: ./demo/governance/05-governance-denied.sh · Feature detail}

How credential injection works

When a tool call passes the pipeline, spawn_server() decrypts credentials from the ChaCha20-Poly1305 store and injects them into the child process environment. The parent process — which owns the LLM context window — never writes the value.

Source: systemprompt-core/crates/domain/mcp/src/services/process/spawner.rs.

let secrets = SecretsBootstrap::get()?;

let mut child_command = Command::new(&binary_path);

// Child env only. The parent (LLM context path) never touches the value.
if let Some(key) = &secrets.anthropic {
    child_command.env("ANTHROPIC_API_KEY", key);
}
if let Some(key) = &secrets.github {
    child_command.env("GITHUB_TOKEN", key);
}

// Detach; parent forgets the child after spawn.
let child = child_command.spawn()?;
std::mem::forget(child);

Before spawn, a secret-detection pipeline scans tool arguments for 35+ credential patterns. A tool call that tries to pass a secret through the context window is blocked even if the agent has scope to run the tool. The hero recording above is the scripted proof: ./demo/governance/06-secret-breach.sh.

Performance

Sub-5 ms governance overhead, benchmarked. Each request performs JWT validation, scope resolution, three rule evaluations, and an async audit write.

Metric	Result
Throughput	3,308 req/s burst, sustained under 100 concurrent workers
p50 latency	13.5 ms
p99 latency	22.7 ms
Added to AI response time	<1%
GC pauses	Zero

Reproduce: just benchmark. Numbers measured on the author's laptop.

Configuration & CLI

Runtime configuration is flat YAML under services/, loaded through services/config/config.yaml. Unknown keys fail loudly (#[serde(deny_unknown_fields)]). No database-stored config, no admin UI required. Every change is a diff.

services/
  config/config.yaml        Root aggregator
  agents/<id>.yaml          Agent: scope, model, tool access
  mcp/<name>.yaml           MCP server: OAuth2 config, scopes
  skills/<id>.yaml          Skill: config + markdown instruction body
  plugins/<name>.yaml       Plugin bindings (references agents, skills, MCP)
  ai/config.yaml            AI provider config (Anthropic, OpenAI, Gemini)
  scheduler/config.yaml     Background job schedule
  web/config.yaml           Web frontend, navigation, theme
  content/config.yaml       Content sources and indexing

Eight CLI domains cover every operational surface. No dashboard required for any task.

Domain	Purpose
`core`	Skills, content, files, contexts, plugins, hooks, artifacts
`infra`	Services, database, jobs, logs
`admin`	Users, agents, config, setup, session, rate limits
`cloud`	Auth, deploy, sync, secrets, tenant, domain
`analytics`	Overview, conversations, agents, tools, requests, sessions, content, traffic, costs
`web`	Content types, templates, assets, sitemap, validate
`plugins`	Extensions, MCP servers, capabilities
`build`	Build core workspace and MCP extensions

More recordings — infrastructure, integrations, analytics, agents, compliance, MCP governance

Each recording is a live capture of the named script running against the binary.

Infrastructure — one binary, one process, one database. Same artifact runs laptop to air-gap.

Self-hosted deployment

_{All data on your infrastructure, zero outbound telemetry · ./demo/infrastructure/01-services.sh · Feature}

Deploy anywhere

_{Profile YAML promotes environments without rebuilding · ./demo/cloud/01-cloud-overview.sh · Feature}

Unified control plane

_{Every operational surface has a CLI verb · ./demo/infrastructure/03-jobs.sh · Feature}

Open standards

_{MCP, OAuth 2.0, PostgreSQL, Git · zero proprietary protocols · ./demo/mcp/01-mcp-servers.sh · Feature}

MCP governance, analytics, closed-loop agents, compliance.

MCP governance

_{Each MCP server is an isolated OAuth2 resource server with per-server scope validation · ./demo/mcp/02-mcp-access-tracking.sh · Feature}

Analytics and observability

_{Nine analytics subcommands, anomaly detection, SIEM-ready JSON · ./demo/analytics/01-overview.sh · Feature}

Closed-loop agents

_{Agents query their own error rate, cost, and latency via MCP tools and adjust · ./demo/agents/04-agent-tracing.sh · Feature}

Compliance

_{Tiered retention, 10 identity lifecycle events, SOC 2 / ISO 27001 / HIPAA / OWASP Agentic Top 10 · ./demo/users/03-session-management.sh · Feature}

Integrations — any provider, Claude Desktop, web publisher, extensions.

Any AI agent

_{Anthropic, OpenAI, Gemini swap at the profile level · cost attribution in integer microdollars · ./demo/agents/01-list-agents.sh · Feature}

Claude Desktop & Bridge

_{Skills persist across sessions via OAuth2 · ./demo/skills/01-skill-lifecycle.sh · Feature}

Web server & publisher

_{Same binary serves your website, blog, and docs · systemprompt.io runs on this binary · ./demo/web/01-web-config.sh · Feature}

Extensible architecture

_{Your code compiles into your binary via the Extension trait · no runtime reflection · ./demo/skills/04-plugin-management.sh · Feature}

Governance benchmark

_{3,308 req/s burst, p99 22.7 ms · just benchmark}

Claude for Work, on your infrastructure

Claude for Work ships with extension points for inference, identity, and audit. Point them at this binary and every prompt, tool call, and cost line lands in a Postgres row you own.

  Managed Device                 Enterprise Gateway              Upstream Inference
  (Bridge via MDM)               (this binary, your VPC)         (pluggable)
  ───────────────── ──────────▶  ─────────────────────  ──────▶  ─────────────────
  Credential helper              /v1/messages                    Anthropic direct
  Managed MCP list               Governance pipeline             Bedrock / Vertex
  Signed plugins                 Audit to Postgres               OpenAI / Groq
                                                                 On-prem vLLM / Qwen
                                                                 Air-gap capable

Four governance layers enforce before a byte leaves your network:

Scope — RBAC resolved from the JWT at request construction. Admin · User · Service · A2A · MCP · Anonymous.
Secrets — 35+ regex patterns scan every tool argument and every prompt. A credential in the context path is denied before the tool process spawns.
Policy — Blocklists, destructive-operation categories, tenant rules, PreToolUse hooks as code.
Quota — 300 req/min per session with role multipliers; per-tool and per-budget limits.

In-process evaluation against a cached entitlement table. Governance stays out of the latency budget — p99 22.7 ms, <1% of AI response time.

How it compares

Dimension	Claude Enterprise	Cloud Custom	+ systemprompt.io
Data residency	Anthropic infra	Cloud region	Your datacenter or air-gap
Audit trail	Anthropic-held	OTLP only	Prompt → tool → MCP → cost in your Postgres
User revocation	SSO / seat removal	Cloud IAM	IDP disable; next TTL fails closed
Inference provider	Anthropic only	Bedrock / Vertex (Claude)	Any `/v1/messages`, per-call routing
MCP allowlist	Anthropic-curated	Device-local config	One registry, per-principal policy
Plugin catalogue	Anthropic-hosted	Files on disk	Signed, scoped, versioned distribution

Manual install is tested and works end-to-end today; signed installers, MDM packages, and Homebrew / winget distribution land in a later release. Install steps in the Advanced fold below.

Advanced — gateway routes, bridge install, org-plugins sync

Manual install is tested end-to-end. Automated distribution — signed installers, MDM packages, Homebrew / winget — is in progress; today you download a binary and drop a TOML file, documented below.

`/v1/messages` inference gateway

POST /v1/messages at the Anthropic wire format. Every inference request flows through the same governance pipeline as every tool call — on infrastructure you operate.

SDK- and Claude-Desktop-compatible. Authenticated with a systemprompt JWT in x-api-key (falls back to Authorization: Bearer). No new credential type — existing user JWTs serve as the gateway credential.
Routes by model_pattern. Built-in tags: anthropic, openai, moonshot (Kimi), qwen, gemini, minimax. Anthropic is a transparent byte proxy (extended thinking, cache-control headers, SSE events preserved verbatim). OpenAI-compatible providers get full Anthropic↔OpenAI request/response/SSE conversion. Upstream API keys resolve from the secrets file by name.
Zero overhead when disabled. The /v1 router mounts only if gateway.enabled: true in the active profile.

Profile YAML:

providers:
  - name: anthropic
    protocol: anthropic
    endpoint: https://api.anthropic.com/v1
    api_key_secret: anthropic
    models:
      - id: claude-sonnet-4-20250514
  - name: minimax
    protocol: anthropic
    endpoint: https://api.minimax.io/anthropic/v1
    api_key_secret: minimax
    models:
      - id: MiniMax-M2
gateway:
  enabled: true
  default_provider: anthropic
  routes:
    - model_pattern: "claude-*"
      provider: anthropic
    - model_pattern: "MiniMax-*"
      provider: minimax

Each provider is declared once under providers: — its wire protocol, endpoint, api_key_secret, and the models it serves (each with optional aliases and upstream_model). Gateway routes carry no connectivity; they only map a requested model_pattern to a provider by name, and default_provider forwards any model no route matches.

Routes evaluate in order; first model_pattern match wins. On a model entry, upstream_model aliases a client-requested model to a different upstream name without the client knowing.

Configuring routes from the CLI (worked example: proxy every Anthropic model to Gemini Flash). Instead of hand-editing the profile, use admin config. To make a client that asks for claude-* actually serve Google Gemini Flash:

# 1. Store the upstream key and register the provider + model in the profile registry
systemprompt admin config secret set gemini <GEMINI_API_KEY>
systemprompt admin config catalog provider add --name gemini --protocol gemini \
  --endpoint https://generativelanguage.googleapis.com/v1beta --api-key-secret gemini
systemprompt admin config catalog model add --provider gemini --id gemini-2.5-flash

# 2. Point the claude-* route at gemini and rewrite the upstream model name
systemprompt admin config gateway route add --model-pattern 'claude-*' \
  --provider gemini --upstream-model gemini-2.5-flash

A client POST /v1/messages with model: claude-haiku-4-5 then returns model: gemini-2.5-flash.

Important — routes are access-controlled. Each route is gated by an access_control_entities row keyed on its id, which is content-addressed (hash(model_pattern, provider)). Changing a route's provider mints a new id, so a freshly-edited route is denied (unknown to access control) until the entity is materialised. The admin config CLI edits the profile only; it does not reconcile the access-control catalog. Materialise it one of two ways:

Re-run the publish pipeline — systemprompt infra jobs run publish_pipeline (also runs via just publish). It registers every route in the live profile and the gateway_route: "*" wildcard in services/access-control/roles.yaml grants them. Dynamic, but must be re-run after every route change.
Pin it in services/access-control/roles.yaml (committed, survives a clean install) — add an explicit grant so the ACL loader self-materialises the row at publish time:
```
- entity_type: gateway_route
  entity_id: claude-star-39ccd3   # synthesize_route_id("claude-*", "gemini")
  access: allow
  default_included: true
  roles: [user]
```

Bridge credential helper endpoints. systemprompt-bridge is a standalone ~2.4 MB Rust binary (no tokio, no sqlx, no axum) that trades a lower-privilege credential for a short-lived JWT. Progressive capability ladder — mTLS → dashboard session → PAT — mounted under /v1/gateway/auth/bridge/:

POST /pat — Authorization: Bearer <pat> → {token, ttl, headers} with a fresh JWT and the canonical identity header map (x-user-id, x-session-id, x-trace-id, x-client-id, x-tenant-id, x-policy-version, x-call-source).
POST /session — 501 (dashboard-cookie exchange not yet wired).
POST /mtls — 501 (device-cert exchange not yet wired).
GET /capabilities — {"modes":["pat"]}; probes advertise which exchange modes this deployment accepts.

The helper writes the signed JWT + expiry to the OS cache dir with mode 0600. Stdout contract is exactly one JSON object; all diagnostics go to stderr. Released out-of-band as bridge-v* tags. Install / configure / wire-up steps below.

Extensible provider registry. GatewayRoute.provider is a free-form string resolved at dispatch time against a startup-built registry. Extension crates register new upstreams with:

inventory::submit! {
    systemprompt_api::services::gateway::GatewayUpstreamRegistration {
        tag: "my-provider",
        factory: || std::sync::Arc::new(MyUpstream),
    }
}

The GatewayUpstream trait (async fn proxy(&self, ctx: UpstreamCtx<'_>)) is the single integration seam. Built-in tags seeded automatically; extension tags may shadow built-ins (logged as a warning). Full detail: core/CHANGELOG.md.

Install the bridge credential helper

The systemprompt-bridge binary is the Credential helper script slot in Claude for Work. It turns a PAT into a short-lived JWT that Claude Desktop merges into every inference request routed at this binary. Download the prebuilt macOS, Windows, or Linux binary from systempromptio/systemprompt-core releases.

Current release: bridge-v0.10.0 — Linux x86_64, Windows x86_64 (MSVC ABI), macOS aarch64 (cosign-signed).

1. Download

Linux x86_64

curl -fsSL -o /usr/local/bin/systemprompt-bridge \
  https://github.com/systempromptio/systemprompt-core/releases/download/bridge-v0.10.0/systemprompt-bridge-x86_64-unknown-linux-gnu
chmod +x /usr/local/bin/systemprompt-bridge
curl -fsSL -O https://github.com/systempromptio/systemprompt-core/releases/download/bridge-v0.10.0/SHA256SUMS
sha256sum -c SHA256SUMS --ignore-missing

Windows x86_64 (PowerShell as Administrator):

$dir = "C:\Program Files\systemprompt"
New-Item -ItemType Directory -Force -Path $dir | Out-Null
Invoke-WebRequest `
  -Uri "https://github.com/systempromptio/systemprompt-core/releases/download/bridge-v0.10.0/systemprompt-bridge-x86_64-pc-windows-msvc.exe" `
  -OutFile "$dir\systemprompt-bridge.exe"
[Environment]::SetEnvironmentVariable("PATH", "$env:PATH;$dir", "User")

Windows Smart Screen will flag the unsigned binary on first run → "More info" → "Run anyway".

macOS (source build):

git clone https://github.com/systempromptio/systemprompt-core.git
cd systemprompt-core
cargo build --manifest-path bin/bridge/Cargo.toml --release \
  --target "$(rustc -vV | awk '/host:/ {print $2}')"
sudo install -m 755 \
  "bin/bridge/target/$(rustc -vV | awk '/host:/ {print $2}')/release/systemprompt-bridge" \
  /usr/local/bin/

2. Configure

Linux/macOS: ~/.config/systemprompt/systemprompt-bridge.toml Windows: %APPDATA%\systemprompt\systemprompt-bridge.toml

[gateway]
url = "http://localhost:8080"   # for the local-trial template; swap to your production host

[pat]
token = "sp-live-your-personal-access-token"

Issue a PAT from the running binary with systemprompt admin users pat issue <user-id> --name bridge-laptop. Absent config sections are silently skipped. Dev overrides: SP_BRIDGE_GATEWAY_URL, SP_BRIDGE_PAT.

3. Verify

systemprompt-bridge           # prints exactly one JSON {token, ttl, headers}
systemprompt-bridge --check   # exits 0 if a token can be issued

Diagnostics go to stderr only. The stdout JSON matches Anthropic's inferenceCredentialHelper contract byte-for-byte.

4. Point Claude Desktop at it

In Claude Desktop Enterprise → Settings → Inference:

Credential helper script: /usr/local/bin/systemprompt-bridge (or C:\Program Files\systemprompt\systemprompt-bridge.exe).
API base URL: the gateway.url from your TOML.

Every Claude Desktop request now lands a row in ai_requests with user_id, tenant_id, session_id, trace_id, tokens, cost, and latency — identical governance to every other tool call. Run systemprompt infra logs audit <request-id> --full after a prompt to see the trace end-to-end.

5. (Optional) Install the `org-plugins/` sync agent

The same binary manages the bridge's signed plugin / managed-MCP mount:

systemprompt-bridge install     # register launchd (macOS) / scheduled task (Windows) / systemd --user (Linux)
systemprompt-bridge sync        # pull signed plugin manifest + allowlist now
systemprompt-bridge validate    # verify the ed25519 signature
systemprompt-bridge uninstall   # remove

Mount targets: /Library/Application Support/Claude/org-plugins/ (macOS), C:\ProgramData\Claude\org-plugins\ (Windows), ${XDG_DATA_HOME:-$HOME/.local/share}/Claude/org-plugins/ (Linux).

License

This template is MIT. Fork it, modify it, use it however you like.

systemprompt-core is BSL-1.1: free for evaluation, testing, and non-production use. Production use requires a commercial license. Each version converts to Apache 2.0 four years after publication. Licensing enquiries: [email protected].