OpenGrok Go MCP Server

Source-code intelligence for agents, powered by OpenGrok and Go.

Documentation

opengrok-go-mcp mascot

opengrok-go-mcp

Source-code intelligence for agents, powered by OpenGrok and Go.

I have seen things you people wouldn’t grep. — Senior Architect, probably.


License Go CI MCP Evals


Agent-oriented MCP server for searching, navigating, and reading code through OpenGrok.

It turns OpenGrok into a safer code-intelligence surface for LLM agents:

  • capability-gated tools that only appear when the backing OpenGrok feature works
  • paginated search and file reads with stable cursors
  • citation URLs on code results so answers can point back to source
  • warnings for broad, heuristic, truncated, or best-effort results (warnings[] codes plus a legacy warning string)
  • automatic context expansion around search hits with explicit limits
  • full, compact, and experimental gateway tool surfaces for different agent styles

If you are an AI agent reading this repository, start with AGENTS.md for project constraints and agent workflow guidance.

Pre-1.0 note: this MCP server is still evolving. Some tools, responses, and configuration paths may be broken or change before a stable 1.0 release. Please report issues using docs/reporting-issues.md.

When To Use It

Use opengrok-go-mcp when you want an agent to investigate a large indexed codebase without cloning it locally. It works best for finding symbols, reading files, tracing references, narrowing broad searches, and producing answers with source citations.

It is intentionally honest about OpenGrok's limits. OpenGrok provides full-text search plus ctags definitions, not a full semantic call graph or AST engine. For structural questions, use this server to find the right files and symbols, then verify relationships with language-aware tools when needed.

Client Setup

Required: OPENGROK_MCP_BASE_URL — OpenGrok API base URL ending in /api/v1. On startup the server discovers projects, probes capabilities, and registers only tools that work. A typical reverse-proxied instance needs nothing else.

Copy a block below, replace the URL, restart the client.

The server defaults to OPENGROK_MCP_AGENT_PROFILE=economy (lean payloads, no auto context expansion). Set OPENGROK_MCP_AGENT_PROFILE=rich when you want expanded search context by default.

Released binary (no Go install)

Download the archive for your OS/arch from GitHub Releases, verify checksums.txt, and point your MCP client at the unpacked opengrok-go-mcp binary. Example (Claude Code):

{
  "mcpServers": {
    "opengrok": {
      "command": "/path/to/opengrok-go-mcp",
      "env": {
        "OPENGROK_MCP_BASE_URL": "https://your-opengrok-host/source/api/v1"
      }
    }
  }
}
Claude Code (go run)

Add to ~/.claude.json under mcpServers, or run claude mcp add:

{
  "mcpServers": {
    "opengrok": {
      "command": "go",
      "args": [
        "run",
        "github.com/rokasklive/opengrok-go-mcp/cmd/[email protected]"
      ],
      "env": {
        "OPENGROK_MCP_BASE_URL": "https://your-opengrok-host/source/api/v1"
      }
    }
  }
}
OpenCode (go run)

Add to opencode.json:

{
  "$schema": "https://opencode.ai/config.json",
  "mcp": {
    "opengrok": {
      "type": "local",
      "enabled": true,
      "command": [
        "go",
        "run",
        "github.com/rokasklive/opengrok-go-mcp/cmd/[email protected]"
      ],
      "environment": {
        "OPENGROK_MCP_BASE_URL": "https://your-opengrok-host/source/api/v1"
      }
    }
  }
}
Codex

Add to .codex/config.toml in the project root or ~/.codex/config.toml:

[[mcp_servers]]
name = "opengrok"
command = ["go", "run", "github.com/rokasklive/opengrok-go-mcp/cmd/[email protected]"]

[mcp_servers.env]
OPENGROK_MCP_BASE_URL = "https://your-opengrok-host/source/api/v1"
Other clients (Cursor, VS Code MCP, …)

Most stdio MCP clients use the same shape as Claude Code — command, args, and an env map (name may vary: environment, env):

{
  "mcpServers": {
    "opengrok": {
      "command": "go",
      "args": [
        "run",
        "github.com/rokasklive/opengrok-go-mcp/cmd/[email protected]"
      ],
      "env": {
        "OPENGROK_MCP_BASE_URL": "https://your-opengrok-host/source/api/v1"
      }
    }
  }
}

Cursor: project .cursor/mcp.json or Cursor Settings → MCP. VS Code: MCP extension config with the same server entry.

Local clone (replace go run package path)

Use the same env / environment block as above. Example command for Claude Code:

"command": "sh",
"args": [
  "-c",
  "cd /path/to/opengrok-go-mcp && go run ./cmd/opengrok-go-mcp --read-timeout=30s --write-timeout=30s"
]

Optional HTTP mode from a clone:

OPENGROK_MCP_TRANSPORT=http \
OPENGROK_MCP_BASE_URL=https://your-opengrok-host/source/api/v1 \
go run ./cmd/opengrok-go-mcp

MCP endpoint: http://127.0.0.1:8765/mcp

Common environment variables

VariableRequiredWhen to set
OPENGROK_MCP_BASE_URLyesOpenGrok API URL ending in /api/v1
OPENGROK_MCP_API_TOKENnoAuth required, or startup logs 401/403 on probes. Full Authorization value: Bearer <token> or Basic <credentials>. Never logged.
OPENGROK_MCP_DEFAULT_PROJECTnoMultiple projects and you want one implicit on calls that omit project. Auto-set when exactly one project is discovered.

If search probes return 401/403 without a token, the server still starts and logs remediation; search tools stay gated until OPENGROK_MCP_API_TOKEN is set.

All environment variables
VariableDefaultPurpose
OPENGROK_MCP_WEB_BASE_URLderivedWeb UI base for citations and raw-file fallback
OPENGROK_MCP_PROJECTSComma-separated allowlist; skips API/scrape discovery
OPENGROK_MCP_DISABLE_PROJECT_SCRAPEfalseSkip web-UI scrape when /projects/indexed fails
OPENGROK_MCP_PROJECT_REQUIREDtrueRequire project on tool calls
OPENGROK_MCP_PROBE_FILEproject/path for file-read capability probe
OPENGROK_MCP_TRANSPORTstdiohttp for Streamable HTTP (127.0.0.1:8765/mcp)
OPENGROK_MCP_LISTEN127.0.0.1:8765HTTP listen address
OPENGROK_MCP_TOOL_SURFACEcompactfull (fine-grained tools) or gateway (experimental)
OPENGROK_MCP_AGENT_PROFILEeconomyrich for expanded search context and per-result links by default. Per-call expand_context / response_mode / include_links still override
OPENGROK_MCP_MEMORY_ENABLEDtrueProcess-scoped memory tools on the full surface only (stdio). Disabled over HTTP regardless of this setting
OPENGROK_MCP_INSECURE_SKIP_TLS_VERIFYfalseTrusted internal hosts with broken TLS only
OPENGROK_MCP_CURSOR_SECRETHMAC secret for signed pagination cursors
OPENGROK_MCP_AUTO_EXPAND_CONTEXTtrueAuto-expand context around search hits
OPENGROK_MCP_CONTEXT_BEFORE / AFTER5 / 10Expansion window lines
OPENGROK_MCP_MAX_EXPANDED_RESULTS10Max results to expand
OPENGROK_MCP_MAX_EXPANDED_FILES5Max files fetched during expansion
OPENGROK_MCP_CONTEXT_FETCH_CONCURRENCY3Parallel fetches during expansion
OPENGROK_MCP_RETRY_MAX_ATTEMPTS2Retries for transient OpenGrok errors
OPENGROK_MCP_RETRY_BASE_DELAY200msRetry backoff base
OPENGROK_MCP_CACHE_ENABLEDfalseIn-process response cache
OPENGROK_MCP_CACHE_TTL5mCache entry lifetime
OPENGROK_MCP_CACHE_MAX_SIZE1000Max cache entries
DEBUGfalse1 logs OpenGrok HTTP requests to stderr

Context budget overrides: OPENGROK_MCP_BUDGET_{MINIMAL|DEFAULT|MAXIMAL}_{BEFORE|AFTER|RESULTS|FILES}.

Deprecated: OPENGROK_MCP_PROJECT_SCRAPE (use OPENGROK_MCP_DISABLE_PROJECT_SCRAPE). Removed: OPENGROK_MCP_BASIC_AUTH_TOKEN (use OPENGROK_MCP_API_TOKEN="Basic …").

Full reference: docs/configuration.md.

Security

Avoid passing secrets as CLI flags. Use OPENGROK_MCP_API_TOKEN for OpenGrok auth; the server never logs token values.

Operational caveats:

  • HTTP mode does not add inbound client authentication. Keep the default loopback bind address or put it behind trusted network/auth controls.
  • OPENGROK_MCP_INSECURE_SKIP_TLS_VERIFY=true is only for controlled internal instances with broken certificates. Do not use it for public or untrusted hosts.
  • Raw file fallback uses OPENGROK_MCP_WEB_BASE_URL with the same configured credentials. Treat that URL as part of the trusted OpenGrok boundary.
  • Memory tools are full-surface only (stdio). They are process-scoped, ephemeral, and disabled over HTTP because memory is not isolated by client session.
  • Set OPENGROK_MCP_CURSOR_SECRET for shared deployments if cursor integrity matters.

Evaluation

Hermetic stdio evals in evals/ — real MCP binary, fake OpenGrok backend, no live instance. CI runs them on every PR (ci.yml). README summaries and Δ columns compare against committed baselines in evals/baselines/. Refresh locally with scripts/update-eval-results.sh; the optional pre-push hook (scripts/install-githooks.sh) runs tests and updates those files before you push.

go test ./evals/ -count=1                        # contract + token benchmark
go test ./evals/ -run TestEvalSuite -count=1      # MCP contract only
go test ./evals/ -run TestTokenBenchmark -count=1 # token economy only

How to read the tables below

  • Contract eval — hermetic MCP calls against a fake OpenGrok backend; checks outputs, errors, and pagination fields. Δ is change vs the committed baseline in evals/baselines/. coverage@K is the fraction of eval cases exercised.
  • Token benchmark — same harness, but measures UTF-8 bytes crossing the MCP wire (tool schemas, requests, responses). Est. tokens = bytes ÷ 4 (rough heuristic, not a specific model tokenizer).
  • Surfacefull (fine-grained tools), compact (4 consolidated tools, default), or gateway (experimental discover + call).
  • ListTools — one-time cost when the client loads the tool list and schemas at session start; usually the largest line item on full.
  • Warm totalListTools plus all tool calls in a scenario (request + response bytes). For gateway, warm excludes the one-time opengrok_discover call; cold includes it (first session only). On full and compact, cold = warm.
  • Min–max — range across the four replay scenarios (symbol lookup, file browse, multi-step symbol search, search-and-read). Per-scenario breakdown is in the collapsed table.
  • Δ on token rows — change in estimated tokens vs the last committed baseline (evals/baselines/token_report.json).

Contract eval

Last run: 2026-06-24 · direct-call · harness docs →

10/10 passed · 100% (Δ ±0) · 100% coverage@K — see How to read the tables for Δ and coverage@K.

Per-tool scores
ToolScoreCases
get_file_context100% (Δ ±0)1
list_projects100% (Δ ±0)1
list_symbols100% (Δ ±0)1
read_file100% (Δ ±0)1
search_code100% (Δ ±0)4
search_symbol_definitions100% (Δ ±0)1
search_symbol_references100% (Δ ±0)1

Token economy benchmark

Last run: 2026-06-24 · deterministic-replay · est. tokens = bytes÷4 (heuristic, not model-exact)

ListTools dominates session cost on the full surface (18 tools). Compact (4) and gateway (2) register far fewer schemas.

SurfaceListTools (est. tokens)Warm total min–max (est. tokens)
full14k (Δ ±0)14k–15k (Δ ±0)
compact3.5k (Δ ±0)4.4k–6.8k (Δ ±0)
gateway261 (Δ ±0)1.2k–2.1k (Δ ±0)

Warm = ListTools + scenario tool traffic. Gateway warm omits one-time discover; full/compact cold = warm. Compact file-exploration skips files.list (no compact op).

Per-scenario warm totals (est. tokens; ListTools + calls)
Scenariofullcompactgateway
Compound symbol15k (Δ ±0)4.8k (Δ ±0)1.6k (Δ ±0)
File exploration14k (Δ ±0)4.4k (Δ ±0)1.2k (Δ ±0)
Symbol investigation (3 calls)15k (Δ ±0)5.3k (Δ ±0)2.1k (Δ ±0)
Search + read15k (Δ ±0)4.5k (Δ ±0)1.3k (Δ ±0)

Δ vs baseline from 2026-06-24.

Development

This project uses GitHub Spec Kit for non-trivial feature planning.

For meaningful behavior changes, new MCP tools, schema changes, configuration changes, or changes that affect agent-facing behavior, contributors should start from the project constitution:

  • .specify/memory/constitution.md

Feature work should generally produce:

  • specs/FEATURE/spec.md
  • specs/FEATURE/plan.md
  • specs/FEATURE/tasks.md

Small bug fixes, documentation edits, dependency bumps, and mechanical refactors do not require a full Spec Kit workflow unless they affect the public MCP contract.

All changes must preserve the MCP contract, OpenGrok semantics, security posture, compatibility expectations, and documentation requirements described in the constitution.

Known Limitations

  • Large project traversal is bounded, and some search and discovery operations are best-effort rather than language-semantic.
  • HTTP transport is intended for controlled local or internal setups and does not add inbound client authentication.

See docs/limitations.md for the detailed current list, behavioral impact, and mitigations.

License

opengrok-go-mcp is licensed under the Apache License 2.0 (Apache-2.0) for new releases starting with v0.3.0-beta.2.

Earlier published releases up to and including v0.3.0-beta.1 were released under CC0-1.0 and remain available under those terms.