OpenGrok

OpenGrok MCP Server is a VS Code extension that implements the Model Context Protocol (MCP) to connect your organization's OpenGrok indexes to GitHub Copilot Chat. It gives your AI assistant the repository context it needs to search, navigate, and understand large codebases from natural-language prompts.

OpenGrok MCP Server

MCP server bridging OpenGrok search engine with AI for instant context across massive codebases

Overview

💡 Self-Contained Architecture: The VS Code extension includes the MCP server pre-packaged. You don't need Python, external Node.js installations, or complex environment setups. Just install and go.


Installation

Option 1 — VS Code Extension (Recommended)

Install OpenGrok MCP from the VS Code Marketplace, or search "OpenGrok" in the Extensions panel.

The extension provides a visual configuration UI and manages the MCP server process automatically.

Option 2 — npm / npx

Global install:

npm install -g opengrok-mcp-server
opengrok-mcp setup      # interactive wizard: URL, credentials, MCP client registration

Or run without installing:

npx opengrok-mcp-server setup

The wizard stores credentials securely in the OS keychain (macOS Keychain, Windows Credential Manager, Linux libsecret) with an encrypted file fallback for headless Linux.


Configuration Guide

  1. Provide Connection Details:

    • After installation, the Settings panel will launch.
    • Input your OpenGrok endpoint, username, and password. Hit Save Settings. (Credentials are locked in your native OS keychain).
    • The plugin verifies the connection instantly. On your first run, VS Code will ask you to Reload the Window to register the MCP tools.
    • (Need to change this later? Use the OpenGrok: Manage Configuration command or click the gear icon in the status bar).
  2. Activate the MCP Source in Copilot:

    • Launch the GitHub Copilot Chat window. Ensure you're using Agent mode.
    • Click the paperclip/tools icon (🔧) in the prompt box.
    • (If an Update Tools button appears, click it).
    • Locate OpenGrok in the list, check the box, and confirm.

⚠️ Note that VS Code manages tool authorizations per workspace. If you open a different repository, you may need to re-check the OpenGrok box in Copilot.

CLI Commands (v7.0+)

| Command | Description |
| --- | --- |
| npx opengrok-mcp-server setup | Interactive wizard: configures your MCP client and stores credentials securely |
| opengrok-mcp status | Health check: validates connectivity and detects installed MCP clients. Reads config from ~/.claude.json, ~/.copilot/mcp-config.json, or Codex TOML when OPENGROK_BASE_URL is not in env |
| opengrok-mcp --version | Print version and exit |

setup supports Claude Code CLI, GitHub Copilot CLI, and Codex CLI. VS Code is configured automatically by the extension — no CLI step needed. Credentials are stored in the OS keychain with an AES-256-GCM encrypted file fallback for headless/CI environments.

🔌 Third-Party Client Support

While tailored for VS Code, the bundled server also works with other agents that natively support MCP, including:

Claude Desktop | Cursor IDE | Windsurf | Claude Code | Google Antigravity

👉 Refer to MCP_CLIENTS.md for configuration snippets and advanced daemon setups.


Prompting Examples

Talk to GitHub Copilot Chat naturally about your codebase:

Find the implementation of the render_pipeline function within the graphics engine project.

Retrieve the contents of /src/utils/math.cpp from line 450 to 520.

What is the definition of TextureManager? Please show me the header file declaration too.

Look for all places in the code where ThreadPool is instantiated or referenced.

Tool Reference

Primary Operations

| Tool Name | Purpose |
| --- | --- |
| opengrok_search_code | General search utility (full-text, defs, refs, path, history). Supports file_type filtering. |
| opengrok_find_file | Locate files by name or directory pattern. |
| opengrok_get_file_content | Read source code (requires start_line and end_line for large files). |
| opengrok_get_file_history | Retrieve commit history logs. |
| opengrok_browse_directory | View folder structure and contained files. |
| opengrok_list_projects | See all indexed repositories/projects. |
| opengrok_get_file_annotate | See line-by-line git blame information. |
| opengrok_get_file_symbols | Extract classes, functions, macros, and structs rapidly from a single file. |
| opengrok_search_suggest | Get query autocomplete recommendations. |

🚀 Optimized Workflows (Compound Tools)

💡 These specialized tools merge multiple network requests into a single operation, reducing API chatter and cutting token usage by roughly 73% to 92%.

| Compound Tool | Functionality Replaced | Efficiency Gain |
| --- | --- | --- |
| opengrok_get_symbol_context | 1) searches definition, 2) reads source, 3) fetches headers, 4) gets references | ~92% fewer tokens |
| opengrok_search_and_read | 1) executes search, 2) immediately fetches surrounding code context | ~92% fewer tokens |
| opengrok_batch_search | Combines 2-5 individual search queries; deduplicates file:line hits across queries | ~73% fewer tokens |
| opengrok_index_health | Checks latency, backend connectivity, staleness score, and latency trend | Diagnostic utility |

(Note: The search functions support language filtering. Pass file_type as java, cxx, python, golang, etc.)
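As a rough illustration of the file:line deduplication that opengrok_batch_search is described as performing, the sketch below merges hits from several queries and keeps one entry per location. The hit shape ({ path, line, text }) and the function name mergeHits are assumptions made for this example, not the server's actual data model.

```javascript
// Hypothetical hit shape: { path, line, text }. Not the server's real schema.
function mergeHits(resultsPerQuery) {
  const seen = new Map(); // key "path:line" -> first hit seen for that location
  for (const hits of resultsPerQuery) {
    for (const hit of hits) {
      const key = `${hit.path}:${hit.line}`;
      if (!seen.has(key)) seen.set(key, hit);
    }
  }
  return [...seen.values()]; // Map preserves insertion order
}

const merged = mergeHits([
  [{ path: "src/pool.cpp", line: 12, text: "ThreadPool pool;" }],
  [
    { path: "src/pool.cpp", line: 12, text: "ThreadPool pool;" },
    { path: "src/main.cpp", line: 40, text: "pool.start();" },
  ],
]);
console.log(merged.length); // 2
```

Keying on the combined path and line means the same line found by two different queries is reported once, which is where the token saving would come from.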

🔍 Investigation & Analysis Tools (v5.6+)

| Tool | Purpose |
| --- | --- |
| opengrok_what_changed | Recent line changes grouped by commit: author, date, SHA, changed lines with context. Parameters: project, path, since_days |
| opengrok_dependency_map | BFS traversal of #include/import chains up to configurable depth (1–3); directed graph with uses/used_by |
| opengrok_search_pattern | Regex code search via regexp=true; returns file:line:content matches |
| opengrok_blame | Git blame with line range (start_line/end_line); returns author, date, commit per line (v5.6+) |
| opengrok_call_graph | Call chain tracing via OpenGrok API v2 /symbol/{name}/callgraph (requires OPENGROK_API_VERSION=v2) |
| opengrok_get_file_diff | Unified diff between two revisions with full context lines, so the AI can see the surrounding code and understand why a change was made; use opengrok_get_file_history to discover revision hashes |
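To make the depth-limited BFS behind opengrok_dependency_map more concrete, here is a minimal sketch over a hand-written include graph. The graph contents, the function name dependencyMap, and the returned { nodes, edges } shape are all hypothetical; the real tool builds its graph from OpenGrok data and may report it differently.

```javascript
// Hypothetical adjacency list: file -> files it includes/imports.
const includes = {
  "main.cpp": ["engine.h", "util.h"],
  "engine.h": ["render.h"],
  "render.h": ["gpu.h"],
  "util.h": [],
};

// Breadth-first traversal that stops expanding once maxDepth is reached.
function dependencyMap(graph, root, maxDepth) {
  const depthOf = new Map([[root, 0]]);
  const edges = []; // directed edges { from, to } meaning "from uses to"
  const queue = [root];
  while (queue.length > 0) {
    const file = queue.shift();
    const depth = depthOf.get(file);
    if (depth >= maxDepth) continue; // do not expand beyond the depth limit
    for (const dep of graph[file] ?? []) {
      edges.push({ from: file, to: dep });
      if (!depthOf.has(dep)) {
        depthOf.set(dep, depth + 1);
        queue.push(dep);
      }
    }
  }
  return { nodes: [...depthOf.keys()], edges };
}

const map = dependencyMap(includes, "main.cpp", 2);
console.log(map.nodes); // [ 'main.cpp', 'engine.h', 'util.h', 'render.h' ]
```

With a depth of 2, render.h is reached but not expanded, so gpu.h stays out of the result.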

🧠 Memory Tools (Code Mode only, v5.4+)

| Tool | Purpose |
| --- | --- |
| opengrok_memory_status | Shows both memory files (status, bytes, 3-line preview), which helps the LLM decide whether to read them |
| opengrok_read_memory | Read active-task.md or investigation-log.md from the Living Document memory bank |
| opengrok_update_memory | Write or append to memory files; auto-timestamps investigation-log.md entries |

🧬 Code Mode (v5+) — For Large Multi-Language Codebases

Set OPENGROK_CODE_MODE=true to switch to a 5-tool interface optimised for multi-step investigations:

| Tool | Purpose |
| --- | --- |
| opengrok_api | Get the full API spec (call once at session start). With OPENGROK_ENABLE_ELICITATION=true, also prompts the user to select a working project if none is configured. |
| opengrok_execute | Run JavaScript in a sandboxed QuickJS VM with access to all OpenGrok operations via env.opengrok.* |

All env.opengrok.* calls appear synchronous inside your code — the sandbox bridges async HTTP calls transparently using a SharedArrayBuffer + Atomics channel. Token savings of 80–95% are typical for complex investigations.

v9.0+ sandbox methods for interactive prompts and AI assistance:

| Method | Purpose |
| --- | --- |
| env.opengrok.elicit(message, schema) | Pause execution and ask the user to select from a list (e.g., pick the correct file from multiple matches). Returns { action, content }. Requires OPENGROK_ENABLE_ELICITATION=true. |
| env.opengrok.sample(prompt, opts?) | Request an AI-generated string from the client's LLM (e.g., to reformulate a zero-result query). Returns a string, or null when the client doesn't support sampling. Always null-guard the result. |

When env.opengrok.search() returns zero results and sampling is available, _suggestions: string[] is automatically injected into the result — check it before calling sample() explicitly.

// Example opengrok_execute code
const refs = env.opengrok.search("handleCrash", { searchType: "refs", maxResults: 5 });
const first = refs.results[0];
// Guard against an empty result set before dereferencing the first hit
if (!first) return { callerFile: null, code: null };
const line = first.matches[0].lineNumber;
const content = env.opengrok.getFileContent(first.project, first.path, {
  startLine: Math.max(1, line - 5), // clamp so the range never starts before line 1
  endLine: line + 10,
});
return { callerFile: first.path, code: content.content };
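The zero-result behaviour described above (check _suggestions first, then fall back to sample() with a null guard) can be sketched as follows. The env object here is a hypothetical stub so the snippet runs under plain Node.js; inside opengrok_execute the real env is provided for you, and the helper name searchWithFallback is an assumption made for this example.

```javascript
// Stand-in for the sandbox's env object so the sketch runs on its own.
// Inside opengrok_execute, env is provided for you; this stub is hypothetical.
const env = {
  opengrok: {
    search(query) {
      if (query === "handleCrash") return { results: [{ path: "src/crash.cpp" }] };
      return { results: [], _suggestions: ["handleCrash"] };
    },
    sample() {
      return null; // simulates a client without sampling support
    },
  },
};

function searchWithFallback(query) {
  const first = env.opengrok.search(query);
  if (first.results.length > 0) return first;

  // Prefer the auto-injected suggestions before asking for a new sample.
  let retry = first._suggestions?.[0];
  if (!retry) {
    const sampled = env.opengrok.sample(`Suggest a better search term for: ${query}`);
    if (sampled !== null) retry = sampled.trim(); // always null-guard sample()
  }
  return retry ? env.opengrok.search(retry) : first;
}

const found = searchWithFallback("handelCrash");
console.log(found.results[0].path); // src/crash.cpp
```

When neither a suggestion nor a sample is available, the function returns the original empty result rather than throwing, which mirrors the graceful degradation the server describes.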

The sandbox exposes a Living Document Memory Bank — two persistent markdown files that survive across turns:

| File | Size Limit | Purpose |
| --- | --- | --- |
| active-task.md | ≤ 4 KB | Current task state: task:, last_symbol:, next_step:, open_questions:, status: |
| investigation-log.md | ≤ 32 KB | Append-only log of findings, grouped by ## YYYY-MM-DD HH:MM: headings |

Access via env.opengrok.readMemory(filename) / env.opengrok.writeMemory(filename, content) inside the sandbox, or via the opengrok_read_memory / opengrok_update_memory / opengrok_memory_status tools in classic mode. Delta encoding returns [unchanged] on repeated reads; richness-scored trimming keeps the most valuable log entries when space is tight.
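A minimal sketch of writing active-task.md through the memory API might look like the following. The in-memory env stub, the saveActiveTask helper, and the choice to throw when the 4 KB limit would be exceeded are assumptions for illustration; the document does not say how the real server handles oversized writes, and Buffer is used here because the snippet targets plain Node.js rather than the QuickJS sandbox.

```javascript
// Hypothetical in-memory stand-in for the sandbox's memory API.
const files = new Map();
const env = {
  opengrok: {
    readMemory: (name) => files.get(name) ?? "",
    writeMemory: (name, content) => {
      files.set(name, content);
    },
  },
};

const ACTIVE_TASK_LIMIT = 4 * 1024; // active-task.md is documented as ≤ 4 KB

function saveActiveTask(state) {
  // Field names follow the list given for active-task.md.
  const body = [
    `task: ${state.task}`,
    `last_symbol: ${state.lastSymbol}`,
    `next_step: ${state.nextStep}`,
    `open_questions: ${state.openQuestions}`,
    `status: ${state.status}`,
  ].join("\n");
  if (Buffer.byteLength(body, "utf8") > ACTIVE_TASK_LIMIT) {
    throw new Error("active-task.md would exceed 4 KB");
  }
  env.opengrok.writeMemory("active-task.md", body);
  return body;
}

saveActiveTask({
  task: "trace handleCrash callers",
  lastSymbol: "handleCrash",
  nextStep: "read src/crash.cpp",
  openQuestions: "none",
  status: "in-progress",
});
console.log(env.opengrok.readMemory("active-task.md").split("\n")[0]);
// task: trace handleCrash callers
```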

⚙️ Automated Compilation Data (Optional)
| Tool Name | Capability |
| --- | --- |
| opengrok_get_compile_info | Reads your local compile_commands.json to extract compiler flags, defines, and include directories for exact C/C++ accuracy. |

Project Picker & Interactive Disambiguation (Elicitation)

When OPENGROK_ENABLE_ELICITATION=true, the server uses MCP Elicitation in two places:

  1. Session start: opengrok_api (Code Mode) prompts the user to select a working project if no OPENGROK_DEFAULT_PROJECT is configured and more than one project exists.
  2. Mid-execution: Sandbox JS can call env.opengrok.elicit(message, schema) to ask the user to choose between multiple matching files, revisions, or projects at any point during execution.

Requires a client that supports MCP Elicitation:

  • Claude Code v2.1.76+ ✓

Enable in the VS Code configuration panel, or set OPENGROK_ENABLE_ELICITATION=true in your MCP client environment config. The server degrades gracefully to { action: "cancel" } on unsupported clients — no errors.
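The graceful-degradation path can be handled in sandbox code roughly as sketched below. The env stub simulates an unsupported client by always returning { action: "cancel" }; the schema shape (a JSON Schema object with an enum) and the pickFile helper are assumptions for this example, and "accept" is the action name the MCP Elicitation specification uses for a submitted answer.

```javascript
// Hypothetical stub: behaves like a client without Elicitation support.
const env = {
  opengrok: {
    elicit: () => ({ action: "cancel" }),
  },
};

function pickFile(candidates) {
  // Schema shape is an assumption modeled on JSON Schema enums.
  const answer = env.opengrok.elicit("Which file did you mean?", {
    type: "object",
    properties: { file: { type: "string", enum: candidates } },
    required: ["file"],
  });
  // Only trust the answer when the user accepted and chose a known candidate.
  if (answer.action === "accept" && candidates.includes(answer.content?.file)) {
    return answer.content.file;
  }
  return candidates[0]; // fall back to the first match on cancel or decline
}

console.log(pickFile(["src/a/math.cpp", "src/b/math.cpp"])); // src/a/math.cpp
```

Falling back to the first candidate is only one possible policy; returning all candidates to the model would be an equally reasonable choice.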

LLM Sampling

The server delegates LLM calls back to the client via MCP Sampling — using the client's model subscription without needing separate API keys. Used in three places:

  1. Sandbox error explanation — When opengrok_execute code fails, sampling generates a concise explanation and fix suggestion.
  2. Dependency graph summarization — Large opengrok_dependency_map graphs (>10 nodes) are summarized via sampling in legacy mode.
  3. Zero-result query reformulation (v9.0+, Code Mode) — When env.opengrok.search() returns 0 results, sampling auto-injects _suggestions into the result object. Sandbox JS can also call env.opengrok.sample(prompt) explicitly for any AI-generated text.

Supported clients:

The server degrades gracefully when sampling is unavailable — sample() returns null, _suggestions is not injected.


VS Code Integration

Palette Commands

| Command Prompt | Action Performed |
| --- | --- |
| OpenGrok: Manage Configuration | Launches the interactive settings GUI |
| OpenGrok: Configure Credentials | Fast CLI-style input for authentication |
| OpenGrok: Test Connection | Validates API access and token validity |
| OpenGrok: Show Server Logs | Exposes background process stdout/stderr |
| OpenGrok: Check for Updates | Polls GitHub for new releases |
| OpenGrok: Status Menu | Opens the context menu directly |

Core Settings Profile

JSON Settings Reference

| Key | Format | Primary Usage |
| --- | --- | --- |
| opengrok-mcp.baseUrl | string | The URI of your OpenGrok deployment |
| opengrok-mcp.username | string | Authentication identity |
| opengrok-mcp.verifySsl | boolean | Disable when using corporate self-signed certs (default: false) |
| opengrok-mcp.proxy | string | Optional HTTP traffic router |

Advanced Configuration (v7 — env vars)

For the standalone server (npx opengrok-mcp-server or Claude Code), set these environment variables:

Core Settings

| Variable | Values | Description |
| --- | --- | --- |
| OPENGROK_BASE_URL | URL | OpenGrok server base URL (required) |
| OPENGROK_USERNAME | string | Authentication username (optional; leave unset for anonymous access) |
| OPENGROK_PASSWORD | string | Authentication password (prefer OS keychain via npx opengrok-mcp-server setup) |
| OPENGROK_VERIFY_SSL | true (default) / false | Set to false to disable TLS verification for self-signed certs |
| OPENGROK_TIMEOUT | integer (seconds, default: 30) | HTTP request timeout |
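To show how the core variables and their documented defaults fit together, here is a small sketch that reads them from an environment-like object. The function name readCoreConfig and the shape of the returned object are hypothetical; only the variable names and defaults come from the table, and the real server's parsing may differ.

```javascript
// Sketch of reading the core variables. The function name and return shape
// are hypothetical; only the variable names and defaults are documented.
function readCoreConfig(vars) {
  if (!vars.OPENGROK_BASE_URL) {
    throw new Error("OPENGROK_BASE_URL is required");
  }
  const timeout = Number.parseInt(vars.OPENGROK_TIMEOUT ?? "30", 10);
  return {
    baseUrl: new URL(vars.OPENGROK_BASE_URL).toString(), // throws on a malformed URL
    username: vars.OPENGROK_USERNAME ?? null, // null means anonymous access
    verifySsl: vars.OPENGROK_VERIFY_SSL !== "false", // defaults to true
    timeoutSeconds: Number.isInteger(timeout) && timeout > 0 ? timeout : 30,
  };
}

const config = readCoreConfig({
  OPENGROK_BASE_URL: "https://opengrok.example.com/source",
  OPENGROK_VERIFY_SSL: "false",
});
console.log(config.timeoutSeconds, config.verifySsl); // 30 false
```

In a real process you would pass process.env instead of the literal object used here.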

Code Mode & Performance

| Variable | Values | Description |
| --- | --- | --- |
| OPENGROK_CODE_MODE | true (default) / false | Switch to 5-tool Code Mode (opengrok_api + opengrok_execute + 3 memory tools) |
| OPENGROK_CONTEXT_BUDGET | standard (default) / minimal / generous | Response size tier: 8 KB / 4 KB / 16 KB |
| OPENGROK_RESPONSE_FORMAT_OVERRIDE | tsv / toon / yaml / text / markdown | Force a response format globally for all tools |
| OPENGROK_DEFAULT_PROJECT | string | Default project name to scope all searches |
| OPENGROK_DEFAULT_MAX_RESULTS | integer (default: 25) | Default search result limit |
| OPENGROK_LOCAL_COMPILE_DB_PATHS | comma-separated paths | Paths to compile_commands.json for C/C++ compiler flag extraction |
| OPENGROK_ENABLE_CACHE_HINTS | true / false (default: false) | Enable cache-control: immutable hints for prompt caching infrastructure |

Memory Bank

| Variable | Values | Description |
| --- | --- | --- |
| OPENGROK_MEMORY_BANK_DIR | path | Override directory for active-task.md + investigation-log.md files |
| OPENGROK_ENABLE_OBSERVATION_MASKER | true / false (default: false) | Prepend compact history summaries to opengrok_execute results after the full-text window fills. Only useful for clients that truncate context (not Claude Code or Cursor). |
| OPENGROK_OBSERVATION_MASKER_TURNS | integer (default: 10) | Full-text window size: how many of the most-recent opengrok_execute results to keep in full before older ones are replaced with compact summaries. |

Rate Limiting

| Variable | Values | Description |
| --- | --- | --- |
| OPENGROK_RATELIMIT_ENABLED | true (default) / false | Enable token-bucket rate limiting |
| OPENGROK_RATELIMIT_RPM | integer (default: 60) | Global requests-per-minute limit |
| OPENGROK_PER_TOOL_RATELIMIT | tool:rpm,tool:rpm | Per-tool RPM overrides (e.g., opengrok_execute:10,opengrok_batch_search:20) |
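The per-tool override format and the idea of a token bucket that works in whole tokens can be sketched as follows. Both parsePerToolLimits and the TokenBucket class are illustrative assumptions; the document only states that rate limiting uses a token bucket, not how the server implements it.

```javascript
// Parse "tool:rpm,tool:rpm" into a Map; malformed entries are skipped.
function parsePerToolLimits(spec) {
  const limits = new Map();
  for (const entry of spec.split(",")) {
    const [tool, rpm] = entry.split(":").map((part) => part.trim());
    const value = Number.parseInt(rpm, 10);
    if (tool && Number.isInteger(value) && value > 0) limits.set(tool, value);
  }
  return limits;
}

// Integer token bucket: one token per request, refilled in whole tokens only.
class TokenBucket {
  constructor(rpm, nowMs) {
    this.capacity = rpm;
    this.tokens = rpm;
    this.intervalMs = Math.max(1, Math.floor(60000 / rpm)); // ms to earn one token
    this.lastRefillMs = nowMs;
  }
  tryConsume(nowMs) {
    const earned = Math.floor((nowMs - this.lastRefillMs) / this.intervalMs);
    if (earned > 0) {
      this.tokens = Math.min(this.capacity, this.tokens + earned);
      this.lastRefillMs += earned * this.intervalMs; // keep the unused remainder
    }
    if (this.tokens === 0) return false;
    this.tokens -= 1;
    return true;
  }
}

const limits = parsePerToolLimits("opengrok_execute:10,opengrok_batch_search:20");
const bucket = new TokenBucket(limits.get("opengrok_execute"), 0);
let allowed = 0;
for (let i = 0; i < 12; i++) {
  if (bucket.tryConsume(0)) allowed += 1;
}
const afterSixSeconds = bucket.tryConsume(6000);
console.log(allowed, afterSixSeconds); // 10 true
```

Timestamps are passed in explicitly so the behaviour is deterministic; a live limiter would pass Date.now().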

Response Cache

| Variable | Values | Description |
| --- | --- | --- |
| OPENGROK_CACHE_ENABLED | true (default) / false | Enable TTL response cache |
| OPENGROK_CACHE_MAX_SIZE | integer (default: 500) | Max cache entries |
| OPENGROK_CACHE_SEARCH_TTL | seconds (default: 300) | Search result cache TTL |
| OPENGROK_CACHE_FILE_TTL | seconds (default: 600) | File content cache TTL |
| OPENGROK_CACHE_HISTORY_TTL | seconds (default: 1800) | File history cache TTL |
| OPENGROK_CACHE_PROJECTS_TTL | seconds (default: 3600) | Project list cache TTL |
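A TTL cache with a size cap and per-category lifetimes, like the one these variables configure, could be sketched along these lines. The TtlCache class, its oldest-first eviction policy, and the cache key format are assumptions for illustration; the document does not describe the server's eviction strategy.

```javascript
// Minimal TTL cache with a size cap; the clock is injected so it is easy to test.
class TtlCache {
  constructor(maxSize, now = () => Date.now()) {
    this.maxSize = maxSize;
    this.now = now;
    this.entries = new Map(); // key -> { value, expiresAt }
  }
  set(key, value, ttlSeconds) {
    this.entries.delete(key); // re-insert so the key becomes the newest entry
    if (this.entries.size >= this.maxSize) {
      // Evict the oldest inserted key (Map iterates in insertion order).
      this.entries.delete(this.entries.keys().next().value);
    }
    this.entries.set(key, { value, expiresAt: this.now() + ttlSeconds * 1000 });
  }
  get(key) {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // expired: drop it and report a miss
      return undefined;
    }
    return entry.value;
  }
}

let clock = 0;
const cache = new TtlCache(500, () => clock);
cache.set("search:ThreadPool", ["hit"], 300); // default search TTL
cache.set("file:src/main.cpp", "int main() {}", 600); // default file TTL
clock = 301 * 1000;
console.log(cache.get("search:ThreadPool")); // undefined (expired)
console.log(cache.get("file:src/main.cpp")); // int main() {}
```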

Security & Audit

| Variable | Values | Description |
| --- | --- | --- |
| OPENGROK_AUDIT_LOG_FILE | path | File path for structured audit log (CSV or JSON) |

MCP Protocol Features

| Variable | Values | Description |
| --- | --- | --- |
| OPENGROK_ENABLE_ELICITATION | true / false (default: false) | Enable project picker at opengrok_api startup (Code Mode) and env.opengrok.elicit() in sandbox. Requires a supporting MCP client. |
| OPENGROK_ENABLE_FILES_API | true / false (default: false) | Enable FileReferenceCache for investigation-log.md (SHA-256 content-addressed) |
| OPENGROK_SAMPLING_MODEL | string | Model preference for MCP Sampling (error explanation, graph summarization) |
| OPENGROK_SAMPLING_MAX_TOKENS | integer (default: 256, max: 4096) | Token budget for MCP Sampling responses |

OpenGrok API

| Variable | Values | Description |
| --- | --- | --- |
| OPENGROK_API_VERSION | v1 (default) / v2 | OpenGrok REST API version (v2 required for opengrok_call_graph) |

HTTP Transport (v7.0+)

| Variable | Values | Description |
| --- | --- | --- |
| OPENGROK_HTTP_PORT | integer | Expose Streamable HTTP transport on this port (in addition to stdio) |
| OPENGROK_HTTP_MAX_SESSIONS | integer (default: 100) | Max concurrent HTTP sessions before new connections are rejected |
| OPENGROK_HTTP_AUTH_TOKEN | string | Static Bearer token for HTTP endpoint authentication |
| OPENGROK_JWKS_URI | URL | JWKS endpoint for JWT validation (OAuth 2.1 resource server mode) |
| OPENGROK_RESOURCE_URI | URL | This server's resource URI, advertised in RFC 9728 metadata |
| OPENGROK_AUTH_SERVERS | comma-separated URLs | Trusted authorization server URIs |
| OPENGROK_SCOPE_MAP | scope:role,... | Map JWT scopes to RBAC roles (e.g., read:readonly,admin:admin) |
| OPENGROK_STRICT_OAUTH | true / false | Reject requests without a valid JWT when OPENGROK_JWKS_URI is set |
| OPENGROK_ALLOWED_ORIGINS | comma-separated origins | CORS allowlist (replaces wildcard CORS) |
| OPENGROK_RBAC_TOKENS | tok1:role,tok2:role | Role-based access tokens: admin / developer / readonly |

Logging

| Variable | Values | Description |
| --- | --- | --- |
| OPENGROK_LOG_LEVEL | debug / info (default) | Verbose structured logging to stderr |

VS Code users can set opengrok-mcp.codeMode, opengrok-mcp.contextBudget, opengrok-mcp.memoryBankDir, opengrok-mcp.defaultProject, opengrok-mcp.responseFormatOverride, opengrok-mcp.compileDbPaths, opengrok-mcp.enableObservationMasker, and opengrok-mcp.observationMaskerTurns in VS Code settings instead.

MCP SDK Note: This version uses @modelcontextprotocol/sdk v1.29.0. MCP SDK v2 is in pre-alpha; we will migrate when stable (expected Q3-Q4 2026). v2 will enable enhanced completions for tool parameters and resource templates.


HTTP Transport (v7.0+)

By default the server communicates over stdio (standard MCP). For team deployments, you can also expose a Streamable HTTP endpoint:

OPENGROK_HTTP_PORT=3666 npm run serve
# or add to your MCP client config:
# "OPENGROK_HTTP_PORT": "3666"

Session Management

  • Each HTTP client receives an isolated McpServer instance (per-session factory pattern)
  • Sessions expire after 30 minutes of inactivity; OPENGROK_HTTP_MAX_SESSIONS caps concurrent sessions (default: 100)
  • GET /mcp/sessions returns JSON with active session count and oldest session age

Authentication

Configure one of the following:

| Method | Config |
| --- | --- |
| Static Bearer token | OPENGROK_HTTP_AUTH_TOKEN=mysecret |
| OAuth 2.1 resource server | OPENGROK_JWKS_URI=https://idp.example.com/.well-known/jwks.json + OPENGROK_RESOURCE_URI=https://opengrok-mcp.example.com |
| RBAC with named roles | OPENGROK_RBAC_TOKENS='alice-token:admin,bot-token:readonly' |

In resource server mode, this server validates JWTs issued by your own IdP — there is no built-in /token endpoint. RFC 9728 protected resource metadata is served at /.well-known/oauth-protected-resource.

RBAC Roles

| Role | Permissions |
| --- | --- |
| admin | Full access to all tools and configuration |
| developer | All search, read, memory, and code tools |
| readonly | Search and read tools only; no memory writes, no code execution |

Fail-safe: unknown or missing tokens default to readonly, not admin.
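The fail-safe rule can be illustrated with a short sketch that parses the OPENGROK_RBAC_TOKENS format and resolves a role. The helper names parseRbacTokens and roleFor are hypothetical, and the plain Map lookup is a simplification: the server is documented as comparing Bearer tokens with crypto.timingSafeEqual, which this sketch does not attempt.

```javascript
const ROLES = new Set(["admin", "developer", "readonly"]);

// Parse "tok1:role,tok2:role"; entries with an unknown role are ignored.
function parseRbacTokens(spec) {
  const table = new Map();
  for (const entry of spec.split(",")) {
    const idx = entry.lastIndexOf(":");
    if (idx <= 0) continue; // no separator, or empty token
    const token = entry.slice(0, idx).trim();
    const role = entry.slice(idx + 1).trim();
    if (token && ROLES.has(role)) table.set(token, role);
  }
  return table;
}

// Fail-safe lookup: anything not in the table gets the least-privileged role.
function roleFor(table, token) {
  return (token && table.get(token)) || "readonly";
}

const table = parseRbacTokens("alice-token:admin,bot-token:readonly");
console.log(roleFor(table, "alice-token")); // admin
console.log(roleFor(table, "unknown")); // readonly
console.log(roleFor(table, undefined)); // readonly
```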


Security (v7.0+)

v7.0 followed a comprehensive security audit and includes the following hardening:

| Area | Protection |
| --- | --- |
| SSRF | DNS rebinding detection + IPv6-mapped address blocking in buildSafeUrl |
| Path traversal | NFC normalization + bidirectional Unicode character blocking |
| HTML injection | he.decode on all parser text nodes before display |
| Prompt injection | escapeMarkdownField in all formatters |
| Token comparison | crypto.timingSafeEqual for all Bearer token comparisons |
| CORS | Allowlist via OPENGROK_ALLOWED_ORIGINS (no wildcard in production) |
| Security headers | X-Content-Type-Options, X-Frame-Options, CSP on HTTP responses |
| Credential encryption | AES-256-GCM (migrated from CBC; auto-upgrades existing files) |
| Rate limiting | Integer-based token bucket (eliminates float drift) |
| ReDoS | minimatch for glob patterns |
| Audit logs | Injection-escaped structured audit entries |
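As an illustration of the token comparison row, the sketch below checks a Bearer token with Node's crypto.timingSafeEqual. Hashing both values first is a technique chosen for this example so the buffers always have equal length (timingSafeEqual throws otherwise); the helper names are hypothetical and this is not necessarily how the server prepares its inputs.

```javascript
const { createHash, timingSafeEqual } = require("node:crypto");

// Hash both sides first so the buffers always have equal length;
// timingSafeEqual throws if the lengths differ.
function tokensMatch(presented, expected) {
  const a = createHash("sha256").update(presented, "utf8").digest();
  const b = createHash("sha256").update(expected, "utf8").digest();
  return timingSafeEqual(a, b);
}

// Extract the token from an Authorization header and compare it.
function isAuthorized(authorizationHeader, expectedToken) {
  const match = /^Bearer (.+)$/.exec(authorizationHeader ?? "");
  return match !== null && tokensMatch(match[1], expectedToken);
}

console.log(isAuthorized("Bearer mysecret", "mysecret")); // true
console.log(isAuthorized("Bearer nope", "mysecret")); // false
console.log(isAuthorized(undefined, "mysecret")); // false
```

A plain === comparison can return early on the first differing character, which is the timing side channel a constant-time comparison avoids.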

⚠️ v7.0.0 Breaking Changes

  • OPENGROK_HTTP_CLIENT_ID and OPENGROK_HTTP_CLIENT_SECRET removed. Migrate to OPENGROK_JWKS_URI + OPENGROK_RESOURCE_URI for OAuth 2.1 (resource server model — bring your own IdP).
  • Memory bank migrate() removed — the legacy 6-file layout is no longer supported. The 2-file layout (active-task.md + investigation-log.md) has been the default since v5.4.
  • CORS is now allowlist-only when OPENGROK_ALLOWED_ORIGINS is set; unauthenticated wildcard CORS is disabled.

System Architecture

 [ AI Client ]                       [ Integration Layer ]                    [ Data Source ]
                              │                                 │
 +---------------+            │       +-------------------+     │      +----------------------+
 │ GitHub        │<──(stdio)──┼──────>│ OpenGrok MCP      │<────┼─────>│ OpenGrok REST API &  │
 │ Copilot Chat  │            │       │ Server (Node.js)  │HTTP │      │ Web Interface        │
 +---------------+            │       +-------------------+     │      +----------------------+
      │    ▲                           │          │
      │    │ (Configures & Hosts)      │    (Context Optimization)
      ▼    │                           │          │
 +---------------+                     │   o Context Fetch      │      +----------------------+
 │ VS Code       │                     │   o Multi-Search       │      │ Local File System    │
 │ Extension     │                     │   o Auto-Truncate      │<─────┤ (compile_commands) │
 +---------------+                     │                        │      +----------------------+

The underlying code is completely packaged in the marketplace extension via esbuild. The server uses standard VS Code Node APIs without external VM requirements.


Building & Testing

# Initializing
npm install

# Code Quality & Tests
npm run lint           # Strict TypeScript & ESLint validation
npm test               # Execute the Vitest test suite (1113 tests)
npm run test:sandbox   # Sandbox integration tests (requires compile first)
npm run test:coverage  # Coverage report (≥89% threshold)

# Packaging
npm run compile   # Generate the esbuild artifact (includes sandbox-worker.js)
npm run vsix      # Create the downloadable extension file

We leverage GitHub Actions for automated CD. Tagging a commit (e.g., v1.2.3) automatically triggers the build matrix and attaches artifacts to a new GitHub Release.

For deep-dives into the architecture or PR guidelines, please read CONTRIBUTING.md.


Troubleshooting & Support

The MCP tools are missing in Copilot Chat

  • Click the paperclip (🔧) icon to "Update Tools"
  • Run Developer: Reload Window

"Connection failed" errors

  • Double-check your OPENGROK_BASE_URL
  • Make sure you aren't blocked by corporate VPNs/proxies

401 Unauthorized / Authentication failing

  • Run the OpenGrok: Configure Credentials command to save your username/password again

Self-Signed SSL Certificates

  • Turn off strict validation by setting opengrok-mcp.verifySsl to false

Slow queries or timeouts

  • Limit the scope using the file_type argument or targeting a specific project
  • OpenGrok might be indexing; run opengrok_index_health

Need verbose logs?

  • Set the environment variable OPENGROK_LOG_LEVEL=debug to get verbose structured trace data on stderr

OpenGrok Version Compatibility

| OpenGrok Engine | Status | Known Limitations |
| --- | --- | --- |
| v1.13.x and above | Native Support | None (Full REST API functionality) |
| v1.7.0 to v1.12.x | Legacy Mode | Uses HTML scraping for symbol lookups and blame |
| Below v1.7.0 | Unsupported | Unpredictable behaviour |

License Information

This system is distributed under the PolyForm Noncommercial License 1.0.0.

  • Permitted: Personal use, hobby projects, academic research, education
  • Prohibited: Any commercial, business, enterprise, or paid utilization

Commercial Licensing: To use this extension in an enterprise context (internal tooling, CI pipelines, business infrastructure), a commercial license is strictly required. Reach out to [email protected] for enterprise tier pricing.

Read LICENSE-COMMERCIAL.md for full terms.
