Reexpress MCP Server

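Official

Enable Similarity-Distance-Magnitude statistical verification for your search, software, and data science workflows

Documentation

Reexpress Model-Context-Protocol (MCP) Server

For tool-calling LLMs (e.g., Claude Opus 4.7) and MCP clients running on macOS (Tahoe 26 or later, Apple silicon) or Linux

Video overview¹: here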

Screenshot image of the rendered HTML output from the Reexpress tool.

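The Reexpress MCP Server is a drop-in solution that adds state-of-the-art statistical verification to your complex LLM pipelines, as well as to your everyday use of LLMs for search and question answering, particularly in software development and data science settings. It is the first reliable, statistically robust AI second opinion for your AI workflows.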

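Simply install the MCP server and then append the Reexpress prompt to the end of your chat text. The tool-calling LLM (e.g., Anthropic's Claude Opus 4.7) will then check its response using the provided pre-trained Reexpress Similarity-Distance-Magnitude (SDM) estimator. The estimator ensembles gpt-5.5-2026-04-23, gemini-3.1-pro-preview, and gemini-embedding-2, along with the output from the tool-calling LLM, and calculates a robust estimate of the predictive uncertainty against a database of training and calibration examples from the OpenVerification1 dataset. Unique to the Reexpress method, you can easily adapt the model to your tasks: call the ReexpressAddTrue or ReexpressAddFalse tool after a verification has completed, and subsequent calls to the Reexpress tool will dynamically take your updates into account when calculating the verification probability. We also include the training scripts for the model, so that you can run a full retraining if more substantial changes are needed or if you want to use alternative underlying LLMs.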
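The effect of ReexpressAddTrue and ReexpressAddFalse can be pictured with a deliberately simplified sketch. The code below is not the SDM estimator and does not reproduce its calibration. It is a toy exemplar store that takes a distance-weighted vote over the k nearest labeled examples, and every name in it (`ExemplarStore`, `add`, `estimate`) is hypothetical. It shows only the general idea that appending a labeled example to the support set changes later estimates for nearby inputs without retraining.

```python
import math
from typing import List, Sequence, Tuple


class ExemplarStore:
    """Toy store of labeled vectors; NOT the Reexpress SDM estimator."""

    def __init__(self, k: int = 3) -> None:
        self.k = k
        self._examples: List[Tuple[Tuple[float, ...], bool]] = []

    def add(self, vector: Sequence[float], verified: bool) -> None:
        """Append one labeled example (loosely analogous to AddTrue/AddFalse)."""
        self._examples.append((tuple(vector), verified))

    def estimate(self, vector: Sequence[float]) -> float:
        """Distance-weighted share of 'verified' labels among the k nearest examples."""
        if not self._examples:
            raise ValueError("the store is empty")
        # Pair each stored example with its Euclidean distance to the query.
        nearest = sorted(
            (math.dist(vector, stored), label) for stored, label in self._examples
        )[: self.k]
        # Closer examples receive larger weights.
        weights = [(1.0 / (1.0 + distance), label) for distance, label in nearest]
        total = sum(weight for weight, _ in weights)
        return sum(weight for weight, label in weights if label) / total


store = ExemplarStore(k=3)
store.add([0.0, 0.0], True)
store.add([0.1, 0.0], True)
store.add([1.0, 1.0], False)

query = [0.9, 0.9]
before = store.estimate(query)

# Hypothetical user feedback: two similar outputs turned out to be wrong.
store.add([0.9, 1.0], False)
store.add([1.0, 0.9], False)
after = store.estimate(query)

print(f"before update: {before:.3f}, after update: {after:.3f}")
```

In this toy, the estimate for the query drops once nearby negative examples are added, because they displace the more distant positive examples from the k nearest neighbors.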

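[!NOTE] In addition to giving you (the user) a principled estimate of how far the output can be trusted given your instructions, the tool-calling LLM itself can use the verification output to progressively refine its answer, to determine whether it needs additional external resources or tools, or to recognize that it has reached an impasse and needs to ask you for further clarification or information. This is what we call reasoning with SDM verification: a new capability in the AI toolkit that we believe will open up a much broader range of use cases for LLMs and LLM agents, for both individuals and enterprises.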

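Data is sent only via standard LLM API calls to Azure/OpenAI and Google, with the gemini-3.1-pro-preview calls given standard web-search access through the API. All processing for the SDM estimator is done locally on your computer. Reexpress MCP has a simple, conservative, but effective file access system: you control which files are sent to the LLM APIs by explicitly specifying any additional files via the file access tools ReexpressDirectorySet() and ReexpressFileSet().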
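The allow-list idea behind this file access model can be illustrated with a short sketch. This is not the Reexpress implementation: the `FileAllowList` class and its methods are hypothetical names invented for illustration, and the real ReexpressDirectorySet() and ReexpressFileSet() tools may behave differently. The sketch shows only the general principle that nothing is eligible to be sent unless it has been named explicitly.

```python
from pathlib import Path
from typing import Iterable, Optional, Set


class FileAllowList:
    """Hypothetical allow-list: a file may be sent only if it was named explicitly."""

    def __init__(self) -> None:
        self._directory: Optional[Path] = None
        self._files: Set[Path] = set()

    def set_directory(self, directory: str) -> None:
        """Choose the single directory from which files may be selected."""
        self._directory = Path(directory).resolve()
        self._files = set()  # changing the directory drops earlier selections

    def set_files(self, filenames: Iterable[str]) -> None:
        """Select files by name; reject anything not directly inside the directory."""
        if self._directory is None:
            raise ValueError("set a directory before selecting files")
        selected = set()
        for name in filenames:
            candidate = (self._directory / name).resolve()
            if candidate.parent != self._directory:
                raise ValueError(f"{name!r} is not directly inside the allowed directory")
            selected.add(candidate)
        self._files = selected

    def is_allowed(self, path: str) -> bool:
        """Default-deny check to run before any file content leaves the machine."""
        return Path(path).resolve() in self._files


allow_list = FileAllowList()
allow_list.set_directory("/tmp/project")  # placeholder path for illustration
allow_list.set_files(["analysis.py"])

print(allow_list.is_allowed("/tmp/project/analysis.py"))  # True
print(allow_list.is_allowed("/tmp/project/secrets.env"))  # False
```

The default-deny design means a forgotten selection results in a file not being sent, rather than in an unintended file being sent.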

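What's new in version 2.4.0

The model card is available here.

Version 2.4.0 uses gpt-5.5-2026-04-23 and gemini-3.1-pro-preview as the generative models. As in version 2.3.0.preview, gemini-embedding-2 replaces the local granite-3.3-8b-instruct model as the agreement representation model. This greatly simplifies running the server, since you no longer need to run a multi-billion-parameter model locally. In addition, we have expanded the OpenVerification1 dataset with new examples. See the model card for details.

See changelog.md for further notes.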

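System requirements

The MCP server runs on Linux and macOS. The main requirement is that the machine running the MCP server can run a small PyTorch model of only 3 million parameters locally, so the compute requirements are minimal. (To be clear: 3 million parameters, not 3 billion. The model consists of an SDM activation over gemini-embedding-2 together with the classification outputs of the two API language models.)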
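To put the parameter count in perspective, the following back-of-the-envelope sketch compares the weight storage of a 3-million-parameter model with that of a 3-billion-parameter one. It assumes 32-bit floats (4 bytes per parameter), which is an assumption for illustration and not a documented detail of the Reexpress model; the function name is hypothetical.

```python
BYTES_PER_FLOAT32 = 4  # assumption: weights stored as 32-bit floats


def weight_memory_mb(num_parameters: int,
                     bytes_per_parameter: int = BYTES_PER_FLOAT32) -> float:
    """Return the approximate weight storage in megabytes (10**6 bytes)."""
    return num_parameters * bytes_per_parameter / 1_000_000


small = weight_memory_mb(3_000_000)      # 3 million parameters
large = weight_memory_mb(3_000_000_000)  # 3 billion parameters

print(f"3 million parameters: ~{small:.0f} MB")
print(f"3 billion parameters: ~{large:.0f} MB")
```

Under that assumption the small model needs roughly 12 MB for its weights, a thousandth of the roughly 12 GB a 3-billion-parameter model would need, which is why no dedicated accelerator should be required for the local component.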

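Installation

See INSTALL.md.

[!TIP] The Reexpress MCP server is relatively simple to set up compared with other MCP servers, but we assume some familiarity with LLMs, MCP, and command-line tools. Our target audience is developers and data scientists. Only add other MCP servers from sources that you trust, and keep in mind that other MCP tools could alter the behavior of our MCP server in unexpected ways.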

Configuration options

See CONFIG.md.

How to use

See documentation/HOW_TO_USE.md.

Generating static HTML from the tool-call output

See documentation/OUTPUT_HTML.md.

Guidelines

See documentation/GUIDELINES.md.

FAQ

See documentation/FAQ.md.

Training and calibration data

See documentation/DATA.md.

Evaluation on OpenVerification1

See documentation/EVAL.md.

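System demonstration paper

A copy of our system demonstration paper, "Introspectable, Updatable, and Uncertainty-aware Classification of Language Model Instruction-following" (which focuses on version 2.1.0 of the Reexpress MCP Server), is included here. Supporting scripts for reproducing the analysis are included here.

The model card for version 2.4.0, which highlights the changes since the system demonstration paper, is available here.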

CAIS 2026 system demonstration poster.

Citation

If you find this software useful, please consider citing the following peer-reviewed papers:

@misc{Schmaltz-2025-SimilarityDistanceMagnitudeActivations,
      title={Similarity-Distance-Magnitude Activations}, 
      author={Allen Schmaltz},
      year={2025},
      eprint={2509.12760},
      archivePrefix={arXiv},
      primaryClass={cs.LG},
      url={https://arxiv.org/abs/2509.12760},
      note={To appear in \emph{Findings of the Association for Computational Linguistics: ACL 2026}, San Diego, CA, USA.},
}

@inproceedings{Schmaltz-2026-ReexpressMCPServer,
author = {Schmaltz, Allen},
title = {Introspectable, Updatable, and Uncertainty-aware Classification of Language Model Instruction-following},
year = {2026},
isbn = {9798400724152},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
url = {https://doi.org/10.1145/3786335.3813214},
doi = {10.1145/3786335.3813214},
abstract = {In this system demonstration paper, we introduce an open-source implementation for training and testing Similarity-Distance-Magnitude (SDM) estimators for the task of binary classification of instruction-following of closed-weight language models (LMs). This SDM estimator provides an approximately conditional estimate of the predictive uncertainty over instruction-following, conditional on multiple closed-weight LMs and the representation space of an open-weight model. While it would be more robust to use as input to the SDM estimator the hidden-states of the underlying models, this indirect, compositional proxy is more reliable than verbalized uncertainty and adds a means of auditing the predictions against data with known labels. We release the code as an MCP Server to simplify adding interpretability-by-exemplar and locally updatable, uncertainty-aware instruction-following to agent-based pipelines. We further release OpenVerification1, a balanced set of over two million examples of instruction-following and associated rationales from recent closed-weight LMs, for bootstrapping domain-specific estimators. Finally, we discuss limitations of estimating the predictive uncertainty without access to the hidden-states of the tool-calling LM and provide practical guidance for applications.},
booktitle = {Proceedings of the ACM Conference on AI and Agentic Systems},
pages = {1259--1269},
numpages = {11},
keywords = {Approximately conditional calibration, Interpretability-by-exemplar, Classification of instruction-following, Model ensembles},
location = {
},
series = {CAIS '26}
}

¹ The output format has changed since version v1.0.0, which is the version used in the video. See changelog.md.