Infino MCP Server

Infino — keyword, vector, hybrid, and SQL retrieval over data on object storage, for AI agents.


An MCP server for Infino — it lets an AI agent run keyword, semantic, hybrid, and SQL retrieval over your data on object storage, from any MCP-compatible client (Claude Code, Claude Desktop, Cursor, VS Code, and others). Published on npm as @infino-ai/mcp-server and listed on the official MCP Registry as io.github.infino-ai/mcp-server (which propagates to catalogs like Smithery, Glama, and PulseMCP).

  • No API key. Semantic search embeds queries with a local model — nothing leaves the machine for embedding.
  • Read-only by default. Writes and full SQL are opt-in behind a single environment flag.
  • Bring your own storage. Point it at a local path or your own bucket (S3, GCS, Azure, or any S3-compatible store).

Requirements

  • Node.js ≥ 18 (the server runs as a Node process over stdio).
  • An MCP-compatible client (Claude Code, Claude Desktop, Cursor, VS Code, …).
  • Data reachable by Infino — a local directory, or a bucket with credentials available in the environment (see Storage backends).
  • On first run the server downloads the local embedding model (~90 MB) once and caches it; subsequent runs are offline for embedding.

Quick start

The server is launched by your MCP client over stdio — you don't run it directly in normal use. Every client config follows the same shape: command npx -y @infino-ai/mcp-server, with configuration supplied via environment variables. Set INFINO_MCP_URI to the data you want to serve — a local path or a bucket URI. If it's omitted, the server starts an ephemeral in-process catalog (memory://) that holds no data, so set it for any real use.

{
  "command": "npx",
  "args": ["-y", "@infino-ai/mcp-server"],
  "env": {
    "INFINO_MCP_URI": "/Users/me/.infino/memory"
  }
}

The sections below show the exact place each client expects this block.


Claude Code plugin (one-step install)

For Claude Code, this repo is also a plugin marketplace. Installing the plugin wires up the MCP server plus a how-to-use skill and an /infino-search command in one step — no JSON to edit. Inside Claude Code:

/plugin marketplace add infino-ai/infino-mcp
/plugin install infino@infino-ai

When you enable the plugin, you'll be prompted for your Infino data URI (INFINO_MCP_URI) and whether to enable writes. After that, the infino_* tools, the using-infino skill, and /infino-search <query> are available. (For other clients, use the configs under Client setup below.)


Client setup

Claude Code

Add the server with the CLI. Use --scope user to make it available in every project, or --scope project to commit it to the repo (writes a shared .mcp.json); the default scope is local (this project only).

claude mcp add infino \
  --scope user \
  -e INFINO_MCP_URI=/Users/me/.infino/memory \
  -- npx -y @infino-ai/mcp-server

Add more knobs with repeated -e flags, e.g. -e INFINO_MCP_ENABLE_WRITES=true. Verify with:

claude mcp list
claude mcp get infino

Claude Desktop

Edit the configuration file (create it if it doesn't exist), then fully restart Claude Desktop.

| OS | Path |
| --- | --- |
| macOS | ~/Library/Application Support/Claude/claude_desktop_config.json |
| Windows | %APPDATA%\Claude\claude_desktop_config.json |
| Linux | ~/.config/Claude/claude_desktop_config.json |

{
  "mcpServers": {
    "infino": {
      "command": "npx",
      "args": ["-y", "@infino-ai/mcp-server"],
      "env": {
        "INFINO_MCP_URI": "/Users/me/.infino/memory"
      }
    }
  }
}

Cursor

Add the server to ~/.cursor/mcp.json (available in all projects) or <project>/.cursor/mcp.json (this project only), then reload. The format matches Claude Desktop:

{
  "mcpServers": {
    "infino": {
      "command": "npx",
      "args": ["-y", "@infino-ai/mcp-server"],
      "env": {
        "INFINO_MCP_URI": "/Users/me/.infino/memory"
      }
    }
  }
}

VS Code

VS Code (1.102+) reads MCP servers from .vscode/mcp.json in the workspace (or your user mcp.json via the command palette → MCP: Open User Configuration). Note the top-level key is servers and each entry declares "type": "stdio":

{
  "servers": {
    "infino": {
      "type": "stdio",
      "command": "npx",
      "args": ["-y", "@infino-ai/mcp-server"],
      "env": {
        "INFINO_MCP_URI": "/Users/me/.infino/memory"
      }
    }
  }
}
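
The client configs above differ only in the top-level key and in VS Code's "type" field. As a minimal sketch of that difference, the helper below builds both shapes from one server entry. The names buildServerEntry and buildClientConfig are illustrative and not part of the package; the command, arguments, and keys are the ones shown in this README.

```typescript
// Illustrative helpers (not shipped with @infino-ai/mcp-server).
type ServerEntry = {
  type?: "stdio";
  command: string;
  args: string[];
  env: Record<string, string>;
};

type Client = "claude-desktop" | "cursor" | "vscode";

// The launch block every client shares: npx plus environment variables.
function buildServerEntry(
  uri: string,
  extraEnv: Record<string, string> = {},
): ServerEntry {
  return {
    command: "npx",
    args: ["-y", "@infino-ai/mcp-server"],
    env: { INFINO_MCP_URI: uri, ...extraEnv },
  };
}

// Wrap the entry in the top-level key each client expects.
function buildClientConfig(client: Client, uri: string): Record<string, unknown> {
  const entry = buildServerEntry(uri);
  if (client === "vscode") {
    // VS Code uses "servers" and declares the transport explicitly.
    return { servers: { infino: { type: "stdio", ...entry } } };
  }
  // Claude Desktop and Cursor share the "mcpServers" shape.
  return { mcpServers: { infino: entry } };
}

// Serialize for pasting into .vscode/mcp.json.
const vscodeJson = JSON.stringify(
  buildClientConfig("vscode", "/Users/me/.infino/memory"),
  null,
  2,
);
```

Generating the block this way keeps the package name and environment variables in one place if you maintain configs for several clients.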

Other MCP clients

Any client that speaks MCP over stdio works. Configure it to launch:

command: npx
args:    -y @infino-ai/mcp-server
env:     INFINO_MCP_URI=<path-or-bucket-uri>   (plus any options below)

Logs are written to stderr so they never corrupt the JSON-RPC stream on stdout — point your client's log capture there when debugging.


Configuration

All configuration is via environment variables — there are no config files and no command-line flags to manage.

Environment variables

| Variable | Required | Default | Description |
| --- | --- | --- | --- |
| INFINO_MCP_URI | No | memory:// (ephemeral) | Data to serve: a local path (/Users/me/.infino/memory) or a bucket URI (s3://…, gs://…, az://…). If unset, the server uses an ephemeral in-process catalog that holds no data, so set it for any real use. |
| INFINO_MCP_ENABLE_WRITES | No | off | When set (1/true/yes), exposes the write tools (infino_add_documents, infino_update_documents, infino_delete_documents) and lets infino_sql run DDL/DML. Omit for a strictly read-only server. |
| INFINO_MCP_EMBED_MODEL | No | Xenova/all-MiniLM-L6-v2 | Hugging Face feature-extraction model used for embedding. Its output dimension must match the table's vector index dimension (the default model is 384-dim). |
| INFINO_MCP_S3_ENDPOINT | No | (none) | Custom S3 endpoint for non-AWS S3-compatible stores (Cloudflare R2, MinIO, Backblaze B2, …). |
| INFINO_MCP_S3_REGION | No | auto | Region to send with a custom S3 endpoint. |
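
To make the defaults concrete, here is a rough sketch of how these variables could be resolved into a settings object. It is not the server's actual code: resolveConfig and isFlagEnabled are hypothetical names, and trimming, case-insensitive flag matching, and treating an empty string as unset are assumptions. Only the variable names, accepted flag values, and defaults come from the table.

```typescript
// Hypothetical sketch; the real server may parse these differently.
interface InfinoSettings {
  uri: string;
  enableWrites: boolean;
  embedModel: string;
  s3Endpoint?: string;
  s3Region: string;
}

const TRUTHY = new Set(["1", "true", "yes"]);

// Assumption: flags are matched case-insensitively after trimming.
function isFlagEnabled(value: string | undefined): boolean {
  if (value === undefined) return false;
  return TRUTHY.has(value.trim().toLowerCase());
}

// Assumption: an empty string is treated the same as an unset variable.
function resolveConfig(env: Record<string, string | undefined>): InfinoSettings {
  return {
    uri: env.INFINO_MCP_URI || "memory://",
    enableWrites: isFlagEnabled(env.INFINO_MCP_ENABLE_WRITES),
    embedModel: env.INFINO_MCP_EMBED_MODEL || "Xenova/all-MiniLM-L6-v2",
    s3Endpoint: env.INFINO_MCP_S3_ENDPOINT || undefined,
    s3Region: env.INFINO_MCP_S3_REGION || "auto",
  };
}

// With nothing set, the result is a read-only server on an empty catalog.
const defaults = resolveConfig({});
```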

Cloud credentials are read from the standard provider environment variables — the server does not introduce its own:

| Backend | Credentials |
| --- | --- |
| AWS S3 | AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY (+ AWS_SESSION_TOKEN if used) |
| S3-compatible (R2/MinIO/B2) | AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY and INFINO_MCP_S3_ENDPOINT |
| Google Cloud Storage | GOOGLE_APPLICATION_CREDENTIALS (or ambient ADC) |
| Azure Blob | AZURE_STORAGE_ACCOUNT, AZURE_STORAGE_KEY |

Storage backends

// Local directory
"env": { "INFINO_MCP_URI": "/Users/me/.infino/memory" }

// AWS S3 — ambient AWS_* credentials, default endpoint
"env": {
  "INFINO_MCP_URI": "s3://my-bucket/infino",
  "AWS_ACCESS_KEY_ID": "…",
  "AWS_SECRET_ACCESS_KEY": "…"
}

// S3-compatible (Cloudflare R2 / MinIO / Backblaze B2) — custom endpoint
"env": {
  "INFINO_MCP_URI": "s3://my-bucket/infino",
  "INFINO_MCP_S3_ENDPOINT": "https://<account>.r2.cloudflarestorage.com",
  "INFINO_MCP_S3_REGION": "auto",
  "AWS_ACCESS_KEY_ID": "…",
  "AWS_SECRET_ACCESS_KEY": "…"
}
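
Before handing an env block to a client, it can help to check that the variables for the chosen backend are present. The sketch below is an advisory pre-flight check, not something the server provides: detectBackend and missingCredentials are hypothetical names, and the scheme-to-backend mapping is inferred from the URI forms and credential table above. Ambient credentials (for example ADC on Google Cloud) can make a variable unnecessary, so an empty result is not proof of access and a non-empty one is only a hint.

```typescript
// Hypothetical pre-flight check based on the credential table above.
type Backend = "memory" | "local" | "s3" | "s3-compatible" | "gcs" | "azure";
type Env = Record<string, string | undefined>;

function detectBackend(uri: string, env: Env): Backend {
  if (uri.startsWith("memory://")) return "memory";
  if (uri.startsWith("s3://")) {
    // A custom endpoint signals a non-AWS S3-compatible store.
    return env.INFINO_MCP_S3_ENDPOINT ? "s3-compatible" : "s3";
  }
  if (uri.startsWith("gs://")) return "gcs";
  if (uri.startsWith("az://")) return "azure";
  return "local"; // anything else is treated as a filesystem path
}

const EXPECTED_VARS: Record<Backend, string[]> = {
  memory: [],
  local: [],
  s3: ["AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY"],
  "s3-compatible": ["AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "INFINO_MCP_S3_ENDPOINT"],
  gcs: [], // GOOGLE_APPLICATION_CREDENTIALS or ambient ADC, so nothing is strictly required
  azure: ["AZURE_STORAGE_ACCOUNT", "AZURE_STORAGE_KEY"],
};

// Returns the names of expected variables that are unset or empty.
function missingCredentials(uri: string, env: Env): string[] {
  return EXPECTED_VARS[detectBackend(uri, env)].filter((name) => !env[name]);
}

// Example: an S3 URI with only the access key id supplied.
const missing = missingCredentials("s3://my-bucket/infino", {
  AWS_ACCESS_KEY_ID: "example-id",
});
```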

Tools

| Tool | Arguments | What it does |
| --- | --- | --- |
| infino_semantic_search | table, query, k, column?, vectorColumn?, filter? | Find passages by meaning: embeds the query with a local model (no key) and ranks by vector similarity. Handles paraphrase and synonyms. Optional filter ({column, query, mode?}) restricts the ranking to rows whose keyword column matches first (a pushdown pre-filter). |
| infino_keyword_search | table, query, k, column? | BM25 full-text search, for exact terms, identifiers, error codes, product names. |
| infino_hybrid_search | table, query, k, column?, vectorColumn? | Fused keyword + semantic search in one ranking pass: BM25 over the text column combined with vector similarity, so rows matching the literal terms and the meaning rank highest. |
| infino_token_match | table, query, column?, mode?, limit? | Unranked keyword filter: the set of rows whose text column contains the token(s). Use when you need the matches, not a relevance order. |
| infino_exact_match | table, value, column?, limit? | Unranked exact-equality filter over an indexed column (tag, status, id string). |
| infino_sql | query | SQL for counts, filters, joins, aggregates. Read-only (single SELECT/WITH) by default; accepts any single statement when INFINO_MCP_ENABLE_WRITES is set. |
| infino_list_tables | (none) | List the tables in the connected catalog. |
| infino_describe_table | table | Column names and types for a table. |
| infino_add_documents | table, documents | Append rows (one call = one commit); embeds the text column for vector tables. Only when INFINO_MCP_ENABLE_WRITES is set. |
| infino_update_documents | table, predicate, documents | Replace the rows matching a SQL predicate with new documents, 1:1 (missing vectors are embedded). Durable storage only. Only when INFINO_MCP_ENABLE_WRITES is set. |
| infino_delete_documents | table, predicate | Delete the rows matching a SQL predicate. Durable storage only. Only when INFINO_MCP_ENABLE_WRITES is set. |
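
For reference, this is roughly what a client sends when an agent invokes one of these tools. The envelope follows the MCP tools/call request shape (JSON-RPC 2.0); the toolCall helper, the table name docs, the column name body, and the query text are made up for illustration, and the argument names are taken from the infino_semantic_search row above.

```typescript
// Illustrative only: MCP clients build this envelope for you.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

function toolCall(
  id: number,
  name: string,
  args: Record<string, unknown>,
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Semantic search restricted by a keyword pre-filter.
const request = toolCall(1, "infino_semantic_search", {
  table: "docs", // hypothetical table name
  query: "how do I rotate storage credentials",
  k: 5,
  filter: { column: "body", query: "S3" }, // hypothetical column; mode omitted
});

// One line of JSON, as it would travel over stdio.
const wire = JSON.stringify(request);
```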

The engine's search table functions (bm25_search, vector_search, hybrid_search, …) are not callable from infino_sql — retrieval goes through the dedicated search tools above, which embed and project for you. infino_sql is for filters, joins, and aggregates.
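
The read-only rule for infino_sql (a single SELECT or WITH statement) can be approximated with a small guard like the one below. This is a deliberately conservative sketch under stated assumptions, not the server's implementation: isReadOnlySql is a hypothetical name, it does not parse SQL, it rejects any query containing a semicolon other than one trailing terminator (even inside a string literal), and it does not inspect the body of a WITH statement.

```typescript
// Hypothetical approximation of the "single SELECT/WITH" rule.
function isReadOnlySql(sql: string): boolean {
  // Drop surrounding whitespace and one trailing semicolon.
  const statement = sql.trim().replace(/;\s*$/, "");
  if (statement.length === 0) return false;

  // Any remaining semicolon is treated as a second statement.
  if (statement.includes(";")) return false;

  // The first keyword must be SELECT or WITH (case-insensitive).
  return /^(select|with)\b/i.test(statement);
}

// Example checks an agent harness might run before calling infino_sql.
const allowed = isReadOnlySql("SELECT count(*) FROM docs;");
const rejected = isReadOnlySql("SELECT 1; DROP TABLE docs");
```

A check like this is useful as a client-side courtesy; the server's own enforcement remains the authority.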


Security & data handling

This server is designed to run locally, beside the client, and to keep data and credentials on the user's machine.

  • Local execution. It runs as a subprocess of your MCP client over stdio. There is no network listener and no remote service.
  • No data sent for embedding. Query and document embedding uses a local model — text is never sent to a third-party embedding API. There is no API key to provision or leak.
  • Credentials stay in the environment. Storage credentials are read from standard provider environment variables and used only to reach the bucket you configured. They are never logged or returned in tool output.
  • Read-only by default. Without INFINO_MCP_ENABLE_WRITES, the write tools (infino_add_documents, infino_update_documents, infino_delete_documents) are not even advertised to the agent, and infino_sql rejects anything but a single SELECT/WITH. Enable writes deliberately, and prefer scoping the server to data the agent is allowed to modify.
  • Least privilege. Point INFINO_MCP_URI at the narrowest dataset the task needs, and supply storage credentials scoped to that bucket/prefix.

How retrieval works

Semantic search embeds locally with Hugging Face transformers.js (all-MiniLM-L6-v2, 384-dim by default; override with INFINO_MCP_EMBED_MODEL). The server embeds both the documents it ingests (via infino_add_documents) and your queries with the same model, so they align in the same vector space.

If you change INFINO_MCP_EMBED_MODEL, the table's vector index must match the new model's dimension — embeddings produced by different models are not comparable, and a dimension mismatch will fail at search time.
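
To see why the dimensions have to agree, consider a plain similarity computation between a query vector and a stored vector. The sketch below uses cosine similarity as an assumption; this README does not say which metric the engine uses, and cosineSimilarity is an illustrative function rather than part of the server. The point is only that a 384-dim query cannot be compared against vectors of another length.

```typescript
// Illustrative only; the engine's actual metric is not specified here.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    // Analogous to the failure a mismatched model would cause at search time.
    throw new Error(`dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction; define its similarity as 0.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Same direction scores 1, orthogonal vectors score 0.
const same = cosineSimilarity([1, 0], [1, 0]);
const orthogonal = cosineSimilarity([1, 0], [0, 1]);
```

Even when two models happen to share a dimension, their vectors live in different spaces, so a matching length is necessary but not sufficient.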


Troubleshooting

| Symptom | Likely cause / fix |
| --- | --- |
| Client shows no Infino tools | Server didn't start. Check the client's MCP logs (stderr). Confirm npx is on PATH and INFINO_MCP_URI is set. Fully restart the client after editing config. |
| INFINO_MCP_URI is required | The env var isn't reaching the subprocess. In GUI clients, env must be inside the server's env block (the process won't inherit your shell). |
| infino_add_documents not available | INFINO_MCP_ENABLE_WRITES isn't set, or the client wasn't restarted after setting it. |
| Slow first query | One-time embedding-model download (~90 MB). Subsequent runs use the cache. |
| Dimension / vector errors on semantic search | The table's vector index doesn't match the embedding model's dimension. Re-ingest, or set INFINO_MCP_EMBED_MODEL to the model the index was built with. |
| … in SQL isn't supported from the server yet | You called a search table function inside infino_sql. Use infino_semantic_search / infino_keyword_search instead. |

Local development

The server depends on the published @infino-ai/infino Node binding, which resolves from public npm like any other dependency.

npm install
npm run build
INFINO_MCP_URI=/path/to/data node dist/index.js   # runs on stdio

Point a client at node /absolute/path/dist/index.js over stdio to dogfood a local build, or use the MCP Inspector:

npx @modelcontextprotocol/inspector node dist/index.js

License

Apache-2.0