tablebridge MCP Server

Query a folder of CSV / Parquet / JSON files with SQL via DuckDB — read-only and sandboxed; scattered files become one queryable source.

tablebridge

Turn a folder of CSV / Parquet / JSON files into one SQL-queryable source for your AI agent.

Small businesses don't have a data warehouse; they have a folder full of exports: customers.csv, last month's orders.csv, a regions.json someone emailed over. tablebridge is an MCP server that points DuckDB at that folder, exposes each supported file as a SQL table, and lets your agent run read-only SQL, including JOINs across files, to answer questions over all of them at once. Scattered spreadsheets become one queryable source of truth.

It's read-only and sandboxed: files are loaded into an in-memory database, the data directory is the only thing it can see, and queries are validated so an agent can't write, escape to other paths, or call raw file functions.

Why you'd want this

🔗 One source over many files. JOIN orders.csv to customers.csv to regions.json in a single query — no ETL, no database to stand up.
🦆 DuckDB-powered. Fast analytical SQL over CSV, TSV, Parquet, JSON/NDJSON.
🔒 Safe by design. Files are materialized into memory; queries are validated read-only; raw file-access functions and out-of-sandbox paths are rejected.
🤖 Agent-friendly. list_sources → describe → query is a natural flow the agent can follow on its own.
🪶 Two dependencies (mcp, duckdb), fully typed and tested.

Install

uvx tablebridge          # run directly
# or
pip install tablebridge  # then run: tablebridge

Claude Code

claude mcp add --env TABLEBRIDGE_DATA_DIR=/path/to/your/data tablebridge -- uvx tablebridge

Claude Desktop / Cursor

{
  "mcpServers": {
    "tablebridge": {
      "command": "uvx",
      "args": ["tablebridge"],
      "env": { "TABLEBRIDGE_DATA_DIR": "/path/to/your/data" }
    }
  }
}

Run with Docker

A Dockerfile is included. The server speaks MCP over stdio. Mount the folder you want to query at /data (read-only is fine) and run interactively (-i):

docker build -t tablebridge .
docker run --rm -i -v /path/to/your/data:/data:ro tablebridge

Tools

| Tool | Description |
| --- | --- |
| `list_sources` | List the tables (one per data file) with column counts; start here |
| `describe` | A table's columns and types |
| `preview` | First N rows of a table |
| `query` | Run read-only SQL (DuckDB dialect) across the tables, JOINs included |
| `refresh` | Re-scan the data directory for added/changed files |
| `server_info` | Effective config (data dir, row cap, supported formats) |
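
To make the list_sources → describe → query flow concrete, here is a minimal sketch of the JSON-RPC messages a client might send for it. MCP tool invocations use the `tools/call` method; the argument names `table` and `sql` are assumptions made for illustration (the tool schemas reported by the server are authoritative), and the `initialize` handshake that precedes any tool call is left out.

```python
from __future__ import annotations

import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids: 1, 2, 3, ...


def tool_call(name: str, arguments: dict | None = None) -> str:
    """Serialize one MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }
    return json.dumps(message)


# Hypothetical argument names ("table", "sql"); check the server's tool schemas.
flow = [
    tool_call("list_sources"),
    tool_call("describe", {"table": "orders"}),
    tool_call("query", {"sql": "SELECT COUNT(*) AS n FROM orders"}),
]

# Over stdio, each message is written as a single line of JSON.
for line in flow:
    print(line)
```

In practice an MCP client library builds these messages for you; the sketch only shows the order in which an agent would typically call the tools.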

Example

With a folder containing customers.csv, orders.csv, and regions.json:

You: Who are my top 3 customers by total spend, and what region are they in?

Agent: (calls list_sources, then query)
SELECT c.name, r.region, SUM(o.total) AS spend
FROM customers c
JOIN orders o   ON o.customer_id = c.id
JOIN regions r  ON r.customer_id = c.id
GROUP BY c.name, r.region
ORDER BY spend DESC
LIMIT 3;

Configuration

| Variable | Default | Description |
| --- | --- | --- |
| `TABLEBRIDGE_DATA_DIR` | `.` | Directory of files to expose (the sandbox boundary) |
| `TABLEBRIDGE_MAX_ROWS` | `1000` | Max rows returned per query/preview |
| `TABLEBRIDGE_RECURSIVE` | `1` | Scan subdirectories too |

Supported formats: .csv, .tsv, .parquet, .json, .ndjson.
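
As a rough illustration of how these settings could drive file discovery, the sketch below reads the three variables with their documented defaults and maps each supported file to a table name. It is not tablebridge's actual code: `Settings`, `load_settings`, and `discover` are hypothetical names, using the file stem as the table name is inferred from the example query above, and treating `0` as "recursion off" is an assumption.

```python
from __future__ import annotations

import os
from dataclasses import dataclass
from pathlib import Path

SUPPORTED = {".csv", ".tsv", ".parquet", ".json", ".ndjson"}


@dataclass(frozen=True)
class Settings:
    data_dir: Path
    max_rows: int
    recursive: bool


def load_settings(env: dict[str, str] | None = None) -> Settings:
    """Read the three variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    return Settings(
        data_dir=Path(env.get("TABLEBRIDGE_DATA_DIR", ".")).resolve(),
        max_rows=int(env.get("TABLEBRIDGE_MAX_ROWS", "1000")),
        # Assumption: "0" turns recursion off; the real parsing may differ.
        recursive=env.get("TABLEBRIDGE_RECURSIVE", "1") != "0",
    )


def discover(settings: Settings) -> dict[str, Path]:
    """Map table names (assumed to be the file stem) to supported files."""
    pattern = "**/*" if settings.recursive else "*"
    tables: dict[str, Path] = {}
    for path in sorted(settings.data_dir.glob(pattern)):
        real = path.resolve()
        # Skip directories and anything that resolves outside the data
        # directory (for example, a symlink pointing elsewhere).
        if not real.is_file() or not real.is_relative_to(settings.data_dir):
            continue
        if real.suffix.lower() in SUPPORTED:
            # On a name collision the first file (in sorted order) wins here.
            tables.setdefault(path.stem, real)
    return tables
```

Calling `discover(load_settings())` in a folder holding customers.csv and orders.csv would yield the table names `customers` and `orders` under these assumptions.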

Security model

Sandboxed to TABLEBRIDGE_DATA_DIR — only files under it are loaded.
Materialized into an in-memory DuckDB, then external filesystem access is disabled — queries can't reach other paths.
Validated SQL — a single read-only statement only; writes and raw file-reader functions are rejected.
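
The validation rule in the last point can be sketched as a small pre-flight check. This is a simplified, conservative heuristic and not tablebridge's actual validator: `check_query` and both denylists are hypothetical, and a keyword scan like this can reject harmless queries (for example, a column named `load`). It is best read as a first filter in front of the stronger guarantee that external filesystem access is disabled.

```python
from __future__ import annotations

import re

# Assumption: illustrative denylists, not tablebridge's actual rules.
WRITE_KEYWORDS = {
    "insert", "update", "delete", "create", "drop", "alter", "copy",
    "attach", "detach", "install", "load", "export", "pragma", "set", "call",
}
FILE_FUNCTIONS = {
    "read_csv", "read_csv_auto", "read_parquet", "parquet_scan",
    "read_json", "read_json_auto", "read_ndjson", "read_text", "read_blob", "glob",
}

# String literals (with '' escapes), line comments, and block comments.
_LITERAL_OR_COMMENT = re.compile(r"'(?:[^']|'')*'|--[^\n]*|/\*.*?\*/", re.S)


def check_query(sql: str) -> str | None:
    """Return None if the query looks read-only, else a rejection reason."""
    # Blank out literals and comments so their contents are never scanned.
    bare = _LITERAL_OR_COMMENT.sub(
        lambda m: "''" if m.group().startswith("'") else " ", sql
    )
    bare = bare.strip().rstrip(";").strip().lower()

    if ";" in bare:
        return "only a single statement is allowed"

    words = re.findall(r"[a-z_][a-z0-9_]*", bare)
    if not words or words[0] not in {"select", "with"}:
        return "only SELECT queries are allowed"

    blocked = WRITE_KEYWORDS.intersection(words)
    if blocked:
        return f"write or admin keyword not allowed: {sorted(blocked)[0]}"

    readers = FILE_FUNCTIONS.intersection(words)
    if readers:
        return f"raw file function not allowed: {sorted(readers)[0]}"

    # DuckDB can scan a path written as a string literal in FROM/JOIN.
    if re.search(r"\b(?:from|join)\s*''", bare):
        return "querying a file path directly is not allowed"

    return None
```

A caller would run the query only when `check_query(sql)` returns `None` and otherwise return the reason to the agent, so it can rephrase the request as a plain SELECT over the loaded tables.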

Development

git clone https://github.com/Michael-WhiteCapData/tablebridge-mcp
cd tablebridge-mcp
uv pip install -e ".[dev]"
ruff check .
pytest          # uses real DuckDB over temp files

See CONTRIBUTING.md.