Pipetable MCP Server
DuckDB के माध्यम से वास्तविक SQL का उपयोग करके स्थानीय CSV, Parquet, JSON और TSV फ़ाइलों को क्वेरी करें। आपके AI कोडिंग टूल को काल्पनिक उत्तरों के बजाय वास्तविक डेटा तक पहुंच प्रदान करता है।
दस्तावेज़
Pipetable
Gives your AI coding tool real data access.
Point it at a folder of CSV, Parquet, JSON, or TSV files — your AI can now query them with real SQL instead of hallucinating.
Works as an MCP server for Claude Code, Cursor, RooCode, and Copilot. Also ships as a standalone CLI for interactive data exploration. Powered by DuckDB. Files never leave your machine.
MIT licensed.
Install
# macOS / Linux
curl -fsSL https://pipetable.com/install | sh
# Windows
irm https://pipetable.com/install.ps1 | iex
# Rust
cargo install pipetable
MCP server setup
Claude Code
claude mcp add pipetable pipetable mcp
Cursor / RooCode
{
"mcpServers": {
"pipetable": {
"command": "pipetable",
"args": ["mcp"]
}
}
}
VS Code (Copilot)
{
"servers": {
"pipetable": {
"type": "stdio",
"command": "pipetable",
"args": ["mcp"]
}
}
}
Once configured, your AI can:
scan_folder— register all data files in a folderlist_datasets— see schemas and column typesget_schema— inspect a specific table with sample rowsexecute_sql— run real DuckDB SQL against your files
Results are ground truth from DuckDB, not generated.
CLI
pipetable ~/data/
SQL and natural language at the > prompt. SQL always works. Natural language requires Ollama running locally.
> SELECT region, SUM(revenue) AS total FROM sales GROUP BY 1 ORDER BY 2 DESC
4 row(s)
region total
─────────────
EU 141000
US 32000
APAC 17000
> show me top 5 customers by revenue
Using: customers, sales
Thinking.....
SELECT c.name, SUM(s.revenue) AS total FROM customers c
JOIN sales s ON s.customer_id = c.id
GROUP BY c.name ORDER BY total DESC LIMIT 5
...
→ piped as _last
Piping results
Every query saves its result as _last — a live DuckDB view you can query further:
> SELECT * FROM sales WHERE region = 'EU'
...
→ piped as _last
> show me top 3 from _last
Using: _last
Thinking.....
Dot commands
| Command | Description |
|---|---|
.scan <path> | Load a folder or file (Tab completes) |
.datasets | List loaded datasets |
.schema <name> | Columns + sample rows |
.drop <name> | Remove a dataset from the session |
.use <n1> <n2> | Focus NL queries on specific datasets |
.remove <name> | Remove from focus |
.clear | Reset focus to all datasets |
.model <name> | Switch Ollama model |
.help | Show help |
Tab completes dataset names after FROM, JOIN, .schema, .drop, .use.
One-shot query
pipetable ask "who has the highest revenue?" ~/data/
pipetable ask "SELECT * FROM sales LIMIT 5" ~/data/
Natural language (optional)
Set any one of these — pipetable auto-detects:
# Claude (best quality)
export ANTHROPIC_API_KEY=sk-ant-...
# OpenAI or any compatible API (LM Studio, Groq, Together, etc.)
export OPENAI_API_KEY=sk-...
export OPENAI_BASE_URL=http://localhost:1234 # optional, for local endpoints
# Ollama (local, no key needed)
ollama pull qwen2.5-coder:1.5b
ollama serve
Priority: Anthropic → OpenAI-compatible → Ollama. SQL and MCP work without any of them.
Supported formats
CSV, Parquet, JSON, NDJSON, TSV, Excel (xlsx, xls, xlsm). Files up to 2GB. Folders scanned up to 3 levels deep. Hidden files and common noise directories (node_modules, target, .git) are skipped automatically.