A FastMCP server for securely accessing BigQuery datasets with intelligent caching, schema evolution tracking, and query analytics via Supabase integration.
Using uv (recommended):
# Clone the repository
git clone <repository-url>
cd mcp-bigquery-server
# Create virtual environment and install dependencies
uv venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
uv pip install -e ".[dev]"
cp .env.example .env
Edit .env with your BigQuery and Supabase details:
# BigQuery Configuration
PROJECT_ID=your-project-id
LOCATION=US
KEY_FILE=/path/to/your/service-account-key.json # Optional
DEFAULT_USER_ID=your-default-user-id # Optional
# Supabase Configuration (for enhanced features)
SUPABASE_URL=https://your-project.supabase.co
SUPABASE_SERVICE_KEY=your-service-role-key # Recommended for full access
SUPABASE_ANON_KEY=your-anon-key # Alternative with RLS
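For reference, here is a minimal sketch of how these variables might be loaded at startup; it assumes the python-dotenv package and is not the server's actual config module (see src/mcp_bigquery/config for that):
# Hypothetical sketch of reading the .env settings shown above.
import os
from dotenv import load_dotenv  # assumption: python-dotenv is available

load_dotenv()  # picks up .env from the working directory

PROJECT_ID = os.getenv("PROJECT_ID")                      # required
LOCATION = os.getenv("LOCATION", "US")                    # BigQuery location
KEY_FILE = os.getenv("KEY_FILE")                          # optional service-account key path
SUPABASE_URL = os.getenv("SUPABASE_URL")                  # optional: enables caching/analytics
SUPABASE_SERVICE_KEY = os.getenv("SUPABASE_SERVICE_KEY")  # or SUPABASE_ANON_KEY with RLS

if not PROJECT_ID:
    raise RuntimeError("PROJECT_ID must be set")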
For enhanced caching and analytics features, you'll need a Supabase project with the following tables and policies.
query_cache - Stores cached query results
table_dependencies - Tracks table dependencies for cache invalidation
query_history - Historical query execution patterns
query_templates - Reusable query templates
column_documentation - Business context for table columns
event_log - System event tracking
Run these SQL queries in your Supabase SQL editor to set up the schema:
-- Enable UUID extension (optional on Postgres 13+, where gen_random_uuid() is built in)
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
-- 1. Query Result Caching Tables
CREATE TABLE query_cache (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
query_hash TEXT UNIQUE NOT NULL,
sql_query TEXT NOT NULL,
result_data JSONB NOT NULL,
metadata JSONB NOT NULL, -- bytes processed, execution time, etc.
created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
expires_at TIMESTAMP WITH TIME ZONE NOT NULL,
hit_count INTEGER DEFAULT 0
);
-- Indexes for query_cache
CREATE INDEX idx_query_cache_hash ON query_cache(query_hash);
CREATE INDEX idx_query_cache_expires ON query_cache(expires_at);
CREATE INDEX idx_query_cache_created ON query_cache(created_at);
CREATE TABLE table_dependencies (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
query_cache_id UUID REFERENCES query_cache(id) ON DELETE CASCADE,
project_id TEXT NOT NULL,
dataset_id TEXT NOT NULL,
table_id TEXT NOT NULL,
created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);
-- Indexes for table_dependencies
CREATE INDEX idx_table_deps_lookup ON table_dependencies(project_id, dataset_id, table_id);
CREATE INDEX idx_table_deps_cache ON table_dependencies(query_cache_id);
-- 2. Schema Evolution & Knowledge Base Tables
CREATE TABLE schema_snapshots (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
project_id TEXT NOT NULL,
dataset_id TEXT NOT NULL,
table_id TEXT NOT NULL,
schema_version INTEGER NOT NULL DEFAULT 1,
schema_data JSONB NOT NULL,
row_count BIGINT,
size_bytes BIGINT,
created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);
-- Indexes for schema_snapshots
CREATE UNIQUE INDEX idx_schema_version ON schema_snapshots(project_id, dataset_id, table_id, schema_version);
CREATE INDEX idx_schema_table ON schema_snapshots(project_id, dataset_id, table_id);
CREATE TABLE column_documentation (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
project_id TEXT NOT NULL,
dataset_id TEXT NOT NULL,
table_id TEXT NOT NULL,
column_name TEXT NOT NULL,
description TEXT,
business_rules TEXT[],
sample_values JSONB,
data_quality_notes TEXT,
updated_by TEXT,
updated_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);
-- Indexes for column_documentation
CREATE UNIQUE INDEX idx_column_docs_unique ON column_documentation(project_id, dataset_id, table_id, column_name);
CREATE INDEX idx_column_docs_table ON column_documentation(project_id, dataset_id, table_id);
-- 3. Query Analytics & Pattern Recognition Tables
CREATE TABLE query_history (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
user_id TEXT,
sql_query TEXT NOT NULL,
execution_time_ms INTEGER,
bytes_processed BIGINT,
success BOOLEAN NOT NULL,
error_message TEXT,
tables_accessed TEXT[],
created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);
-- Indexes for query_history
CREATE INDEX idx_query_history_user ON query_history(user_id);
CREATE INDEX idx_query_history_success ON query_history(success);
CREATE INDEX idx_query_history_created ON query_history(created_at);
CREATE INDEX idx_query_history_tables ON query_history USING GIN(tables_accessed);
CREATE TABLE query_templates (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
name TEXT NOT NULL,
description TEXT,
template_sql TEXT NOT NULL,
parameters JSONB NOT NULL DEFAULT '{}',
usage_count INTEGER DEFAULT 0,
tags TEXT[],
created_by TEXT,
created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
updated_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);
-- Indexes for query_templates
CREATE INDEX idx_query_templates_usage ON query_templates(usage_count DESC);
CREATE INDEX idx_query_templates_tags ON query_templates USING GIN(tags);
-- 4. Real-time Event Tracking
CREATE TABLE event_log (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
event_type TEXT NOT NULL,
event_data JSONB NOT NULL,
user_id TEXT,
session_id TEXT,
created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);
-- Indexes for event_log
CREATE INDEX idx_event_log_type ON event_log(event_type);
CREATE INDEX idx_event_log_user ON event_log(user_id);
CREATE INDEX idx_event_log_session ON event_log(session_id);
CREATE INDEX idx_event_log_created ON event_log(created_at);
-- 5. User Preferences & Settings
CREATE TABLE user_preferences (
id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
user_id TEXT UNIQUE NOT NULL,
preferences JSONB NOT NULL DEFAULT '{}',
query_defaults JSONB NOT NULL DEFAULT '{}',
favorite_queries UUID[] DEFAULT '{}',
created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),
updated_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);
-- Row Level Security (RLS) Policies for BigQuery Cache Tables
-- First, ensure RLS is enabled on the tables
ALTER TABLE query_cache ENABLE ROW LEVEL SECURITY;
ALTER TABLE query_history ENABLE ROW LEVEL SECURITY;
ALTER TABLE table_dependencies ENABLE ROW LEVEL SECURITY;
ALTER TABLE schema_snapshots ENABLE ROW LEVEL SECURITY;
ALTER TABLE column_documentation ENABLE ROW LEVEL SECURITY;
ALTER TABLE query_templates ENABLE ROW LEVEL SECURITY;
ALTER TABLE event_log ENABLE ROW LEVEL SECURITY;
-- Option 1: Allow all operations (least secure, but simplest for development)
-- Use this during development/testing phases
-- Allow all operations on query_cache
CREATE POLICY "Allow all operations on query_cache" ON query_cache
FOR ALL USING (true) WITH CHECK (true);
-- Allow all operations on query_history
CREATE POLICY "Allow all operations on query_history" ON query_history
FOR ALL USING (true) WITH CHECK (true);
-- Allow all operations on table_dependencies
CREATE POLICY "Allow all operations on table_dependencies" ON table_dependencies
FOR ALL USING (true) WITH CHECK (true);
-- Allow all operations on schema_snapshots
CREATE POLICY "Allow all operations on schema_snapshots" ON schema_snapshots
FOR ALL USING (true) WITH CHECK (true);
-- Allow all operations on column_documentation
CREATE POLICY "Allow all operations on column_documentation" ON column_documentation
FOR ALL USING (true) WITH CHECK (true);
-- Allow all operations on query_templates
CREATE POLICY "Allow all operations on query_templates" ON query_templates
FOR ALL USING (true) WITH CHECK (true);
-- Allow all operations on event_log
CREATE POLICY "Allow all operations on event_log" ON event_log
FOR ALL USING (true) WITH CHECK (true);
-- Option 2: User-based policies (more secure)
-- Uncomment and use these instead if you have user authentication
/*
-- Allow users to manage their own cache entries
-- (note: query_cache as defined above has no user_id column; add one before enabling this policy)
CREATE POLICY "Users can manage own cache entries" ON query_cache
FOR ALL USING (auth.uid()::text = user_id OR user_id IS NULL);
-- Allow users to manage their own query history
CREATE POLICY "Users can manage own query history" ON query_history
FOR ALL USING (auth.uid()::text = user_id OR user_id IS NULL);
-- Allow all operations on system tables (no user-specific data)
CREATE POLICY "Allow system operations" ON table_dependencies
FOR ALL USING (true) WITH CHECK (true);
CREATE POLICY "Allow system operations" ON schema_snapshots
FOR ALL USING (true) WITH CHECK (true);
CREATE POLICY "Allow system operations" ON column_documentation
FOR ALL USING (true) WITH CHECK (true);
CREATE POLICY "Allow system operations" ON query_templates
FOR ALL USING (true) WITH CHECK (true);
-- Allow users to create their own event log entries
CREATE POLICY "Users can create event logs" ON event_log
FOR INSERT WITH CHECK (auth.uid()::text = user_id OR user_id IS NULL);
CREATE POLICY "Users can read event logs" ON event_log
FOR SELECT USING (auth.uid()::text = user_id OR user_id IS NULL);
*/
-- Option 3: Service role policies (for backend services)
-- If your application uses a service role key, you might want to create
-- policies that allow the service role to perform all operations
/*
-- Create a function to check if the current role is the service role
CREATE OR REPLACE FUNCTION is_service_role()
RETURNS BOOLEAN AS $$
BEGIN
RETURN current_setting('role') = 'service_role';
EXCEPTION
WHEN others THEN
RETURN false;
END;
$$ LANGUAGE plpgsql SECURITY DEFINER;
-- Service role policies
CREATE POLICY "Service role can manage query_cache" ON query_cache
FOR ALL USING (is_service_role()) WITH CHECK (is_service_role());
CREATE POLICY "Service role can manage query_history" ON query_history
FOR ALL USING (is_service_role()) WITH CHECK (is_service_role());
*/
-- Triggers for automatic timestamp updates
CREATE OR REPLACE FUNCTION update_updated_at_column()
RETURNS TRIGGER AS $$
BEGIN
NEW.updated_at = NOW();
RETURN NEW;
END;
$$ LANGUAGE plpgsql;
CREATE TRIGGER update_column_documentation_updated_at
BEFORE UPDATE ON column_documentation
FOR EACH ROW EXECUTE FUNCTION update_updated_at_column();
CREATE TRIGGER update_query_templates_updated_at
BEFORE UPDATE ON query_templates
FOR EACH ROW EXECUTE FUNCTION update_updated_at_column();
CREATE TRIGGER update_user_preferences_updated_at
BEFORE UPDATE ON user_preferences
FOR EACH ROW EXECUTE FUNCTION update_updated_at_column();
-- Cache Cleanup Function
CREATE OR REPLACE FUNCTION cleanup_expired_cache()
RETURNS INTEGER AS $$
DECLARE
deleted_count INTEGER;
BEGIN
DELETE FROM query_cache WHERE expires_at < NOW();
GET DIAGNOSTICS deleted_count = ROW_COUNT;
RETURN deleted_count;
END;
$$ LANGUAGE plpgsql;
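The cleanup_expired_cache() function can be run on a schedule (for example with pg_cron) or invoked on demand. As a hedged illustration, the sketch below calls it from Python through the Supabase RPC interface using supabase-py; the function name matches the SQL above, everything else is assumed:
# Illustrative only: invoke cleanup_expired_cache() via supabase-py RPC.
import os
from supabase import create_client

supabase = create_client(os.environ["SUPABASE_URL"], os.environ["SUPABASE_SERVICE_KEY"])

# Deletes rows from query_cache where expires_at < NOW() and returns the count.
result = supabase.rpc("cleanup_expired_cache", {}).execute()
print(f"Expired cache entries removed: {result.data}")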
These tables enable intelligent query caching, schema evolution tracking, query analytics, and event logging.
Row Level Security (RLS) is enabled for all tables. You can choose from the RLS policy options above (allow-all for development, user-based, or service-role) and customize them as needed for your environment.
The server will work without Supabase, but with limited functionality.
# HTTP mode (default)
mcp-bigquery --transport http --host 0.0.0.0 --port 8000
# Stdio mode (for MCP clients)
mcp-bigquery --transport stdio
# SSE mode
mcp-bigquery --transport sse --host 0.0.0.0 --port 8000
from mcp_bigquery.main import main
import sys
# Set command line arguments
sys.argv = ['mcp-bigquery', '--transport', 'http', '--port', '8000']
main()
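Once the server is up in HTTP mode, a quick smoke test against the /health endpoint (documented below) confirms it is reachable; this sketch assumes the defaults of localhost and port 8000:
# Simple availability check against a locally running server.
import requests

resp = requests.get("http://localhost:8000/health", timeout=10)
resp.raise_for_status()
print(resp.json())  # expected: health/status information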
To use this MCP BigQuery server with Claude Desktop, you need to configure it in your Claude Desktop configuration file.
First, ensure the server is installed and configured:
# Clone and install the server
git clone <repository-url>
cd mcp-bigquery-server
uv venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
uv pip install -e ".[dev]"
# Set up environment variables
cp .env.example .env
# Edit .env with your BigQuery and Supabase project details
Add the server to your Claude Desktop configuration file:
Configuration file locations:
macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json
For macOS/Linux:
{
"mcpServers": {
"mcp-bigquery": {
"command": "/path/to/your/project/.venv/bin/mcp-bigquery",
"args": ["--transport", "stdio"],
"env": {
"PROJECT_ID": "your-project-id",
"LOCATION": "US",
"KEY_FILE": "/path/to/your/service-account-key.json",
"SUPABASE_URL": "https://your-project.supabase.co",
"SUPABASE_SERVICE_KEY": "your-service-role-key",
"DEFAULT_USER_ID": "your-user-id"
}
}
}
}
For Windows:
{
"mcpServers": {
"mcp-bigquery": {
"command": "C:\\path\\to\\your\\project\\.venv\\Scripts\\mcp-bigquery.exe",
"args": ["--transport", "stdio"],
"env": {
"PROJECT_ID": "your-project-id",
"LOCATION": "US",
"KEY_FILE": "C:\\path\\to\\your\\service-account-key.json",
"SUPABASE_URL": "https://your-project.supabase.co",
"SUPABASE_SERVICE_KEY": "your-service-role-key",
"DEFAULT_USER_ID": "your-user-id"
}
}
}
}
BigQuery Authentication - Choose one of two authentication methods:
Option A: Service Account Key File - create a service account key in Google Cloud and set the KEY_FILE environment variable to the path of this file.
Option B: Default Credentials - run gcloud auth application-default login and remove KEY_FILE from the environment variables.
Supabase Authentication (Optional but recommended): set SUPABASE_URL together with SUPABASE_SERVICE_KEY (or SUPABASE_ANON_KEY) to enable the caching and analytics features.
After saving the configuration file, restart Claude Desktop completely for the changes to take effect.
Once configured, you can interact with your BigQuery data through Claude Desktop with enhanced capabilities:
Basic Operations: list datasets and tables, inspect table schemas, and run read-only SQL queries against your BigQuery project.
Enhanced Features (with Supabase): cached query results, AI-powered query suggestions, table documentation with business context, query performance analysis, and schema change tracking.
GET /resources/list - List all available datasets and tables
GET /bigquery/{project_id}/{dataset_id}/{table_id} - Get table metadata
POST /tools/execute_bigquery_sql - Execute read-only SQL queries with caching
POST /tools/get_datasets - Get list of datasets with metadata
POST /tools/get_tables - Get tables in a dataset with documentation
POST /tools/get_table_schema - Get table schema with business context
POST /tools/get_query_suggestions - Get AI-powered query recommendations
POST /tools/explain_table - Get comprehensive table documentation
POST /tools/analyze_query_performance - Analyze query performance patterns
POST /tools/get_schema_changes - Track schema evolution over time
POST /tools/manage_cache - Cache management operations
POST /tools/health_check - System health check
GET /events/system - System status events
GET /events/queries - Query execution events
GET /events/resources - Resource update events
GET /health - Health check endpoint
resources://list - List all BigQuery resources
bigquery://{project}/{dataset}/{table} - Access specific table metadata
execute_bigquery_sql - Execute a read-only SQL query with intelligent caching
Parameters: sql, maximum_bytes_billed, use_cache, user_id, force_refresh
get_datasets - Get list of datasets with metadata
get_tables - Get tables in a dataset with column documentation
get_table_schema - Get comprehensive table schema details
Parameters: dataset_id, table_id, include_samples, include_documentation
get_query_suggestions - Get AI-powered query recommendations
Parameters: tables_mentioned, query_context, limit, user_id
explain_table - Get comprehensive table documentation and business context
Parameters: project_id, dataset_id, table_id, include_usage_stats, user_id
analyze_query_performance - Analyze historical query performance patterns
Parameters: sql, tables_accessed, time_range_hours, user_id, include_recommendations
get_schema_changes - Track schema evolution and changes over time
Parameters: project_id, dataset_id, table_id, limit, include_impact_analysis, user_id
manage_cache - Comprehensive cache management operations
Parameters: action, target, project_id, dataset_id, table_id, user_id
health_check - System health check including BigQuery, Supabase, and cache status
Parameters: user_id
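To illustrate how these tools can be called over the HTTP transport, the sketch below POSTs the parameters listed above to /tools/execute_bigquery_sql. The request and response JSON shapes shown here are assumptions; check the route definitions in src/mcp_bigquery/routes for the authoritative contract:
# Hedged example: run a read-only query through the REST tool endpoint.
import requests

payload = {
    "sql": "SELECT 1 AS probe",          # any read-only SQL statement
    "maximum_bytes_billed": 100_000_000, # cap query cost
    "use_cache": True,                   # serve from the Supabase cache when possible
}
resp = requests.post(
    "http://localhost:8000/tools/execute_bigquery_sql",
    json=payload,
    timeout=60,
)
resp.raise_for_status()
print(resp.json())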
The server includes a sophisticated caching system powered by Supabase:
# Cache a query result (automatic)
result = await execute_bigquery_sql(sql="SELECT * FROM dataset.table", use_cache=True)
# Force cache refresh
result = await execute_bigquery_sql(sql="SELECT * FROM dataset.table", force_refresh=True)
# Get cache statistics
stats = await manage_cache(action="stats")
# Clean up expired entries
cleanup = await manage_cache(action="cleanup")
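Under the hood, cached results are stored in the query_cache table keyed by query_hash. The exact hashing and lookup logic is internal to the server; purely as a sketch, a lookup could look like the following with supabase-py, assuming the hash is a SHA-256 of the normalized SQL:
# Illustrative cache lookup; column names match the query_cache schema above.
import hashlib
import os
from datetime import datetime, timezone
from supabase import create_client

supabase = create_client(os.environ["SUPABASE_URL"], os.environ["SUPABASE_SERVICE_KEY"])

sql = "SELECT * FROM dataset.table"
query_hash = hashlib.sha256(sql.strip().lower().encode()).hexdigest()  # assumed normalization

rows = (
    supabase.table("query_cache")
    .select("result_data, metadata, hit_count")
    .eq("query_hash", query_hash)
    .gt("expires_at", datetime.now(timezone.utc).isoformat())
    .execute()
)
print(rows.data[0]["result_data"] if rows.data else "cache miss")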
# Install with development dependencies
uv pip install -e ".[dev]"
# Run tests
pytest
# Run with coverage
pytest --cov=src/mcp_bigquery --cov-report=html
# Format code
black src/ tests/
isort src/ tests/
# Type checking
mypy src/
mcp-bigquery-server/
├── src/mcp_bigquery/ # Main package
│ ├── config/ # Configuration management
│ ├── core/ # Core utilities (BigQuery client, Supabase client, JSON encoder)
│ ├── events/ # Event management system
│ ├── handlers/ # Business logic handlers
│ │ ├── resources.py # Resource handlers
│ │ └── tools.py # Tool handlers (query execution, analytics)
│ ├── api/ # FastAPI and FastMCP applications
│ ├── routes/ # FastAPI route definitions
│ └── main.py # Entry point
├── tests/ # Test suite
├── pyproject.toml # Project configuration
└── README.md # This file
The server supports two authentication methods: a service account key file referenced by the KEY_FILE environment variable, or application default credentials when KEY_FILE is unset.
Query costs can be capped per request with the maximum_bytes_billed parameter.
The server provides real-time events via Server-Sent Events (SSE) on the /events/system, /events/queries, and /events/resources endpoints.
The server provides comprehensive monitoring capabilities, including the health_check tool and the GET /health endpoint.
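A minimal sketch of consuming one of the SSE streams (here /events/queries) with the requests library, assuming the server frames events as standard data: lines carrying JSON payloads:
# Stream query execution events from the SSE endpoint (illustrative sketch).
import json
import requests

with requests.get("http://localhost:8000/events/queries", stream=True, timeout=None) as resp:
    resp.raise_for_status()
    for line in resp.iter_lines(decode_unicode=True):
        if line and line.startswith("data:"):
            event = json.loads(line[len("data:"):].strip())
            print(event)  # e.g. query started/completed notifications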
[Add your license information here]