Provides natural language access to relational databases using advanced language models, supporting multiple database types.
Synechron's Text2SQL MCP Server is a Model Context Protocol (MCP) server. It provides MCP clients like Cursor, VSCode, and Claude Desktop, with natural language access to relational databases of various types – enabling users to ask questions of their data using natural language capabilities.
The server optimizes the quality of the results it generates by using a combination of schema and row-aware semantic search. Schemas and a sample of rows are indexed, enabling the server to more effectively generate suitable queries.
The Text2SQL MCP server provides the following capabilities:
The server ships with two example datasets.
A sample REPL is also included.
Note: To use your own databases, further configuration is required, see the Configuration section below.
docker run -it -p8000:8000 \
-v ~/.aws:/home/appuser/.aws:ro \
-e MODEL_API_TYPE=Bedrock \
-e AWS_PROFILE=AWSAdministratorAccess-880502554482 \
-e FASTMCP_HOST=0.0.0.0 \
709825985650.dkr.ecr.us-east-1.amazonaws.com/synechron/mcp-text2sql:0.0.10-2025.07.04-rc3 server
docker run -it \
-v ~/.aws:/home/appuser/.aws:ro \
-e AWS_PROFILE=AWSAdministratorAccess-880502554482 \
-e BEDROCK_MODEL_ID=us.anthropic.claude-3-7-sonnet-20250219-v1:0 \
-e BEDROCK_API_VERSION=2025-01-01-preview \
-e MCP_SERVER_HOST=host.docker.internal \
709825985650.dkr.ecr.us-east-1.amazonaws.com/synechron/mcp-text2sql:0.0.10-2025.07.04-rc3 client
Connected to server with tools: ['Text2Sql_Cyber_Threats', 'Text2Sql_ESG_Ratings']
MCP Client Started!
Type your queries or 'quit' to exit.
Query: What kind of data is in the Cyber Threat datasource?
The Cyber Threat datasource contains comprehensive information about cybersecurity incidents with the following data fields:
1. country - The country where the cyber attack took place
2. year - The year when the cyber attack occurred
3. attack_type - The type of cyber attack (e.g., Phishing, Ransomware, DDoS, Man-in-the-Middle, SQL Injection)
4. target_industry - The industry that was targeted (e.g., Education, Retail, IT, Telecommunications, Healthcare, Government, Banking)
5. financial_loss_(in_million_$) - The financial impact of the attack in millions of dollars
6. number_of_affected_users - How many users were affected by the attack
7. attack_source - Where the attack originated from (e.g., Hacker Group, Nation-state, Insider, Unknown)
8. security_vulnerability_type - The type of security vulnerability exploited (e.g., Unpatched Software, Weak Passwords, Social Engineering)
9. defense_mechanism_used - What defense was in place (e.g., VPN, Firewall, Antivirus, AI-based Detection)
10. incident_resolution_time_(in_hours) - How long it took to resolve the incident in hours
The database tracks cybersecurity incidents across different countries, industries, and years, including details about attack methods, financial impact, affected users, vulnerabilities exploited, and resolution times.
The server should be run with Docker. Use either of the following configurations to add it to a MCP client:
{
"servers": {
"text2sql": {
"type": "streamable-http",
"url": "http://localhost:8000/mcp"
}
}
}
{
"mcpServers": {
"text2sql": {
"command": "docker",
"args": [
"run",
"-i",
"--rm",
"--env-file",
".env",
"709825985650.dkr.ecr.us-east-1.amazonaws.com/synechron/mcp-text2sql:0.0.10-2025.07.04-rc3",
"server"
]
}
}
}
The Text2SQL MCP server can be configured with a combination of environment variables and JSON configuration.
Valves Config | Environment Variable | Default | Description |
---|---|---|---|
ENABLED | False | Enable tool | |
ENABLE_RAG | True | Use AI to synthesize a response using data collected. | |
ENABLE_MARKDOWN | False | Format response in Markdown | |
ENABLE_PANDAS | True | Use pandas to execute and parse SQL query, otherwise use llama_index | |
ENABLE_TABLE_INDEXING | True | Use table retriever to create embeddings for database schema and sample data to optimize context for LLM | |
FORCE_REINDEXING | False | Force tool to regenerate vector indexes for tables and sample data | |
DATABASE_CONFIG | DB_CONFIG | Default db config | JSON formatted tool configuration with database connection string, tables, context, and prompts |
IGNORE_SCHEMA | Default system tables | Comma separated list of schemas to ignore | |
CREDENTIAL_SERVICE | CREDENTIAL_SERVICE | Select authentication type (Azure, AWS, Google, None) | |
MODEL_API_TYPE / MODEL_TYPE | MODEL_TYPE | OpenAI | Select model API type (OpenAI, Azure OpenAI, Bedrock, Ollama) |
API_ENDPOINT | OPENAI_ENDPOINT | http://litellm-proxy:4000 | OpenAI API endpoint |
API_ID | API_ACCESS_ID | Access Key ID | |
API_KEY | OPENAI_API_KEY | OpenAI API key | |
API_REGION | AWS_REGION | us-east-1 | Region for cloud-hosted models |
API_VERSION | 2025-01-01-preview | OpenAI API version | |
LLM_MODEL_NAME | gpt-4.1-mini | Text-to-SQL model | |
EMBED_MODEL_NAME | text-embedding-3-small | Embedding model | |
MAX_RETRY | 3 | Maximum number of retries for failing queries | |
MAX_RESULTS | 200 | Maximum number of rows in a response | |
MAX_DATAFRAME | 1000 | Maximum number of rows in a dataframe sent to LLM to synthesize a response | |
SAMPLE_RATIO | 0.0001 | Percentage of dataset to index. Higher values will take longer to index but give better results. | |
SAMPLE_MAX_SIZE | 50 | Maximum number of rows to sample. Supersedes SAMPLE_RATIO | |
DEBUG | False | Debug mode | |
MODE | MODE | stdio | Mode in which the MCP server should operate. Use shttp for Streamable HTTP. |
AUTH_ENABLED | AUTH_ENABLED | false | Whether or not to require MCP clients to be authenticated. |
AUTH_ISSUER | AUTH_ISSUER | Required if AUTH_ENABLED is true . | |
AUTH_JWKS_URI | AUTH_JWKS_URI | Required if AUTH_ENABLED is true . | |
AWS_ACCESS_KEY_ID | AWS_ACCESS_KEY_ID | Required if using AWS Bedrock for embeddings | |
AWS_SECRET_ACCESS_KEY | AWS_SECRET_ACCESS_KEY | Required if using AWS Bedrock for embeddings | |
LOG_LEVEL | LOG_LEVEL | INFO | Logging level (DEBUG, INFO, WARNING, ERROR). |
MCP_SERVER_API_KEY | MCP_SERVER_API_KEY | API Key for access to LLM / Embeddings. Configure with AWS_SECRET_ACCESS_KEY, LiteLLM key, or OpenAI API key. | |
MCP_SERVER_DATA | MCP_SERVER_DATA | data | Directory for temporary data files |
TEXT2SQL_VALVES_JSON | TEXT2SQL_VALVES_JSON | valves.json | Full path to configuration JSON, see above for example. |
LOG_LEVEL=INFO
PYTHONUNBUFFERED=1
TEXT2SQL_VALVES_JSON=/home/appuser/config/config.json
MODE=shttp
FASTMCP_HOST=0.0.0.0
FASTMCP_PORT=8000
AUTH_ENABLED=true
AUTH_JWKS_URI=https://cognito-idp.us-east-1.amazonaws.com/us-east-1_/.well-known/jwks.json
AUTH_ISSUER=https://cognito-idp.us-east-1.amazonaws.com/us-east-1_
AWS_PROFILE=default
The Text2SQL MCP server must be configured using a JSON file that specifies database connections, model settings, and other parameters. Here's an example configuration:
{
"ENABLED": true,
"ENABLE_RAG": false,
"ENABLE_MARKDOWN": true,
"ENABLE_PANDAS": true,
"ENABLE_TABLE_INDEXING": true,
"FORCE_REINDEXING": false,
"DATABASE_CONFIG": {
"Cyber_Threats": {
"description": "A comprehensive dataset tracking cybersecurity incidents, attack vectors, threat types, and affected countries.",
"topics": [
"Cyber Threats",
"Attacks",
"Targets"
],
"url": "sqlite:///file:/home/appuser/config/cyber_threats.db?uri=true",
"tables": {
"cyber_threats": "The Global Cybersecurity Threats Dataset (2015-2024) provides extensive data on cyberattacks, malware types, targeted industries, and affected countries."
},
"prompts": [
{
"name": "biggest_cyber_threat",
"description": "Which type of cyber threat has caused the biggest financial loss?"
}
]
},
"ESG_Ratings": {
"description": "S&P 500 Companies ESG Insights & Risk Scores for Informed Decisions.",
"topics": [
"ESG Ratings",
"ESG Risk",
"Sustainability"
],
"url": "sqlite:///file:/home/appuser/config/esg_ratings.db?uri=true",
"tables": {
"esg_ratings": "A comprehensive dataset tracking ESG ratings for S&P 500 companies."
}
}
},
"IGNORE_SCHEMA": "information_schema, INFORMATION_SCHEMA, _rsc, db_accessadmin, db_backupoperator, db_datareader, db_datawriter, db_ddladmin, db_denydatareader, db_denydatawriter, db_owner, db_securityadmin, guest, queryinsights, sys, pg_catalog",
"MODEL_API_TYPE": "Bedrock",
"AWS_PROFILE": "default",
"API_VERSION": "2025-01-01-preview",
"LLM_MODEL_NAME": "us.anthropic.claude-3-7-sonnet-20250219-v1:0",
"EMBED_MODEL_NAME": "amazon.titan-embed-text-v2:0",
"MAX_RETRY": 3,
"MAX_RESULTS": 100,
"MAX_DATAFRAME": 2000,
"LOG_LEVEL": "INFO"
}
The following environment variables can be provided to the sample MCP client:
Option | Default | Description |
---|---|---|
JWT_ACCESS_TOKEN | false | Token used to authenticate client with server if the server has AUTH_ENABLED true |
MCP_SERVER_HOST | Host for MCP server. For a server running in Docker on the same host use host.docker.internal | |
MCP_SERVER_PORT | 8000 | Port for MCP server. |
BEDROCK_MODEL_ID | Required if using AWS Bedrock, e.g. us.anthropic.claude-3-7-sonnet-20250219-v1:0 | |
BEDROCK_API_VERSION | Required if using AWS Bedrock, e.g. 2023-06-01-preview | |
AWS_PROFILE | Required if using AWS Bedrock, e.g. default | |
AZURE_DEPLOYMENT_MODEL | Required if using Azure OpenAI. Ignored if BEDROCK_MODEL_ID is set. For Azure authentication see DefaultAzureCredential. | |
AZURE_API_VERSION | Required If using Azure OpenAI. Ignored if BEDROCK_MODEL_ID is set. | |
OTEL_SDK_DISABLED | false | Enable telemetry for Crew.AI client. |
CREWAI_DISABLE_TELEMETRY | false | Enable telemetry for Crew.AI client. |
The Text2SQL MCP server dynamically creates MCP Tools and Prompts for the configured databases in the following format:
query
: Natural language query to execute against the databaseThis project is licensed under the MIT License.
MCP Server for OceanBase database and its tools
Interact with Neon Postgres databases using natural language to manage projects, branches, queries, and migrations via the Neon API.
Build robust data workflows, integrations, and analytics on a single intuitive platform.
Implement semantic memory layer on top of the Qdrant vector search engine
Access the Quran API from alquran.cloud to retrieve accurate Quranic text and reduce LLM hallucinations.
Query Onchain data, like ERC20 tokens, transaction history, smart contract state.
An MCP server for NocoDB, the open-source Airtable alternative. It allows interaction with your NocoDB instance via API.
Read-only access to Snowflake databases. Requires Snowflake connection information provided via MCP client configuration.
A MCP server that provides real-time cryptocurrency market data through CoinCap's public API without requiring authentication
Access Socrata Open Data APIs from government data portals.