Synechron Text2SQL MCP Server

Provides natural language access to relational databases using advanced language models, supporting multiple database types.

Synechron's Text2SQL MCP Server is a Model Context Protocol (MCP) server. It gives MCP clients such as Cursor, VSCode, and Claude Desktop natural language access to relational databases of various types, enabling users to ask questions of their data in plain language.

The server optimizes the quality of the results it generates by combining schema-aware and row-aware semantic search. Table schemas and a sample of rows are indexed, giving the language model the context it needs to generate suitable queries.
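
The indexing step can be pictured with a short sketch: for each table, the column definitions and a handful of sample rows are combined into a text snippet that can be embedded and retrieved at query time. The Python below is a minimal illustrative sketch of that idea, not the server's actual implementation; the SQLAlchemy-based extraction and the build_table_snippets helper are made up for illustration.

# Minimal sketch of schema- and row-aware indexing (illustrative only).
# For each table, combine the column definitions with a few sample rows into a
# single text snippet; these snippets can then be embedded (e.g. with the
# configured EMBED_MODEL_NAME) and retrieved to ground SQL generation.
from sqlalchemy import create_engine, inspect, text

def build_table_snippets(db_url: str, sample_rows: int = 5) -> dict[str, str]:
    engine = create_engine(db_url)
    inspector = inspect(engine)
    snippets = {}
    with engine.connect() as conn:
        for table in inspector.get_table_names():
            columns = ", ".join(
                f"{c['name']} {c['type']}" for c in inspector.get_columns(table)
            )
            rows = conn.execute(
                text(f"SELECT * FROM {table} LIMIT :n"), {"n": sample_rows}
            ).fetchall()
            sample = "\n".join(str(tuple(r)) for r in rows)
            snippets[table] = f"Table {table}({columns})\nSample rows:\n{sample}"
    return snippets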

Features and Capabilities

The Text2SQL MCP server provides the following capabilities:

  • Natural Language to SQL: Natural language questions are converted into optimized SQL queries using advanced language models.
  • Multi-Database Support: Compatibility with PostgreSQL, MySQL, SQLite and Microsoft Fabric databases.
  • Retrieval Augmented Generation (RAG): Enhanced query generation by combining schema metadata and sampled row content.
  • Table Indexing: Creates embeddings for database schemas and sample data, improving query relevance.
  • Response Synthesis: Generates natural language summaries of query results.
  • Markdown Formatting: Optionally formats responses as Markdown for readability.
  • Multiple Model Support: Works with OpenAI, Azure OpenAI, AWS Bedrock, and Ollama models.

Sample Data

The server ships with two example datasets.

  • Cyber Threats
  • ESG Ratings

Sample Client

A sample REPL is also included.


Quick Start with Docker

Note: To use your own databases, further configuration is required; see the Configuration section below.

Running the Server with Docker

docker run -it -p8000:8000 \
  -v ~/.aws:/home/appuser/.aws:ro \
  -e MODEL_API_TYPE=Bedrock \
  -e AWS_PROFILE=AWSAdministratorAccess-880502554482 \
  -e FASTMCP_HOST=0.0.0.0 \
  709825985650.dkr.ecr.us-east-1.amazonaws.com/synechron/mcp-text2sql:0.0.10-2025.07.04-rc3 server

Running the Sample Client with Docker

docker run -it \
    -v ~/.aws:/home/appuser/.aws:ro \
    -e AWS_PROFILE=AWSAdministratorAccess-880502554482 \
    -e BEDROCK_MODEL_ID=us.anthropic.claude-3-7-sonnet-20250219-v1:0 \
    -e BEDROCK_API_VERSION=2025-01-01-preview \
    -e MCP_SERVER_HOST=host.docker.internal \
     709825985650.dkr.ecr.us-east-1.amazonaws.com/synechron/mcp-text2sql:0.0.10-2025.07.04-rc3 client

Example Client Usage

Connected to server with tools: ['Text2Sql_Cyber_Threats', 'Text2Sql_ESG_Ratings']

MCP Client Started!
Type your queries or 'quit' to exit.

Query: What kind of data is in the Cyber Threat datasource?

The Cyber Threat datasource contains comprehensive information about cybersecurity incidents with the following data fields:

1. country - The country where the cyber attack took place
2. year - The year when the cyber attack occurred
3. attack_type - The type of cyber attack (e.g., Phishing, Ransomware, DDoS, Man-in-the-Middle, SQL Injection)
4. target_industry - The industry that was targeted (e.g., Education, Retail, IT, Telecommunications, Healthcare, Government, Banking)
5. financial_loss_(in_million_$) - The financial impact of the attack in millions of dollars
6. number_of_affected_users - How many users were affected by the attack
7. attack_source - Where the attack originated from (e.g., Hacker Group, Nation-state, Insider, Unknown)
8. security_vulnerability_type - The type of security vulnerability exploited (e.g., Unpatched Software, Weak Passwords, Social Engineering)
9. defense_mechanism_used - What defense was in place (e.g., VPN, Firewall, Antivirus, AI-based Detection)
10. incident_resolution_time_(in_hours) - How long it took to resolve the incident in hours

The database tracks cybersecurity incidents across different countries, industries, and years, including details about attack methods, financial impact, affected users, vulnerabilities exploited, and resolution times.

Using the Server with an Existing MCP Client

The server should be run with Docker. Use either of the following configurations to add it to an MCP client: the first connects to a server that is already running and exposed over Streamable HTTP, while the second has the client launch the server itself with Docker (stdio mode by default).

{
  "servers": {
    "text2sql": {
      "type": "streamable-http",
      "url": "http://localhost:8000/mcp"
    }
  }
}
{
  "mcpServers": {
    "text2sql": {
      "command": "docker",
      "args": [
        "run",
        "-i",
        "--rm",
        "--env-file",
        ".env",
        "709825985650.dkr.ecr.us-east-1.amazonaws.com/synechron/mcp-text2sql:0.0.10-2025.07.04-rc3",
        "server"
      ]
    }
  }
}

Configuration

The Text2SQL MCP server can be configured with a combination of environment variables and JSON configuration.

Server Configuration

Valves Config | Environment Variable | Default | Description
--- | --- | --- | ---
ENABLED |  | False | Enable the tool
ENABLE_RAG |  | True | Use AI to synthesize a response from the collected data
ENABLE_MARKDOWN |  | False | Format the response in Markdown
ENABLE_PANDAS |  | True | Use pandas to execute and parse the SQL query; otherwise use llama_index
ENABLE_TABLE_INDEXING |  | True | Use the table retriever to create embeddings for database schemas and sample data to optimize context for the LLM
FORCE_REINDEXING |  | False | Force the tool to regenerate vector indexes for tables and sample data
DATABASE_CONFIG | DB_CONFIG | Default db config | JSON-formatted tool configuration with database connection strings, tables, context, and prompts
IGNORE_SCHEMA |  | Default system tables | Comma-separated list of schemas to ignore
CREDENTIAL_SERVICE | CREDENTIAL_SERVICE |  | Select the authentication type (Azure, AWS, Google, None)
MODEL_API_TYPE / MODEL_TYPE | MODEL_TYPE | OpenAI | Select the model API type (OpenAI, Azure OpenAI, Bedrock, Ollama)
API_ENDPOINT | OPENAI_ENDPOINT | http://litellm-proxy:4000 | OpenAI API endpoint
API_ID | API_ACCESS_ID |  | Access key ID
API_KEY | OPENAI_API_KEY |  | OpenAI API key
API_REGION | AWS_REGION | us-east-1 | Region for cloud-hosted models
API_VERSION |  | 2025-01-01-preview | OpenAI API version
LLM_MODEL_NAME |  | gpt-4.1-mini | Text-to-SQL model
EMBED_MODEL_NAME |  | text-embedding-3-small | Embedding model
MAX_RETRY |  | 3 | Maximum number of retries for failing queries
MAX_RESULTS |  | 200 | Maximum number of rows in a response
MAX_DATAFRAME |  | 1000 | Maximum number of rows in a dataframe sent to the LLM to synthesize a response
SAMPLE_RATIO |  | 0.0001 | Fraction of the dataset to sample for indexing; higher values take longer to index but give better results
SAMPLE_MAX_SIZE |  | 50 | Maximum number of rows to sample; supersedes SAMPLE_RATIO
DEBUG |  | False | Debug mode
MODE | MODE | stdio | Mode in which the MCP server should operate; use shttp for Streamable HTTP
AUTH_ENABLED | AUTH_ENABLED | false | Whether or not to require MCP clients to be authenticated
AUTH_ISSUER | AUTH_ISSUER |  | Required if AUTH_ENABLED is true
AUTH_JWKS_URI | AUTH_JWKS_URI |  | Required if AUTH_ENABLED is true
AWS_ACCESS_KEY_ID | AWS_ACCESS_KEY_ID |  | Required if using AWS Bedrock for embeddings
AWS_SECRET_ACCESS_KEY | AWS_SECRET_ACCESS_KEY |  | Required if using AWS Bedrock for embeddings
LOG_LEVEL | LOG_LEVEL | INFO | Logging level (DEBUG, INFO, WARNING, ERROR)
MCP_SERVER_API_KEY | MCP_SERVER_API_KEY |  | API key for access to the LLM / embeddings; configure with AWS_SECRET_ACCESS_KEY, a LiteLLM key, or an OpenAI API key
MCP_SERVER_DATA | MCP_SERVER_DATA | data | Directory for temporary data files
TEXT2SQL_VALVES_JSON | TEXT2SQL_VALVES_JSON | valves.json | Full path to the configuration JSON; see the example below

Example .env File

LOG_LEVEL=INFO
PYTHONUNBUFFERED=1
TEXT2SQL_VALVES_JSON=/home/appuser/config/config.json

MODE=shttp
FASTMCP_HOST=0.0.0.0
FASTMCP_PORT=8000

AUTH_ENABLED=true
AUTH_JWKS_URI=https://cognito-idp.us-east-1.amazonaws.com/us-east-1_/.well-known/jwks.json
AUTH_ISSUER=https://cognito-idp.us-east-1.amazonaws.com/us-east-1_

AWS_PROFILE=default

Example TEXT2SQL_VALVES_JSON (valves.json)

The Text2SQL MCP server is configured using a JSON file (referenced by TEXT2SQL_VALVES_JSON) that specifies database connections, model settings, and other parameters. Here's an example configuration:

{
  "ENABLED": true,
  "ENABLE_RAG": false,
  "ENABLE_MARKDOWN": true,
  "ENABLE_PANDAS": true,
  "ENABLE_TABLE_INDEXING": true,
  "FORCE_REINDEXING": false,
  "DATABASE_CONFIG": {
    "Cyber_Threats": {
      "description": "A comprehensive dataset tracking cybersecurity incidents, attack vectors, threat types, and affected countries.",
      "topics": [
        "Cyber Threats",
        "Attacks",
        "Targets"
      ],
      "url": "sqlite:///file:/home/appuser/config/cyber_threats.db?uri=true",
      "tables": {
        "cyber_threats": "The Global Cybersecurity Threats Dataset (2015-2024) provides extensive data on cyberattacks, malware types, targeted industries, and affected countries."
      },
      "prompts": [
        {
          "name": "biggest_cyber_threat",
          "description": "Which type of cyber threat has caused the biggest financial loss?"
        }
      ]
    },
    "ESG_Ratings": {
      "description": "S&P 500 Companies ESG Insights & Risk Scores for Informed Decisions.",
      "topics": [
        "ESG Ratings",
        "ESG Risk",
        "Sustainability"
      ],
      "url": "sqlite:///file:/home/appuser/config/esg_ratings.db?uri=true",
      "tables": {
        "esg_ratings": "A comprehensive dataset tracking ESG ratings for S&P 500 companies."
      }
    }
  },
  "IGNORE_SCHEMA": "information_schema, INFORMATION_SCHEMA, _rsc, db_accessadmin, db_backupoperator, db_datareader, db_datawriter, db_ddladmin, db_denydatareader, db_denydatawriter, db_owner, db_securityadmin, guest, queryinsights, sys, pg_catalog",
  "MODEL_API_TYPE": "Bedrock",
  "AWS_PROFILE": "default",
  "API_VERSION": "2025-01-01-preview",
  "LLM_MODEL_NAME": "us.anthropic.claude-3-7-sonnet-20250219-v1:0",
  "EMBED_MODEL_NAME": "amazon.titan-embed-text-v2:0",
  "MAX_RETRY": 3,
  "MAX_RESULTS": 100,
  "MAX_DATAFRAME": 2000,
  "LOG_LEVEL": "INFO"
}
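
The url values under DATABASE_CONFIG are database connection strings; the SQLite examples above use SQLAlchemy-style URLs pointing at paths inside the container. If you want to sanity-check a valves.json before starting the server, a small script along these lines can help. This is an illustrative sketch only, assuming SQLAlchemy-compatible URLs and a hypothetical local config/valves.json path; it is not part of the server.

# Illustrative sanity check for a valves.json file (not part of the server).
# Assumes the "url" values are SQLAlchemy-compatible connection strings and
# that the referenced databases are reachable from where this script runs
# (the example URLs above point at paths inside the container).
import json
from sqlalchemy import create_engine, text

with open("config/valves.json") as f:  # hypothetical local path
    valves = json.load(f)

for name, db in valves["DATABASE_CONFIG"].items():
    engine = create_engine(db["url"])
    with engine.connect() as conn:
        conn.execute(text("SELECT 1"))
    print(f"{name}: OK ({len(db.get('tables', {}))} table(s) configured)")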

Client Configuration

The following environment variables can be provided to the sample MCP client:

Option | Default | Description
--- | --- | ---
JWT_ACCESS_TOKEN | false | Token used to authenticate the client with the server when the server has AUTH_ENABLED set to true
MCP_SERVER_HOST |  | Host of the MCP server; for a server running in Docker on the same host, use host.docker.internal
MCP_SERVER_PORT | 8000 | Port of the MCP server
BEDROCK_MODEL_ID |  | Required if using AWS Bedrock, e.g. us.anthropic.claude-3-7-sonnet-20250219-v1:0
BEDROCK_API_VERSION |  | Required if using AWS Bedrock, e.g. 2023-06-01-preview
AWS_PROFILE |  | Required if using AWS Bedrock, e.g. default
AZURE_DEPLOYMENT_MODEL |  | Required if using Azure OpenAI; ignored if BEDROCK_MODEL_ID is set. For Azure authentication, see DefaultAzureCredential
AZURE_API_VERSION |  | Required if using Azure OpenAI; ignored if BEDROCK_MODEL_ID is set
OTEL_SDK_DISABLED | false | Set to true to disable OpenTelemetry for the CrewAI client
CREWAI_DISABLE_TELEMETRY | false | Set to true to disable CrewAI telemetry

Tools

The Text2SQL MCP server dynamically creates MCP Tools and Prompts for the configured databases in the following format (see the usage sketch after the list):

  • Text2Sql_DatabaseName: Query a specific database using natural language
    • Parameters:
      • query: Natural language query to execute against the database
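
As a usage sketch, a client built on the MCP Python SDK could list these tools and call one of them roughly as follows. This assumes the server is running in shttp mode on localhost:8000 and that the mcp package's Streamable HTTP client is used; the question text is illustrative.

# Minimal sketch of calling a generated tool over Streamable HTTP using the
# MCP Python SDK (assumes the server was started with MODE=shttp on port 8000).
import asyncio
from mcp import ClientSession
from mcp.client.streamable_http import streamablehttp_client

async def main() -> None:
    async with streamablehttp_client("http://localhost:8000/mcp") as (read, write, _):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            print("Tools:", [t.name for t in tools.tools])
            result = await session.call_tool(
                "Text2Sql_Cyber_Threats",
                {"query": "Which attack type caused the largest financial loss?"},
            )
            print(result.content)

asyncio.run(main())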

License

This project is licensed under the MIT License.
