STRING-MCP
Interact with the STRING protein-protein interaction database API.
A comprehensive Python package for interacting with the STRING database API through a Model Context Protocol (MCP) bridge.
Installation
Install the package in development mode:
pip install -e .
Or install from PyPI (when available):
pip install string-mcp
Claude config
{
"mcpServers": {
"string-mcp": {
"command": "/path/to/python/env/bin/string-mcp-server",
"env": {}
}
}
}
Usage
MCP Server (Primary Use Case)
The package provides an MCP server for integration with MCP-compatible clients:
# Run the MCP server
string-mcp-server
The MCP server provides the following tools:
- map_identifiers: Map protein identifiers to STRING IDs
- get_network_interactions: Get network interactions data
- get_functional_enrichment: Perform functional enrichment analysis
- get_network_image: Generate network visualization images
- get_version_info: Get STRING database version information
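As a rough illustration of how an MCP client invokes one of these tools, the sketch below builds a JSON-RPC 2.0 `tools/call` request. The helper name `build_tool_call` is hypothetical, and the argument names (`identifiers`, `species`) are assumptions; the authoritative schema is whatever the server reports via `tools/list`.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 request for the MCP `tools/call` method."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Argument names here are assumptions, not taken from the server's schema.
request = build_tool_call(
    1, "map_identifiers", {"identifiers": ["TP53", "BRCA1"], "species": 9606}
)
print(json.dumps(request, indent=2))
```

An MCP client library normally builds this message for you; the sketch only shows what travels over the wire.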
Command Line Interface
The package also provides a string-mcp command for standalone usage:
# Run demo
string-mcp demo
# Get help
string-mcp --help
# Map protein identifiers
string-mcp map TP53 BRCA1 EGFR --species 9606
# Get network interactions
string-mcp network TP53 BRCA1 --species 9606
# Generate network image
string-mcp image TP53 BRCA1 --output network.png --species 9606
Python API
from stringmcp.main import StringDBBridge
# Initialize the bridge
bridge = StringDBBridge()
# Map protein identifiers
proteins = ["TP53", "BRCA1", "EGFR"]
mapped = bridge.map_identifiers(proteins, species=9606) # 9606 = human
# Get network interactions
interactions = bridge.get_network_interactions(proteins, species=9606)
# Perform functional enrichment
enrichment = bridge.get_functional_enrichment(proteins, species=9606)
Features
- Protein Identifier Mapping: Convert various protein identifiers to STRING IDs
- Network Analysis: Retrieve protein-protein interaction networks
- Functional Enrichment: Perform gene ontology and pathway enrichment analysis
- Network Visualization: Generate network images in various formats
- Interaction Partners: Find all interaction partners for proteins
- Functional Annotations: Get detailed functional annotations
- Protein Similarity: Calculate similarity scores between proteins
- PPI Enrichment: Test for protein-protein interaction enrichment
- MCP Integration: Full Model Context Protocol server implementation
API Methods
Core Methods
- map_identifiers(): Map protein identifiers to STRING IDs
- get_network_interactions(): Get network interaction data
- get_network_image(): Generate network visualization images
- get_interaction_partners(): Find all interaction partners
- get_functional_enrichment(): Perform enrichment analysis
- get_functional_annotation(): Get functional annotations
- get_protein_similarity(): Calculate similarity scores
- get_ppi_enrichment(): Test for PPI enrichment
- get_version_info(): Get STRING database version
Configuration
The package uses a StringConfig class for configuration:
from stringmcp.main import StringConfig, StringDBBridge
config = StringConfig(
base_url="https://string-db.org/api",
version_url="https://version-12-0.string-db.org/api",
caller_identity="my_app",
request_delay=1.0 # Delay between requests in seconds
)
bridge = StringDBBridge(config)
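To make the effect of `request_delay` concrete, here is a minimal sketch of a throttle that enforces a minimum gap between consecutive requests. It is not the package's implementation; the class `RequestThrottle` and its injectable `clock` and `sleep` parameters are hypothetical and exist only for illustration.

```python
import time


class RequestThrottle:
    """Ensure at least `delay` seconds pass between consecutive wait() calls."""

    def __init__(self, delay, clock=time.monotonic, sleep=time.sleep):
        self.delay = delay
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time at which the previous request was released

    def wait(self):
        """Pause if the previous request was too recent; return the pause length."""
        now = self._clock()
        pause = 0.0
        if self._last is not None:
            pause = max(0.0, self.delay - (now - self._last))
            if pause > 0:
                self._sleep(pause)
        self._last = now + pause
        return pause
```

Calling `wait()` immediately before each HTTP request would space calls at least `delay` seconds apart, which is the behavior the `request_delay` comment describes.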
Output Formats
The package supports multiple output formats:
- JSON: Structured data (default)
- TSV: Tab-separated values
- XML: XML format
- IMAGE: Network visualization images
- SVG: Scalable vector graphics
- PSI_MI: PSI-MI format
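The output format ends up as a path segment in the STRING REST URL. The sketch below assumes the `/api/<format>/<method>` layout and the carriage-return identifier separator described in STRING's public API documentation; the helper `build_string_url` is hypothetical, and how the package maps its format names onto URL tokens is an assumption.

```python
from urllib.parse import urlencode

BASE_URL = "https://string-db.org/api"


def build_string_url(output_format, method, identifiers, species,
                     caller_identity="my_app"):
    """Assemble a STRING API URL; identifiers are joined with carriage returns."""
    params = {
        "identifiers": "\r".join(identifiers),  # encoded as %0D by urlencode
        "species": species,
        "caller_identity": caller_identity,
    }
    return f"{BASE_URL}/{output_format}/{method}?{urlencode(params)}"


url = build_string_url("json", "network", ["TP53", "BRCA1"], 9606)
print(url)
```

Swapping `"json"` for `"tsv"` in the call changes only the path segment, which is why a single bridge method can serve several formats.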
Species Support
The package supports all species available in STRING. Common species IDs:
- Human: 9606
- Mouse: 10090
- Rat: 10116
- Yeast: 4932
- E. coli: 511145
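If you prefer species names over raw taxon IDs in your own scripts, a small lookup like the following can help. The dictionary contains only the IDs listed above; the function `resolve_species` is a hypothetical helper, not part of the package.

```python
# NCBI taxon IDs copied from the list above.
SPECIES_IDS = {
    "human": 9606,
    "mouse": 10090,
    "rat": 10116,
    "yeast": 4932,
    "e. coli": 511145,
}


def resolve_species(name_or_id):
    """Return an integer taxon ID for a known species name or a numeric ID."""
    if isinstance(name_or_id, int):
        return name_or_id
    text = str(name_or_id).strip()
    if text.isdigit():
        return int(text)
    try:
        return SPECIES_IDS[text.lower()]
    except KeyError:
        raise ValueError(f"Unknown species: {name_or_id!r}") from None


print(resolve_species("Human"))
```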
MCP Server Configuration
To use the MCP server with an MCP client, configure it as follows:
{
"mcpServers": {
"string-mcp": {
"command": "string-mcp-server",
"env": {}
}
}
}
The server will automatically handle:
- JSON-RPC communication
- Tool discovery and invocation
- Error handling and reporting
- Base64 encoding for image data
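On the client side, Base64 image data has to be decoded before it can be written to disk. The sketch below assumes the tool result carries a standard MCP image content item (`type`, `mimeType`, `data`); whether this server returns exactly that shape is an assumption, and the sample payload is a stand-in containing only the PNG file signature rather than a real network image.

```python
import base64

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def decode_image_content(content):
    """Decode the Base64 `data` field of an MCP image content item to bytes."""
    if content.get("type") != "image":
        raise ValueError("not an image content item")
    return base64.b64decode(content["data"])


# Stand-in payload for illustration: just the 8-byte PNG signature.
sample = {
    "type": "image",
    "mimeType": "image/png",
    "data": base64.b64encode(PNG_SIGNATURE).decode("ascii"),
}
image_bytes = decode_image_content(sample)
print(image_bytes.startswith(PNG_SIGNATURE))
```

With a real response, the decoded bytes could be written to a file opened in binary mode.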
Development
Setup Development Environment
# Install in development mode with dev dependencies
pip install -e ".[dev]"
# Format code
black stringmcp/
# Type checking
mypy stringmcp/
# Lint code
flake8 stringmcp/
Note: Test files are not currently included in this repository. To add tests, create a tests/ directory and add test files following the pytest configuration in pyproject.toml.
Project Structure
STRINGmcp/
├── pyproject.toml # Package configuration and dependencies
├── README.md # This file
├── LICENSE # MIT License
├── .gitignore # Git ignore patterns
├── stringmcp/ # Main package
│ ├── __init__.py # Package initialization
│ └── main.py # Core STRING API bridge and MCP server
└── string_mcp.egg-info/ # Package metadata (generated during install)
    ├── PKG-INFO # Package information
    ├── SOURCES.txt # Source files list
    ├── dependency_links.txt
    ├── entry_points.txt # CLI entry points
    ├── requires.txt # Dependencies
    └── top_level.txt # Top-level package names
License
MIT License - see LICENSE file for details.
Contributing
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests
- Run the test suite
- Submit a pull request
Support
For issues and questions, please use the GitHub issue tracker.
Example Usage
Complete DNA Repair Protein Analysis
This example demonstrates the comprehensive functionality of the STRING-DB MCP bridge by analyzing a set of well-known human DNA repair proteins: TP53, BRCA1, BRCA2, ATM, and ATR.
2. Protein Identifier Mapping
Map gene symbols to STRING identifiers:
[
{
"queryIndex": 0,
"queryItem": "TP53",
"stringId": "9606.ENSP00000269305",
"ncbiTaxonId": 9606,
"taxonName": "Homo sapiens",
"preferredName": "TP53",
"annotation": "Cellular tumor antigen p53; Acts as a tumor suppressor in many tumor types; induces growth arrest or apoptosis depending on the physiological circumstances and cell type..."
},
{
"queryIndex": 1,
"queryItem": "BRCA1",
"stringId": "9606.ENSP00000418960",
"ncbiTaxonId": 9606,
"taxonName": "Homo sapiens",
"preferredName": "BRCA1",
"annotation": "Breast cancer type 1 susceptibility protein; E3 ubiquitin-protein ligase that specifically mediates the formation of 'Lys-6'-linked polyubiquitin chains..."
},
{
"queryIndex": 2,
"queryItem": "BRCA2",
"stringId": "9606.ENSP00000369497",
"ncbiTaxonId": 9606,
"taxonName": "Homo sapiens",
"preferredName": "BRCA2",
"annotation": "Breast cancer type 2 susceptibility protein; Involved in double-strand break repair and/or homologous recombination..."
},
{
"queryIndex": 3,
"queryItem": "ATM",
"stringId": "9606.ENSP00000278616",
"ncbiTaxonId": 9606,
"taxonName": "Homo sapiens",
"preferredName": "ATM",
"annotation": "Serine-protein kinase ATM; Serine/threonine protein kinase which activates checkpoint signaling upon double strand breaks..."
},
{
"queryIndex": 4,
"queryItem": "ATR",
"stringId": "9606.ENSP00000343741",
"ncbiTaxonId": 9606,
"taxonName": "Homo sapiens",
"preferredName": "ATR",
"annotation": "Serine/threonine-protein kinase ATR; Serine/threonine protein kinase which activates checkpoint signaling upon genotoxic stresses..."
}
]
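Downstream calls typically need the STRING IDs rather than the gene symbols. Assuming the mapping result is a list of records shaped like the output above, a lookup table can be built as follows; `index_by_query` is a hypothetical helper and the records are abbreviated copies of that output.

```python
# Abbreviated records copied from the mapping output above.
mapping_response = [
    {"queryItem": "TP53", "stringId": "9606.ENSP00000269305"},
    {"queryItem": "BRCA1", "stringId": "9606.ENSP00000418960"},
    {"queryItem": "BRCA2", "stringId": "9606.ENSP00000369497"},
    {"queryItem": "ATM", "stringId": "9606.ENSP00000278616"},
    {"queryItem": "ATR", "stringId": "9606.ENSP00000343741"},
]


def index_by_query(records):
    """Map each queried symbol to the STRING ID it resolved to."""
    return {record["queryItem"]: record["stringId"] for record in records}


string_ids = index_by_query(mapping_response)
print(string_ids["TP53"])
```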
3. Protein-Protein Interaction Network
Examine network interactions between these proteins:
[
{
"stringId_A": "9606.ENSP00000269305",
"stringId_B": "9606.ENSP00000369497",
"preferredName_A": "TP53",
"preferredName_B": "BRCA2",
"score": 0.995
},
{
"stringId_A": "9606.ENSP00000269305",
"stringId_B": "9606.ENSP00000343741",
"preferredName_A": "TP53",
"preferredName_B": "ATR",
"score": 0.996
},
{
"stringId_A": "9606.ENSP00000269305",
"stringId_B": "9606.ENSP00000278616",
"preferredName_A": "TP53",
"preferredName_B": "ATM",
"score": 0.999
},
{
"stringId_A": "9606.ENSP00000269305",
"stringId_B": "9606.ENSP00000418960",
"preferredName_A": "TP53",
"preferredName_B": "BRCA1",
"score": 0.999
},
{
"stringId_A": "9606.ENSP00000278616",
"stringId_B": "9606.ENSP00000369497",
"preferredName_A": "ATM",
"preferredName_B": "BRCA2",
"score": 0.995
},
{
"stringId_A": "9606.ENSP00000278616",
"stringId_B": "9606.ENSP00000418960",
"preferredName_A": "ATM",
"preferredName_B": "BRCA1",
"score": 0.999
},
{
"stringId_A": "9606.ENSP00000278616",
"stringId_B": "9606.ENSP00000343741",
"preferredName_A": "ATM",
"preferredName_B": "ATR",
"score": 0.999
},
{
"stringId_A": "9606.ENSP00000343741",
"stringId_B": "9606.ENSP00000369497",
"preferredName_A": "ATR",
"preferredName_B": "BRCA2",
"score": 0.831
},
{
"stringId_A": "9606.ENSP00000343741",
"stringId_B": "9606.ENSP00000418960",
"preferredName_A": "ATR",
"preferredName_B": "BRCA1",
"score": 0.996
},
{
"stringId_A": "9606.ENSP00000369497",
"stringId_B": "9606.ENSP00000418960",
"preferredName_A": "BRCA2",
"preferredName_B": "BRCA1",
"score": 0.999
}
]
Key Findings: All ten interactions score above 0.8, and nine of the ten score at least 0.995, indicating these proteins form a tightly interconnected functional module.
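The score summary can be checked with a few lines of Python. The edge list below is copied from the interaction output above, reduced to name pairs and scores; `edges_above` is a hypothetical helper.

```python
# (protein A, protein B, combined score) from the interaction output above.
edges = [
    ("TP53", "BRCA2", 0.995), ("TP53", "ATR", 0.996),
    ("TP53", "ATM", 0.999), ("TP53", "BRCA1", 0.999),
    ("ATM", "BRCA2", 0.995), ("ATM", "BRCA1", 0.999),
    ("ATM", "ATR", 0.999), ("ATR", "BRCA2", 0.831),
    ("ATR", "BRCA1", 0.996), ("BRCA2", "BRCA1", 0.999),
]


def edges_above(edge_list, threshold):
    """Return the edges whose score is at least `threshold`."""
    return [edge for edge in edge_list if edge[2] >= threshold]


strong = edges_above(edges, 0.99)
weakest = min(edges, key=lambda edge: edge[2])
print(len(strong), "of", len(edges), "edges score at least 0.99")
print("weakest edge:", weakest)
```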
4. Network Statistics
Check if this network is significantly enriched for interactions:
{
"number_of_nodes": 5,
"number_of_edges": 10,
"average_node_degree": 4.0,
"local_clustering_coefficient": 1.0,
"expected_number_of_edges": 5,
"p_value": 0.0122
}
Statistical Significance: The network shows perfect clustering (coefficient = 1.0) and is significantly enriched for interactions (p = 0.0122), with twice as many edges as expected by chance.
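Several of these statistics follow directly from the node and edge counts. The sketch below recomputes them under the usual definitions for an undirected graph (average degree 2E/N, density E divided by the number of possible pairs); the function names are illustrative and the p-value itself is not reproduced.

```python
def average_node_degree(n_nodes, n_edges):
    """Each undirected edge contributes to the degree of two nodes."""
    return 2 * n_edges / n_nodes


def edge_density(n_nodes, n_edges):
    """Fraction of all possible node pairs that are connected."""
    possible = n_nodes * (n_nodes - 1) / 2
    return n_edges / possible


nodes, observed_edges, expected_edges = 5, 10, 5

print(average_node_degree(nodes, observed_edges))
print(edge_density(nodes, observed_edges))
print(observed_edges / expected_edges)
```

A density of 1.0 means every pair of proteins is connected, which is why the clustering coefficient is also 1.0.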
5. Functional Enrichment Analysis
Analyze which biological pathways are enriched in this protein set:
Top DNA Repair Pathways (Selected Results):
[
{
"category": "Process",
"term": "GO:0071479",
"number_of_genes": 5,
"preferredNames": ["TP53", "ATM", "ATR", "BRCA2", "BRCA1"],
"p_value": 9.72e-13,
"fdr": 1.52e-08,
"description": "Cellular response to ionizing radiation"
},
{
"category": "Process",
"term": "GO:0042770",
"number_of_genes": 5,
"preferredNames": ["TP53", "ATM", "ATR", "BRCA2", "BRCA1"],
"p_value": 1.69e-11,
"fdr": 1.32e-07,
"description": "Signal transduction in response to DNA damage"
},
{
"category": "Process",
"term": "GO:0006281",
"number_of_genes": 5,
"preferredNames": ["TP53", "ATM", "ATR", "BRCA2", "BRCA1"],
"p_value": 1.05e-08,
"fdr": 1.10e-05,
"description": "DNA repair"
},
{
"category": "KEGG",
"term": "hsa03440",
"number_of_genes": 3,
"preferredNames": ["ATM", "BRCA2", "BRCA1"],
"p_value": 8.34e-08,
"fdr": 2.80e-05,
"description": "Homologous recombination"
},
{
"category": "KEGG",
"term": "hsa04115",
"number_of_genes": 3,
"preferredNames": ["TP53", "ATM", "ATR"],
"p_value": 5.27e-07,
"fdr": 5.44e-05,
"description": "p53 signaling pathway"
}
]
Disease Associations:
[
{
"category": "DISEASES",
"term": "DOID:1612",
"number_of_genes": 4,
"preferredNames": ["TP53", "ATM", "BRCA2", "BRCA1"],
"p_value": 5.72e-10,
"fdr": 2.02e-06,
"description": "Breast cancer"
},
{
"category": "DISEASES",
"term": "DOID:3012",
"number_of_genes": 3,
"preferredNames": ["TP53", "BRCA2", "BRCA1"],
"p_value": 6.59e-10,
"fdr": 2.02e-06,
"description": "Li-Fraumeni syndrome"
}
]
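Enrichment results are usually filtered and ranked by FDR before reporting. The sketch below does this for the terms shown above, reduced to the fields it needs; `significant_terms` and the 1e-5 cutoff are illustrative choices, not package defaults.

```python
# Term, description, and FDR copied from the enrichment output above.
terms = [
    {"term": "GO:0071479", "description": "Cellular response to ionizing radiation", "fdr": 1.52e-08},
    {"term": "GO:0042770", "description": "Signal transduction in response to DNA damage", "fdr": 1.32e-07},
    {"term": "GO:0006281", "description": "DNA repair", "fdr": 1.10e-05},
    {"term": "hsa03440", "description": "Homologous recombination", "fdr": 2.80e-05},
    {"term": "hsa04115", "description": "p53 signaling pathway", "fdr": 5.44e-05},
    {"term": "DOID:1612", "description": "Breast cancer", "fdr": 2.02e-06},
    {"term": "DOID:3012", "description": "Li-Fraumeni syndrome", "fdr": 2.02e-06},
]


def significant_terms(term_list, max_fdr):
    """Keep terms with FDR <= max_fdr, sorted from most to least significant."""
    kept = (term for term in term_list if term["fdr"] <= max_fdr)
    return sorted(kept, key=lambda term: term["fdr"])


top = significant_terms(terms, 1e-5)
for term in top:
    print(term["term"], term["description"], term["fdr"])
```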
6. Network Visualization
The package can generate protein interaction network visualizations showing evidence-based functional associations.
Example Network Visualization: View Protein Interaction Network
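This visualization shows the protein-protein interaction network for TP53, BRCA1, BRCA2, ATM, and ATR, restricted to interactions with a combined score ≥ 400. On STRING's 0-1000 scale, 400 is the medium-confidence threshold; the high-confidence threshold is 700.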
7. Functional Enrichment Visualization
The package can also create enrichment scatter plots showing the most significantly enriched biological processes.
Example Enrichment Visualization: View Functional Enrichment Plot
This visualization displays the top 10 most significantly enriched biological processes and pathways for the DNA repair protein set, showing p-values and gene counts for each enriched term.
Summary
This comprehensive analysis demonstrates that the STRING-DB MCP bridge successfully:
- Identified all 5 DNA repair proteins with detailed annotations
- Discovered 10 high-confidence protein interactions (all >0.8 score)
- Revealed significant pathway enrichments, with p-values as low as 9.72e-13 and every reported FDR below 1e-4
- Confirmed statistical significance of the network (p = 0.0122)
- Generated both network and enrichment visualizations
The results validate these proteins as a core DNA damage response module, with exceptionally strong enrichment for:
- Cellular response to ionizing radiation (FDR = 1.52e-8)
- DNA damage signaling (FDR = 1.32e-7)
- Homologous recombination (FDR = 2.8e-5)
- p53 signaling pathway (FDR = 5.44e-5)
- Breast cancer associations (FDR = 2.02e-6)
This showcases the complete functionality of the STRING-DB MCP bridge for protein interaction network analysis and functional annotation.