STRING-MCP
Interact with the STRING protein-protein interaction database API.
A comprehensive Python package for interacting with the STRING database API through a Model Context Protocol (MCP) bridge.
Installation
Install the package in development mode:
pip install -e .
Or install from PyPI (when available):
pip install string-mcp
Claude config
{
"mcpServers": {
"string-mcp": {
"command": "/path/to/python/env/bin/string-mcp-server",
"env": {}
}
}
}
Usage
MCP Server (Primary Use Case)
The package provides an MCP server for integration with MCP-compatible clients:
# Run the MCP server
string-mcp-server
The MCP server provides the following tools:
- map_identifiers: Map protein identifiers to STRING IDs
- get_network_interactions: Get network interactions data
- get_functional_enrichment: Perform functional enrichment analysis
- get_network_image: Generate network visualization images
- get_version_info: Get STRING database version information
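As an illustration of how a client reaches these tools, MCP clients invoke a tool by sending a JSON-RPC 2.0 `tools/call` request. The sketch below only builds such a message and does not contact the server. The argument names (`identifiers`, `species`) are assumptions that mirror the Python API and may differ from the server's actual tool schema; `build_tool_call` is a hypothetical helper.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 request for the MCP `tools/call` method."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Argument names are assumed here, not taken from the server's schema.
request = build_tool_call(
    1,
    "map_identifiers",
    {"identifiers": ["TP53", "BRCA1"], "species": 9606},
)
print(json.dumps(request, indent=2))
```

In practice an MCP client library handles this framing, so the message is mainly useful for debugging.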
Command Line Interface
The package also provides a string-mcp command for standalone usage:
# Run demo
string-mcp demo
# Get help
string-mcp --help
# Map protein identifiers
string-mcp map TP53 BRCA1 EGFR --species 9606
# Get network interactions
string-mcp network TP53 BRCA1 --species 9606
# Generate network image
string-mcp image TP53 BRCA1 --output network.png --species 9606
Python API
from stringmcp.main import StringDBBridge
# Initialize the bridge
bridge = StringDBBridge()
# Map protein identifiers
proteins = ["TP53", "BRCA1", "EGFR"]
mapped = bridge.map_identifiers(proteins, species=9606) # 9606 = human
# Get network interactions
interactions = bridge.get_network_interactions(proteins, species=9606)
# Perform functional enrichment
enrichment = bridge.get_functional_enrichment(proteins, species=9606)
Features
- Protein Identifier Mapping: Convert various protein identifiers to STRING IDs
- Network Analysis: Retrieve protein-protein interaction networks
- Functional Enrichment: Perform gene ontology and pathway enrichment analysis
- Network Visualization: Generate network images in various formats
- Interaction Partners: Find all interaction partners for proteins
- Functional Annotations: Get detailed functional annotations
- Protein Similarity: Calculate similarity scores between proteins
- PPI Enrichment: Test for protein-protein interaction enrichment
- MCP Integration: Full Model Context Protocol server implementation
API Methods
Core Methods
- map_identifiers(): Map protein identifiers to STRING IDs
- get_network_interactions(): Get network interaction data
- get_network_image(): Generate network visualization images
- get_interaction_partners(): Find all interaction partners
- get_functional_enrichment(): Perform enrichment analysis
- get_functional_annotation(): Get functional annotations
- get_protein_similarity(): Calculate similarity scores
- get_ppi_enrichment(): Test for PPI enrichment
- get_version_info(): Get STRING database version
Configuration
The package uses a StringConfig class for configuration:
from stringmcp.main import StringConfig, StringDBBridge
config = StringConfig(
base_url="https://string-db.org/api",
version_url="https://version-12-0.string-db.org/api",
caller_identity="my_app",
request_delay=1.0 # Delay between requests in seconds
)
bridge = StringDBBridge(config)
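The `request_delay` setting spaces out consecutive API calls. How the package enforces it internally is not shown here, so the following is only a minimal sketch of one common approach. The `Throttle` class is hypothetical and not part of the package.

```python
import time


class Throttle:
    """Hypothetical helper: enforce a minimum delay between calls."""

    def __init__(self, delay_seconds):
        self.delay_seconds = delay_seconds
        self._last_call = None  # monotonic timestamp of the previous call

    def wait(self):
        """Sleep just long enough to keep calls `delay_seconds` apart."""
        now = time.monotonic()
        if self._last_call is not None:
            remaining = self.delay_seconds - (now - self._last_call)
            if remaining > 0:
                time.sleep(remaining)
        self._last_call = time.monotonic()


throttle = Throttle(delay_seconds=0.1)  # short delay for demonstration only
start = time.monotonic()
for _ in range(3):
    throttle.wait()  # a real client would issue its request after this
elapsed = time.monotonic() - start
print(f"3 calls took {elapsed:.2f}s")
```

The first call passes through immediately; only subsequent calls are delayed.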
Output Formats
The package supports multiple output formats:
- JSON: Structured data (default)
- TSV: Tab-separated values
- XML: XML format
- IMAGE: Network visualization images
- SVG: Scalable vector graphics
- PSI_MI: PSI-MI format
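For context, the STRING REST API selects the output format through a path segment, in the form `<base>/<format>/<method>?<parameters>`. The sketch below composes such a URL without sending a request. `build_string_url` is a hypothetical helper, and how the package maps its format names to these segments is an assumption.

```python
from urllib.parse import urlencode

BASE_URL = "https://string-db.org/api"


def build_string_url(output_format, method, identifiers, species):
    """Compose a STRING request URL: <base>/<format>/<method>?<params>."""
    params = urlencode({
        # STRING expects multiple identifiers separated by a carriage return.
        "identifiers": "\r".join(identifiers),
        "species": species,
    })
    return f"{BASE_URL}/{output_format}/{method}?{params}"


url = build_string_url("json", "network", ["TP53", "BRCA1"], 9606)
print(url)
```

Swapping `"json"` for `"tsv"` or `"svg"` changes only the path segment, which is why one method can serve several formats.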
Species Support
The package supports all species available in STRING. Common species IDs:
- Human: 9606
- Mouse: 10090
- Rat: 10116
- Yeast: 4932
- E. coli: 511145
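If you prefer names over numeric IDs in your own scripts, a small lookup table is enough. This is a convenience sketch, not something the package provides; `SPECIES_IDS` and `species_id` are hypothetical names, and the IDs are the ones listed above.

```python
# NCBI taxonomy IDs taken from the list above.
SPECIES_IDS = {
    "human": 9606,
    "mouse": 10090,
    "rat": 10116,
    "yeast": 4932,
    "e. coli": 511145,
}


def species_id(name):
    """Return the taxonomy ID for a common species name (case-insensitive)."""
    try:
        return SPECIES_IDS[name.strip().lower()]
    except KeyError:
        raise ValueError(f"Unknown species: {name!r}") from None


print(species_id("Human"))
```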
MCP Server Configuration
To use the MCP server with an MCP client, configure it as follows:
{
"mcpServers": {
"string-mcp": {
"command": "string-mcp-server",
"env": {}
}
}
}
The server will automatically handle:
- JSON-RPC communication
- Tool discovery and invocation
- Error handling and reporting
- Base64 encoding for image data
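Because image data arrives base64-encoded, a client has to decode it before saving or displaying it. The sketch below shows the decoding step with a stand-in payload (the PNG file signature). The exact shape of the server's response is not documented here, so treat the payload handling as an assumption.

```python
import base64


def decode_image(encoded):
    """Decode a base64 string (as used for image data) into raw bytes."""
    return base64.b64decode(encoded)


# Stand-in payload: the 8-byte PNG signature, encoded the way a server might send it.
png_signature = b"\x89PNG\r\n\x1a\n"
encoded = base64.b64encode(png_signature).decode("ascii")

image_bytes = decode_image(encoded)
print(image_bytes[:4])
```

The decoded bytes can then be written to a file opened in binary mode.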
Development
Setup Development Environment
# Install in development mode with dev dependencies
pip install -e ".[dev]"
# Format code
black stringmcp/
# Type checking
mypy stringmcp/
# Lint code
flake8 stringmcp/
Note: Test files are not currently included in this repository. To add tests, create a tests/ directory and add test files following the pytest configuration in pyproject.toml.
Project Structure
STRINGmcp/
├── pyproject.toml # Package configuration and dependencies
├── README.md # This file
├── LICENSE # MIT License
├── .gitignore # Git ignore patterns
├── stringmcp/ # Main package
│ ├── __init__.py # Package initialization
│ └── main.py # Core STRING API bridge and MCP server
└── string_mcp.egg-info/ # Package metadata (generated during install)
    ├── PKG-INFO # Package information
    ├── SOURCES.txt # Source files list
    ├── dependency_links.txt
    ├── entry_points.txt # CLI entry points
    ├── requires.txt # Dependencies
    └── top_level.txt # Top-level package names
License
MIT License - see LICENSE file for details.
Contributing
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests
- Run the test suite
- Submit a pull request
Support
For issues and questions, please use the GitHub issue tracker.
Example Usage
Complete DNA Repair Protein Analysis
This example demonstrates the comprehensive functionality of the STRING-DB MCP bridge by analyzing a set of well-known human DNA repair proteins: TP53, BRCA1, BRCA2, ATM, and ATR.
2. Protein Identifier Mapping
Map gene symbols to STRING identifiers:
[
{
"queryIndex": 0,
"queryItem": "TP53",
"stringId": "9606.ENSP00000269305",
"ncbiTaxonId": 9606,
"taxonName": "Homo sapiens",
"preferredName": "TP53",
"annotation": "Cellular tumor antigen p53; Acts as a tumor suppressor in many tumor types; induces growth arrest or apoptosis depending on the physiological circumstances and cell type..."
},
{
"queryIndex": 1,
"queryItem": "BRCA1",
"stringId": "9606.ENSP00000418960",
"ncbiTaxonId": 9606,
"taxonName": "Homo sapiens",
"preferredName": "BRCA1",
"annotation": "Breast cancer type 1 susceptibility protein; E3 ubiquitin-protein ligase that specifically mediates the formation of 'Lys-6'-linked polyubiquitin chains..."
},
{
"queryIndex": 2,
"queryItem": "BRCA2",
"stringId": "9606.ENSP00000369497",
"ncbiTaxonId": 9606,
"taxonName": "Homo sapiens",
"preferredName": "BRCA2",
"annotation": "Breast cancer type 2 susceptibility protein; Involved in double-strand break repair and/or homologous recombination..."
},
{
"queryIndex": 3,
"queryItem": "ATM",
"stringId": "9606.ENSP00000278616",
"ncbiTaxonId": 9606,
"taxonName": "Homo sapiens",
"preferredName": "ATM",
"annotation": "Serine-protein kinase ATM; Serine/threonine protein kinase which activates checkpoint signaling upon double strand breaks..."
},
{
"queryIndex": 4,
"queryItem": "ATR",
"stringId": "9606.ENSP00000343741",
"ncbiTaxonId": 9606,
"taxonName": "Homo sapiens",
"preferredName": "ATR",
"annotation": "Serine/threonine-protein kinase ATR; Serine/threonine protein kinase which activates checkpoint signaling upon genotoxic stresses..."
}
]
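A mapping result in this shape is easy to turn into a symbol-to-ID lookup for later calls. The sketch below uses abbreviated copies of the records above (annotations omitted); `to_lookup` and `find_unmapped` are hypothetical helpers, not package functions.

```python
# Abbreviated records in the shape shown above (annotation fields omitted).
mapped = [
    {"queryItem": "TP53", "stringId": "9606.ENSP00000269305"},
    {"queryItem": "BRCA1", "stringId": "9606.ENSP00000418960"},
    {"queryItem": "ATM", "stringId": "9606.ENSP00000278616"},
]


def to_lookup(records):
    """Map each queried symbol to its STRING identifier."""
    return {record["queryItem"]: record["stringId"] for record in records}


def find_unmapped(queries, records):
    """Return the queried symbols that have no record in the result."""
    found = {record["queryItem"] for record in records}
    return [query for query in queries if query not in found]


lookup = to_lookup(mapped)
print(lookup["TP53"])
print(find_unmapped(["TP53", "BRCA1", "ATM", "NOT_A_GENE"], mapped))
```

Checking for unmapped symbols early avoids silently dropping proteins from downstream analysis.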
3. Protein-Protein Interaction Network
Examine network interactions between these proteins:
[
{
"stringId_A": "9606.ENSP00000269305",
"stringId_B": "9606.ENSP00000369497",
"preferredName_A": "TP53",
"preferredName_B": "BRCA2",
"score": 0.995
},
{
"stringId_A": "9606.ENSP00000269305",
"stringId_B": "9606.ENSP00000343741",
"preferredName_A": "TP53",
"preferredName_B": "ATR",
"score": 0.996
},
{
"stringId_A": "9606.ENSP00000269305",
"stringId_B": "9606.ENSP00000278616",
"preferredName_A": "TP53",
"preferredName_B": "ATM",
"score": 0.999
},
{
"stringId_A": "9606.ENSP00000269305",
"stringId_B": "9606.ENSP00000418960",
"preferredName_A": "TP53",
"preferredName_B": "BRCA1",
"score": 0.999
},
{
"stringId_A": "9606.ENSP00000278616",
"stringId_B": "9606.ENSP00000369497",
"preferredName_A": "ATM",
"preferredName_B": "BRCA2",
"score": 0.995
},
{
"stringId_A": "9606.ENSP00000278616",
"stringId_B": "9606.ENSP00000418960",
"preferredName_A": "ATM",
"preferredName_B": "BRCA1",
"score": 0.999
},
{
"stringId_A": "9606.ENSP00000278616",
"stringId_B": "9606.ENSP00000343741",
"preferredName_A": "ATM",
"preferredName_B": "ATR",
"score": 0.999
},
{
"stringId_A": "9606.ENSP00000343741",
"stringId_B": "9606.ENSP00000369497",
"preferredName_A": "ATR",
"preferredName_B": "BRCA2",
"score": 0.831
},
{
"stringId_A": "9606.ENSP00000343741",
"stringId_B": "9606.ENSP00000418960",
"preferredName_A": "ATR",
"preferredName_B": "BRCA1",
"score": 0.996
},
{
"stringId_A": "9606.ENSP00000369497",
"stringId_B": "9606.ENSP00000418960",
"preferredName_A": "BRCA2",
"preferredName_B": "BRCA1",
"score": 0.999
}
]
Key Findings: All interactions show very high confidence scores (>0.8), with most exceeding 0.99, indicating these proteins form a tightly interconnected functional module.
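The findings above can be checked directly from the edge list. The sketch below copies the ten interactions as `(protein_a, protein_b, score)` tuples, then counts edges above a threshold and the degree of each node; `filter_edges` and `node_degrees` are hypothetical helpers.

```python
from collections import Counter

# The ten interactions listed above, as (protein_a, protein_b, score).
edges = [
    ("TP53", "BRCA2", 0.995), ("TP53", "ATR", 0.996),
    ("TP53", "ATM", 0.999), ("TP53", "BRCA1", 0.999),
    ("ATM", "BRCA2", 0.995), ("ATM", "BRCA1", 0.999),
    ("ATM", "ATR", 0.999), ("ATR", "BRCA2", 0.831),
    ("ATR", "BRCA1", 0.996), ("BRCA2", "BRCA1", 0.999),
]


def filter_edges(edge_list, min_score):
    """Keep only edges whose score is at least `min_score`."""
    return [edge for edge in edge_list if edge[2] >= min_score]


def node_degrees(edge_list):
    """Count how many edges touch each protein."""
    degrees = Counter()
    for protein_a, protein_b, _score in edge_list:
        degrees[protein_a] += 1
        degrees[protein_b] += 1
    return dict(degrees)


print(len(filter_edges(edges, 0.99)), "of", len(edges), "edges score >= 0.99")
print(node_degrees(edges))
```

Only the ATR-BRCA2 edge (0.831) falls below 0.99, and every protein is connected to all four others.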
4. Network Statistics
Check if this network is significantly enriched for interactions:
{
"number_of_nodes": 5,
"number_of_edges": 10,
"average_node_degree": 4.0,
"local_clustering_coefficient": 1.0,
"expected_number_of_edges": 5,
"p_value": 0.0122
}
Statistical Significance: The network shows perfect clustering (coefficient = 1.0) and is significantly enriched for interactions (p = 0.0122), with twice as many edges as expected by chance.
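The reported statistics are internally consistent, which a few lines of arithmetic confirm. The sketch below recomputes the average degree (2E/N), checks that the network is a complete graph (which implies a clustering coefficient of 1.0), and computes the observed-to-expected edge ratio. The variable names are illustrative only.

```python
nodes = 5
observed_edges = 10
expected_edges = 5

# Each edge contributes to the degree of two nodes.
average_degree = 2 * observed_edges / nodes

# A graph on N nodes has at most N * (N - 1) / 2 edges.
max_edges = nodes * (nodes - 1) // 2
is_complete = observed_edges == max_edges

enrichment_ratio = observed_edges / expected_edges

print(average_degree, is_complete, enrichment_ratio)
```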
5. Functional Enrichment Analysis
Analyze which biological pathways are enriched in this protein set:
Top DNA Repair Pathways (Selected Results):
[
{
"category": "Process",
"term": "GO:0071479",
"number_of_genes": 5,
"preferredNames": ["TP53", "ATM", "ATR", "BRCA2", "BRCA1"],
"p_value": 9.72e-13,
"fdr": 1.52e-08,
"description": "Cellular response to ionizing radiation"
},
{
"category": "Process",
"term": "GO:0042770",
"number_of_genes": 5,
"preferredNames": ["TP53", "ATM", "ATR", "BRCA2", "BRCA1"],
"p_value": 1.69e-11,
"fdr": 1.32e-07,
"description": "Signal transduction in response to DNA damage"
},
{
"category": "Process",
"term": "GO:0006281",
"number_of_genes": 5,
"preferredNames": ["TP53", "ATM", "ATR", "BRCA2", "BRCA1"],
"p_value": 1.05e-08,
"fdr": 1.10e-05,
"description": "DNA repair"
},
{
"category": "KEGG",
"term": "hsa03440",
"number_of_genes": 3,
"preferredNames": ["ATM", "BRCA2", "BRCA1"],
"p_value": 8.34e-08,
"fdr": 2.80e-05,
"description": "Homologous recombination"
},
{
"category": "KEGG",
"term": "hsa04115",
"number_of_genes": 3,
"preferredNames": ["TP53", "ATM", "ATR"],
"p_value": 5.27e-07,
"fdr": 5.44e-05,
"description": "p53 signaling pathway"
}
]
Disease Associations:
[
{
"category": "DISEASES",
"term": "DOID:1612",
"number_of_genes": 4,
"preferredNames": ["TP53", "ATM", "BRCA2", "BRCA1"],
"p_value": 5.72e-10,
"fdr": 2.02e-06,
"description": "Breast cancer"
},
{
"category": "DISEASES",
"term": "DOID:3012",
"number_of_genes": 3,
"preferredNames": ["TP53", "BRCA2", "BRCA1"],
"p_value": 6.59e-10,
"fdr": 2.02e-06,
"description": "Li-Fraumeni syndrome"
}
]
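Enrichment results usually need filtering before they are readable. The sketch below works on abbreviated copies of the records above and selects terms by category and FDR cutoff, sorted by FDR. `select_terms` is a hypothetical helper, not a package function.

```python
# Abbreviated copies of the enrichment records shown above.
enrichment = [
    {"category": "Process", "term": "GO:0071479", "fdr": 1.52e-08},
    {"category": "Process", "term": "GO:0042770", "fdr": 1.32e-07},
    {"category": "Process", "term": "GO:0006281", "fdr": 1.10e-05},
    {"category": "KEGG", "term": "hsa03440", "fdr": 2.80e-05},
    {"category": "KEGG", "term": "hsa04115", "fdr": 5.44e-05},
    {"category": "DISEASES", "term": "DOID:1612", "fdr": 2.02e-06},
    {"category": "DISEASES", "term": "DOID:3012", "fdr": 2.02e-06},
]


def select_terms(records, category=None, max_fdr=0.05):
    """Return term IDs passing the filters, most significant first."""
    kept = [
        record for record in records
        if (category is None or record["category"] == category)
        and record["fdr"] <= max_fdr
    ]
    return [record["term"] for record in sorted(kept, key=lambda r: r["fdr"])]


print(select_terms(enrichment, category="KEGG"))
print(select_terms(enrichment, max_fdr=1e-6))
```

Filtering on FDR rather than the raw p-value accounts for the many terms tested at once.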
6. Network Visualization
The package can generate protein interaction network visualizations showing evidence-based functional associations.
Example Network Visualization: View Protein Interaction Network
This visualization shows the protein-protein interaction network for TP53, BRCA1, BRCA2, ATM, and ATR, restricted to interactions with a combined score of at least 400 (STRING's medium-confidence cutoff).
7. Functional Enrichment Visualization
The package can also create enrichment scatter plots showing the most significantly enriched biological processes.
Example Enrichment Visualization: View Functional Enrichment Plot
This visualization displays the top 10 most significantly enriched biological processes and pathways for the DNA repair protein set, showing p-values and gene counts for each enriched term.
Summary
This comprehensive analysis demonstrates that the STRING-DB MCP bridge successfully:
- Identified all 5 DNA repair proteins with detailed annotations
- Discovered 10 high-confidence protein interactions (all >0.8 score)
- Revealed significant pathway enrichments, with p-values as low as 9.72e-13
- Confirmed statistical significance of the network (p = 0.0122)
- Generated both network and enrichment visualizations
The results validate these proteins as a core DNA damage response module, with exceptionally strong enrichment for:
- Cellular response to ionizing radiation (FDR = 1.52e-8)
- DNA damage signaling (FDR = 1.32e-7)
- Homologous recombination (FDR = 2.8e-5)
- p53 signaling pathway (FDR = 5.44e-5)
- Breast cancer associations (FDR = 2.02e-6)
This showcases the complete functionality of the STRING-DB MCP bridge for protein interaction network analysis and functional annotation.