Bio-MCP FastQC Server
Provides quality control for biological sequence data using the FastQC and MultiQC tools.
Quality Control Analysis via Model Context Protocol
An MCP server that enables AI assistants to run FastQC and MultiQC quality control analysis on sequencing data. Part of the Bio-MCP ecosystem.
Purpose
FastQC is essential for quality assessment of high-throughput sequencing data. This MCP server allows AI assistants to:
- Analyze single files - Get detailed QC reports for individual FASTQ/FASTA files
- Batch process - Run QC on multiple files simultaneously
- Generate summary reports - Create MultiQC reports combining multiple analyses
- Handle large datasets - Queue system support for computationally intensive jobs
Quick Start
Prerequisites
Install FastQC and MultiQC:
# Via conda (recommended)
conda install -c bioconda fastqc multiqc
# Via package managers
# Ubuntu/Debian
sudo apt-get install fastqc
pip install multiqc
# macOS
brew install fastqc
pip install multiqc
Installation
# Clone and install
git clone https://github.com/bio-mcp/bio-mcp-fastqc.git
cd bio-mcp-fastqc
pip install -e .
# Or install directly
pip install git+https://github.com/bio-mcp/bio-mcp-fastqc.git
Claude Desktop Configuration
Add to your claude_desktop_config.json:
{
  "mcpServers": {
    "bio-fastqc": {
      "command": "python",
      "args": ["-m", "src.server"],
      "cwd": "/path/to/bio-mcp-fastqc"
    }
  }
}
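If the config file already lists other servers, editing it by hand risks clobbering them. The sketch below merges the entry shown above into an existing file instead. The helper name `add_fastqc_server` is hypothetical and not part of this project; the location of `claude_desktop_config.json` depends on your platform and is passed in by the caller.

```python
import json
from pathlib import Path


def add_fastqc_server(config_path: Path, repo_dir: str) -> dict:
    """Merge the bio-fastqc entry into a Claude Desktop config file.

    Hypothetical helper: existing servers are kept, and the file is
    created if it does not exist yet.
    """
    config = {}
    if config_path.exists():
        text = config_path.read_text(encoding="utf-8")
        if text.strip():
            config = json.loads(text)

    # Same entry as in the configuration snippet above.
    servers = config.setdefault("mcpServers", {})
    servers["bio-fastqc"] = {
        "command": "python",
        "args": ["-m", "src.server"],
        "cwd": repo_dir,
    }

    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Restart Claude Desktop after changing the file so the new server entry is picked up.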
Available Tools
Core Analysis Tools
fastqc_single
Run FastQC on a single FASTQ/FASTA file.
Parameters:
- input_file (required): Path to FASTQ or FASTA file
- threads (optional): Number of threads (default: 1)
- contaminants (optional): Path to custom contaminants file
- adapters (optional): Path to custom adapters file
- limits (optional): Path to custom limits file
Example:
User: "Run quality control on my_sample.fastq.gz"
AI: [calls fastqc_single] → Returns detailed QC report with pass/warn/fail status for each module
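To make the parameter mapping concrete, here is a minimal sketch of how the fastqc_single parameters could translate into a FastQC command line. The function `build_fastqc_command` and its `output_dir` argument are assumptions for illustration, not the server's actual code; the flags themselves (`--outdir`, `--threads`, `--contaminants`, `--adapters`, `--limits`) are standard FastQC options.

```python
from typing import List, Optional


def build_fastqc_command(
    input_file: str,
    output_dir: str,
    threads: int = 1,
    contaminants: Optional[str] = None,
    adapters: Optional[str] = None,
    limits: Optional[str] = None,
    fastqc_path: str = "fastqc",
) -> List[str]:
    """Build an argument list suitable for subprocess.run (hypothetical helper)."""
    cmd = [fastqc_path, "--outdir", output_dir, "--threads", str(threads)]

    # Optional files are only passed when the caller supplied them.
    optional_flags = (
        ("--contaminants", contaminants),
        ("--adapters", adapters),
        ("--limits", limits),
    )
    for flag, value in optional_flags:
        if value:
            cmd += [flag, value]

    cmd.append(input_file)
    return cmd
```

Passing the result as a list to `subprocess.run` avoids shell quoting problems with file names that contain spaces.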
fastqc_batch
Run FastQC on multiple files in a directory.
Parameters:
- input_dir (required): Directory containing FASTQ/FASTA files
- file_pattern (optional): File pattern to match (default: ".fastq")
- threads (optional): Number of threads (default: 4)
Example:
User: "Analyze all fastq files in the data/ directory"
AI: [calls fastqc_batch] → Processes all files and returns summary statistics
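How file_pattern is matched is not spelled out here. The sketch below assumes the simplest reading, a substring match on the file name, so the default ".fastq" would pick up both `.fastq` and `.fastq.gz` files. The function `find_input_files` is hypothetical; the server may well use glob patterns instead.

```python
from pathlib import Path
from typing import List


def find_input_files(input_dir: str, file_pattern: str = ".fastq") -> List[Path]:
    """Return files in input_dir whose name contains file_pattern, sorted.

    Assumption: substring matching, non-recursive.
    """
    return sorted(
        path
        for path in Path(input_dir).iterdir()
        if path.is_file() and file_pattern in path.name
    )
```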
multiqc_report
Generate MultiQC report from FastQC results.
Parameters:
- input_dir (required): Directory containing FastQC and other analysis results
- title (optional): Custom title for the report
- comment (optional): Comment to add to the report
- template (optional): Report template (default, simple, sections, gathered)
Example:
User: "Create a summary report from all the QC results"
AI: [calls multiqc_report] → Generates interactive HTML report combining all analyses
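The multiqc_report parameters map naturally onto MultiQC's own command-line options. As a rough sketch (the function `build_multiqc_command` and the `output_dir` argument are assumptions, not the server's code), the call could be assembled like this using the standard `--title`, `--comment`, `--template`, and `--outdir` flags:

```python
from typing import List, Optional


def build_multiqc_command(
    input_dir: str,
    output_dir: str,
    title: Optional[str] = None,
    comment: Optional[str] = None,
    template: Optional[str] = None,
    multiqc_path: str = "multiqc",
) -> List[str]:
    """Build a MultiQC argument list for subprocess.run (hypothetical helper)."""
    cmd = [multiqc_path, input_dir, "--outdir", output_dir]
    if title:
        cmd += ["--title", title]
    if comment:
        cmd += ["--comment", comment]
    if template:
        # Which templates exist depends on the installed MultiQC version.
        cmd += ["--template", template]
    return cmd
```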
Queue System Tools (when queue enabled)
For large datasets or batch processing:
- fastqc_single_async - Queue single file analysis
- fastqc_batch_async - Queue batch analysis
- multiqc_report_async - Queue report generation
- get_job_status - Check job progress
- get_job_result - Retrieve completed results
- cancel_job - Cancel running jobs
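A typical client pattern for these tools is to submit a job, poll get_job_status until it finishes, and then call get_job_result. The sketch below shows only the polling step. The `get_status` callable stands in for however your client invokes get_job_status, and the `"state"` key with the values `"completed"`, `"failed"`, and `"cancelled"` is an assumed response shape, not a documented one.

```python
import time
from typing import Callable, Dict


def wait_for_job(
    get_status: Callable[[str], Dict],
    job_id: str,
    poll_seconds: float = 5.0,
    timeout_seconds: float = 1800.0,
) -> Dict:
    """Poll until the job reaches a terminal state or the timeout expires.

    Assumption: the status payload is a dict with a "state" field.
    """
    terminal_states = ("completed", "failed", "cancelled")
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status(job_id)
        if status.get("state") in terminal_states:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} did not finish in {timeout_seconds} s")
        time.sleep(poll_seconds)
```

Injecting `get_status` keeps the loop independent of the transport, so the same code works against a real queue client or a stub in tests.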
Usage Examples
Basic Quality Assessment
User: "What's the quality of my sequencing data in sample1.fastq.gz?"
AI Response:
✅ FastQC Analysis Complete for sample1
Module Status Summary:
✅ Basic Statistics: PASS
✅ Per base sequence quality: PASS
⚠️ Per sequence quality scores: WARN
✅ Per base sequence content: PASS
✅ Per sequence GC content: PASS
✅ Per base N content: PASS
✅ Sequence Length Distribution: PASS
❌ Sequence Duplication Levels: FAIL
✅ Overrepresented sequences: PASS
✅ Adapter Content: PASS
Basic Statistics:
• Total Sequences: 2,450,000
• Sequences flagged as poor quality: 0
• Sequence length: 150
• %GC: 47
Batch Analysis
User: "Analyze all FASTQ files in my project/raw_data/ folder"
AI Response:
FastQC Batch Analysis Complete
========================================
Processed 12 files:
✅ sample01_R1.fastq.gz: 8P/2W/0F
✅ sample01_R2.fastq.gz: 8P/2W/0F
⚠️ sample02_R1.fastq.gz: 7P/2W/1F
⚠️ sample02_R2.fastq.gz: 7P/2W/1F
...
Overall Summary:
• Total PASS: 94
• Total WARN: 24
• Total FAIL: 2
Tip: Run multiqc_report on this directory to generate a combined report!
Complete Workflow
User: "I have a directory of paired-end FASTQ files. Can you run quality control and create a summary report?"
AI: I'll run a complete QC workflow on your paired-end data:
1. First, let me analyze all FASTQ files in batch:
[runs fastqc_batch on directory]
2. Now I'll generate a MultiQC summary report:
[runs multiqc_report on results]
✅ Complete QC workflow finished!
Summary:
- 24 FASTQ files processed (12 samples, paired-end)
- Average quality score: 32.5
- 2 samples have adapter contamination warnings
- 1 sample shows high duplication levels
- Interactive HTML report generated: multiqc_report.html
The MultiQC report provides detailed visualizations of:
- Quality score distributions across all samples
- GC content comparison
- Sequence length distributions
- Adapter content analysis
- Sample correlation analysis
Docker Usage
Build and Run
# Build the image
docker build -t bio-mcp-fastqc .
# Run with data mounting
docker run -v /path/to/data:/data bio-mcp-fastqc
Docker Compose (with Queue System)
services:
  fastqc-server:
    build: .
    volumes:
      - ./data:/data
    environment:
      - BIO_MCP_QUEUE_URL=http://queue-api:8000
    depends_on:
      - queue-api
Configuration
Environment Variables
- BIO_MCP_FASTQC_PATH - Path to FastQC executable (default: "fastqc")
- BIO_MCP_MULTIQC_PATH - Path to MultiQC executable (default: "multiqc")
- BIO_MCP_MAX_FILE_SIZE - Maximum file size in bytes (default: 10GB)
- BIO_MCP_TIMEOUT - Command timeout in seconds (default: 1800)
- BIO_MCP_TEMP_DIR - Temporary directory for processing
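For reference, a settings loader for these variables might look like the sketch below. The `FastQCSettings` class and `load_settings` function are hypothetical names. Two details are assumptions: "10GB" is interpreted as 10 * 1024**3 bytes, and the temporary directory falls back to the system default because no default is documented.

```python
import os
import tempfile
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class FastQCSettings:
    fastqc_path: str
    multiqc_path: str
    max_file_size: int  # bytes
    timeout: int  # seconds
    temp_dir: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> FastQCSettings:
    """Read the BIO_MCP_* variables, applying the documented defaults."""
    env = os.environ if env is None else env
    return FastQCSettings(
        fastqc_path=env.get("BIO_MCP_FASTQC_PATH", "fastqc"),
        multiqc_path=env.get("BIO_MCP_MULTIQC_PATH", "multiqc"),
        # Assumption: 10GB means 10 GiB.
        max_file_size=int(env.get("BIO_MCP_MAX_FILE_SIZE", 10 * 1024**3)),
        timeout=int(env.get("BIO_MCP_TIMEOUT", 1800)),
        # Assumption: fall back to the system temp directory.
        temp_dir=env.get("BIO_MCP_TEMP_DIR", tempfile.gettempdir()),
    )
```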
Queue System Integration
To enable async processing for large datasets:
from src.server_with_queue import FastQCServerWithQueue
server = FastQCServerWithQueue(queue_url="http://localhost:8000")
Output Files
FastQC generates several output files:
- HTML Report (*_fastqc.html) - Interactive quality report
- Data File (fastqc_data.txt) - Raw metrics and statistics
- Summary File (summary.txt) - Pass/warn/fail status for each module
- Plots - Various quality plots and charts
MultiQC combines these into:
- MultiQC Report (multiqc_report.html) - Combined interactive report
- Data Directory (multiqc_data/) - Processed data and statistics
- General Stats (multiqc_general_stats.txt) - Summary table
Quality Metrics Explained
FastQC analyzes multiple quality aspects:
Key Modules
- Per base sequence quality - Quality scores across read positions
- Per sequence quality scores - Distribution of mean quality scores
- Per base sequence content - A/T/G/C content across positions
- Per sequence GC content - GC% distribution vs expected
- Sequence duplication levels - PCR duplication assessment
- Adapter content - Contaminating adapter sequences
Status Interpretation
- ✅ PASS - Analysis indicates no problems
- ⚠️ WARN - Slightly unusual, may not be problematic
- ❌ FAIL - Likely problematic, requires attention
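These statuses can be read directly from FastQC's summary.txt, which holds one tab-separated line per module in the form status, module name, file name. The sketch below parses that file and builds a compact tally in the style of 8P/2W/0F. The function names `parse_summary` and `tally` are illustrative and not part of the server.

```python
from collections import Counter
from typing import Dict


def parse_summary(text: str) -> Dict[str, str]:
    """Map each FastQC module name to its status (PASS, WARN, or FAIL)."""
    statuses = {}
    for line in text.splitlines():
        parts = line.split("\t")
        if len(parts) >= 2:
            status, module = parts[0], parts[1]
            statuses[module] = status
    return statuses


def tally(statuses: Dict[str, str]) -> str:
    """Summarize module statuses as a compact pass/warn/fail label."""
    counts = Counter(statuses.values())
    return f"{counts['PASS']}P/{counts['WARN']}W/{counts['FAIL']}F"
```

A FAIL is not always a problem with the data: some library types, such as amplicon or RNA-seq libraries, routinely fail the duplication module, so treat the tally as a prompt to look at the report rather than a verdict.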
Integration with Bio-MCP Ecosystem
The FastQC server can be chained with other Bio-MCP tools, for example to compare quality before and after trimming:
User: "Run the complete preprocessing pipeline on my samples"
AI Workflow:
1. fastqc_batch → Initial quality assessment
2. trimmomatic → Trim low-quality bases and adapters
3. fastqc_batch → Post-trimming QC
4. multiqc_report → Combined before/after report
Contributing
We welcome contributions! See the Bio-MCP contributing guide.
Development Setup
git clone https://github.com/bio-mcp/bio-mcp-fastqc.git
cd bio-mcp-fastqc
pip install -e ".[dev]"
pytest
License
MIT License - see LICENSE file.
Acknowledgments
- FastQC by Simon Andrews at Babraham Bioinformatics
- MultiQC by Phil Ewels and the MultiQC community
- Bio-MCP project and contributors
Part of the Bio-MCP ecosystem - Making bioinformatics accessible to AI assistants.
For more tools: Bio-MCP Organization