Bio-MCP FastQC Server
Provides quality control for biological sequence data using the FastQC and MultiQC tools.
Quality Control Analysis via Model Context Protocol
An MCP server that enables AI assistants to run FastQC and MultiQC quality control analysis on sequencing data. Part of the Bio-MCP ecosystem.
Purpose
FastQC is essential for quality assessment of high-throughput sequencing data. This MCP server allows AI assistants to:
- Analyze single files - Get detailed QC reports for individual FASTQ/FASTA files
- Batch process - Run QC on multiple files simultaneously
- Generate summary reports - Create MultiQC reports combining multiple analyses
- Handle large datasets - Queue system support for computationally intensive jobs
Quick Start
Prerequisites
Install FastQC and MultiQC:
# Via conda (recommended)
conda install -c bioconda fastqc multiqc
# Via package managers
# Ubuntu/Debian
sudo apt-get install fastqc
pip install multiqc
# macOS
brew install fastqc
pip install multiqc
Installation
# Clone and install
git clone https://github.com/bio-mcp/bio-mcp-fastqc.git
cd bio-mcp-fastqc
pip install -e .
# Or install directly
pip install git+https://github.com/bio-mcp/bio-mcp-fastqc.git
Claude Desktop Configuration
Add to your claude_desktop_config.json:
{
"mcpServers": {
"bio-fastqc": {
"command": "python",
"args": ["-m", "src.server"],
"cwd": "/path/to/bio-mcp-fastqc"
}
}
}
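If the config file already lists other servers, the entry above has to be merged in rather than pasted over them. The following is a minimal sketch of that merge; the helper name `add_fastqc_server` and its arguments are illustrative and not part of this project, and the location of claude_desktop_config.json on your system is left to the caller.

```python
import json
from pathlib import Path


def add_fastqc_server(config_path: str, repo_dir: str) -> dict:
    """Add the bio-fastqc entry to a Claude Desktop config, keeping other servers.

    Hypothetical helper for illustration; not shipped with bio-mcp-fastqc.
    """
    path = Path(config_path)
    # Start from the existing config if the file exists, otherwise from an empty one.
    config = json.loads(path.read_text()) if path.exists() else {}
    servers = config.setdefault("mcpServers", {})
    # Same entry as shown in the README snippet above.
    servers["bio-fastqc"] = {
        "command": "python",
        "args": ["-m", "src.server"],
        "cwd": repo_dir,
    }
    path.write_text(json.dumps(config, indent=2))
    return config
```

Calling `add_fastqc_server("claude_desktop_config.json", "/path/to/bio-mcp-fastqc")` rewrites the file in place, so keeping a backup of the original is a sensible precaution.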
Available Tools
Core Analysis Tools
fastqc_single
Run FastQC on a single FASTQ/FASTA file.
Parameters:
- input_file (required): Path to FASTQ or FASTA file
- threads (optional): Number of threads (default: 1)
- contaminants (optional): Path to custom contaminants file
- adapters (optional): Path to custom adapters file
- limits (optional): Path to custom limits file
Example:
User: "Run quality control on my_sample.fastq.gz"
AI: [calls fastqc_single] → Returns detailed QC report with pass/warn/fail status for each module
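To make the parameters concrete, here is a rough sketch of how they could map onto a FastQC command line. It uses FastQC's own `--outdir`, `--threads`, `--contaminants`, `--adapters` and `--limits` options; the function name `build_fastqc_command` and the `output_dir` argument are assumptions for illustration, and the server's actual implementation may differ.

```python
from typing import List, Optional


def build_fastqc_command(
    input_file: str,
    output_dir: str,  # assumption: the tool parameters above do not name an output directory
    threads: int = 1,
    contaminants: Optional[str] = None,
    adapters: Optional[str] = None,
    limits: Optional[str] = None,
    fastqc_path: str = "fastqc",
) -> List[str]:
    """Translate fastqc_single-style parameters into an argv list (illustrative)."""
    cmd = [fastqc_path, "--outdir", output_dir, "--threads", str(threads)]
    # Optional files are passed only when the caller supplied them.
    optional_flags = (
        ("--contaminants", contaminants),
        ("--adapters", adapters),
        ("--limits", limits),
    )
    for flag, value in optional_flags:
        if value is not None:
            cmd += [flag, value]
    cmd.append(input_file)
    return cmd
```

The resulting list can be handed to `subprocess.run(cmd, check=True)`. Building a list instead of a shell string avoids quoting problems with file names that contain spaces.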
fastqc_batch
Run FastQC on multiple files in a directory.
Parameters:
- input_dir (required): Directory containing FASTQ/FASTA files
- file_pattern (optional): File pattern to match (default: ".fastq")
- threads (optional): Number of threads (default: 4)
Example:
User: "Analyze all fastq files in the data/ directory"
AI: [calls fastqc_batch] → Processes all files and returns summary statistics
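File selection for a batch run can be pictured as a pattern match over the directory listing. The sketch below assumes glob-style matching and uses `*.fastq*` as its own default so that both `sample.fastq` and `sample.fastq.gz` are picked up; how the server interprets its documented default is not confirmed here, and `find_input_files` is a hypothetical name.

```python
import fnmatch
from pathlib import Path
from typing import List


def find_input_files(input_dir: str, file_pattern: str = "*.fastq*") -> List[Path]:
    """Return the files in input_dir whose names match a glob-style pattern.

    Illustrative only: the default pattern is an assumption of this sketch.
    """
    return sorted(
        path
        for path in Path(input_dir).iterdir()
        if path.is_file() and fnmatch.fnmatch(path.name, file_pattern)
    )
```

Sorting the result keeps paired files such as `sample01_R1` and `sample01_R2` next to each other in the output.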
multiqc_report
Generate MultiQC report from FastQC results.
Parameters:
- input_dir (required): Directory containing FastQC and other analysis results
- title (optional): Custom title for the report
- comment (optional): Comment to add to the report
- template (optional): Report template (default, simple, sections, gathered)
Example:
User: "Create a summary report from all the QC results"
AI: [calls multiqc_report] → Generates interactive HTML report combining all analyses
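The report parameters correspond to MultiQC's `--title`, `--comment`, `--template` and `--outdir` options. A possible mapping, with the template checked against the four names listed above, might look like the following; `build_multiqc_command` and `output_dir` are illustrative names and not taken from the server code.

```python
from typing import List, Optional

# Template names as listed in this README.
KNOWN_TEMPLATES = ("default", "simple", "sections", "gathered")


def build_multiqc_command(
    input_dir: str,
    output_dir: str,  # assumption: not one of the documented tool parameters
    title: Optional[str] = None,
    comment: Optional[str] = None,
    template: str = "default",
    multiqc_path: str = "multiqc",
) -> List[str]:
    """Translate multiqc_report-style parameters into an argv list (illustrative)."""
    if template not in KNOWN_TEMPLATES:
        raise ValueError(f"unknown template {template!r}; expected one of {KNOWN_TEMPLATES}")
    cmd = [multiqc_path, input_dir, "--outdir", output_dir, "--template", template]
    if title:
        cmd += ["--title", title]
    if comment:
        cmd += ["--comment", comment]
    return cmd
```

Rejecting an unknown template before launching the process gives a clearer error than a failed MultiQC run would.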
Queue System Tools (when queue enabled)
For large datasets or batch processing:
- fastqc_single_async - Queue single file analysis
- fastqc_batch_async - Queue batch analysis
- multiqc_report_async - Queue report generation
- get_job_status - Check job progress
- get_job_result - Retrieve completed results
- cancel_job - Cancel running jobs
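A client using the async tools would typically submit a job and then poll get_job_status until the job settles. The loop below sketches that pattern under stated assumptions: the status payload shape (`{"state": ...}`) and the state names are hypothetical, and `get_status` stands in for whatever call reaches the get_job_status tool.

```python
import time
from typing import Callable, Dict

# Hypothetical terminal states; the real queue may use different names.
TERMINAL_STATES = {"completed", "failed", "cancelled"}


def wait_for_job(
    job_id: str,
    get_status: Callable[[str], Dict],
    poll_seconds: float = 5.0,
    timeout_seconds: float = 1800.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict:
    """Poll get_status(job_id) until the job reaches a terminal state or times out."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status(job_id)
        if status.get("state") in TERMINAL_STATES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} did not finish within {timeout_seconds} seconds")
        sleep(poll_seconds)
```

Passing `sleep` as a parameter keeps the loop easy to test without real waiting.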
Usage Examples
Basic Quality Assessment
User: "What's the quality of my sequencing data in sample1.fastq.gz?"
AI Response:
✅ FastQC Analysis Complete for sample1
Module Status Summary:
✅ Basic Statistics: PASS
✅ Per base sequence quality: PASS
⚠️ Per sequence quality scores: WARN
✅ Per base sequence content: PASS
✅ Per sequence GC content: PASS
✅ Per base N content: PASS
✅ Sequence Length Distribution: PASS
❌ Sequence Duplication Levels: FAIL
✅ Overrepresented sequences: PASS
✅ Adapter Content: PASS
Basic Statistics:
• Total Sequences: 2,450,000
• Sequences flagged as poor quality: 0
• Sequence length: 150
• %GC: 47
Batch Analysis
User: "Analyze all FASTQ files in my project/raw_data/ folder"
AI Response:
FastQC Batch Analysis Complete
========================================
Processed 12 files:
✅ sample01_R1.fastq.gz: 8P/2W/0F
✅ sample01_R2.fastq.gz: 8P/2W/0F
⚠️ sample02_R1.fastq.gz: 7P/2W/1F
⚠️ sample02_R2.fastq.gz: 7P/2W/1F
...
Overall Summary:
• Total PASS: 94
• Total WARN: 24
• Total FAIL: 2
Tip: Run multiqc_report on this directory to generate a combined report!
Complete Workflow
User: "I have a directory of paired-end FASTQ files. Can you run quality control and create a summary report?"
AI: I'll run a complete QC workflow on your paired-end data:
1. First, let me analyze all FASTQ files in batch:
[runs fastqc_batch on directory]
2. Now I'll generate a MultiQC summary report:
[runs multiqc_report on results]
✅ Complete QC workflow finished!
Summary:
- 24 FASTQ files processed (12 samples, paired-end)
- Average quality score: 32.5
- 2 samples have adapter contamination warnings
- 1 sample shows high duplication levels
- Interactive HTML report generated: multiqc_report.html
The MultiQC report provides detailed visualizations of:
- Quality score distributions across all samples
- GC content comparison
- Sequence length distributions
- Adapter content analysis
- Sample correlation analysis
Docker Usage
Build and Run
# Build the image
docker build -t bio-mcp-fastqc .
# Run with data mounting
docker run -v /path/to/data:/data bio-mcp-fastqc
Docker Compose (with Queue System)
services:
fastqc-server:
build: .
volumes:
- ./data:/data
environment:
- BIO_MCP_QUEUE_URL=http://queue-api:8000
depends_on:
- queue-api
Configuration
Environment Variables
- BIO_MCP_FASTQC_PATH - Path to FastQC executable (default: "fastqc")
- BIO_MCP_MULTIQC_PATH - Path to MultiQC executable (default: "multiqc")
- BIO_MCP_MAX_FILE_SIZE - Maximum file size in bytes (default: 10GB)
- BIO_MCP_TIMEOUT - Command timeout in seconds (default: 1800)
- BIO_MCP_TEMP_DIR - Temporary directory for processing
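As an illustration of how these variables and their defaults fit together, the sketch below reads them into a small settings object. The class and function names are hypothetical; treating "10GB" as 10 * 1024**3 bytes and leaving the temporary directory unset when the variable is absent are assumptions of this example.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class FastQCSettings:
    fastqc_path: str
    multiqc_path: str
    max_file_size: int  # bytes
    timeout: int  # seconds
    temp_dir: Optional[str]


def load_settings(env: Mapping[str, str] = os.environ) -> FastQCSettings:
    """Read the BIO_MCP_* variables, falling back to the documented defaults."""
    return FastQCSettings(
        fastqc_path=env.get("BIO_MCP_FASTQC_PATH", "fastqc"),
        multiqc_path=env.get("BIO_MCP_MULTIQC_PATH", "multiqc"),
        # Assumption: "10GB" is read as 10 GiB expressed in bytes.
        max_file_size=int(env.get("BIO_MCP_MAX_FILE_SIZE", 10 * 1024**3)),
        timeout=int(env.get("BIO_MCP_TIMEOUT", 1800)),
        # No default is documented, so None means "let the caller decide".
        temp_dir=env.get("BIO_MCP_TEMP_DIR"),
    )
```

Accepting any mapping instead of reading `os.environ` directly makes it possible to check overrides such as `load_settings({"BIO_MCP_TIMEOUT": "60"})` without touching the real environment.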
Queue System Integration
To enable async processing for large datasets:
from src.server_with_queue import FastQCServerWithQueue
server = FastQCServerWithQueue(queue_url="http://localhost:8000")
Output Files
FastQC generates several output files:
- HTML Report (*_fastqc.html) - Interactive quality report
- Data File (fastqc_data.txt) - Raw metrics and statistics
- Summary File (summary.txt) - Pass/warn/fail status for each module
- Plots - Various quality plots and charts
MultiQC combines these into:
- MultiQC Report (multiqc_report.html) - Combined interactive report
- Data Directory (multiqc_data/) - Processed data and statistics
- General Stats (multiqc_general_stats.txt) - Summary table
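FastQC's summary.txt is a tab-separated file with one line per module: the status (PASS, WARN or FAIL), the module name, and the input file name. The sketch below parses that text and condenses it into the compact `8P/2W/0F` style tally used in the batch example earlier; the function names are illustrative, and this is not necessarily how the server produces its summaries.

```python
from collections import Counter
from typing import Dict


def parse_summary(text: str) -> Dict[str, str]:
    """Map each module name to its status from the contents of a summary.txt file."""
    statuses: Dict[str, str] = {}
    for line in text.splitlines():
        parts = line.split("\t")
        if len(parts) >= 2:  # skip blank or malformed lines
            status, module = parts[0], parts[1]
            statuses[module] = status
    return statuses


def tally(statuses: Dict[str, str]) -> str:
    """Condense module statuses into a 'xP/yW/zF' string."""
    counts = Counter(statuses.values())
    return f"{counts['PASS']}P/{counts['WARN']}W/{counts['FAIL']}F"


example = (
    "PASS\tBasic Statistics\tsample1.fastq.gz\n"
    "WARN\tPer sequence quality scores\tsample1.fastq.gz\n"
    "FAIL\tSequence Duplication Levels\tsample1.fastq.gz\n"
)
print(tally(parse_summary(example)))  # prints 1P/1W/1F
```

The summary file sits inside the per-sample FastQC output archive or its extracted directory, so in practice the text would be read from there before being passed to `parse_summary`.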
Quality Metrics Explained
FastQC analyzes multiple quality aspects:
Key Modules
- Per base sequence quality - Quality scores across read positions
- Per sequence quality scores - Distribution of mean quality scores
- Per base sequence content - A/T/G/C content across positions
- Per sequence GC content - GC% distribution vs expected
- Sequence duplication levels - PCR duplication assessment
- Adapter content - Contaminating adapter sequences
Status Interpretation
- ✅ PASS - Analysis indicates no problems
- ⚠️ WARN - Slightly unusual, may not be problematic
- ❌ FAIL - Likely problematic, requires attention
𧬠Integration with Bio-MCP Ecosystem
FastQC works seamlessly with other Bio-MCP tools:
User: "Run the complete preprocessing pipeline on my samples"
AI Workflow:
1. fastqc_batch → Initial quality assessment
2. trimmomatic → Trim low-quality bases and adapters
3. fastqc_batch → Post-trimming QC
4. multiqc_report → Combined before/after report
Contributing
We welcome contributions! See the Bio-MCP contributing guide.
Development Setup
git clone https://github.com/bio-mcp/bio-mcp-fastqc.git
cd bio-mcp-fastqc
pip install -e ".[dev]"
pytest
License
MIT License - see LICENSE file.
Acknowledgments
- FastQC by Simon Andrews at Babraham Bioinformatics
- MultiQC by Phil Ewels and the MultiQC community
- Bio-MCP project and contributors
Part of the Bio-MCP ecosystem - Making bioinformatics accessible to AI assistants.
For more tools: Bio-MCP Organization