Sequential Thinking Multi-Agent System (MAS)
An MCP agent that utilizes a Multi-Agent System (MAS) for sequential thinking and problem-solving.
This project implements an advanced sequential thinking process using a Multi-Agent System (MAS) built with the Agno framework and served via MCP. It represents a significant evolution from simpler state-tracking approaches by leveraging coordinated, specialized agents for deeper analysis and problem decomposition.
What is This?
This is an MCP server - not a standalone application. It runs as a background service that extends your LLM client (like Claude Desktop) with sophisticated sequential thinking capabilities. The server provides a sequentialthinking tool that processes thoughts through multiple specialized AI agents, each examining the problem from a different cognitive angle.
Core Architecture: Multi-Dimensional Thinking Agents
The system employs 6 specialized thinking agents, each focused on a distinct cognitive perspective:
1. Factual Agent
- Focus: Objective facts and verified data
- Approach: Analytical, evidence-based reasoning
- Capabilities:
- Web research for current facts (via ExaTools)
- Data verification and source citation
- Information gap identification
- Time allocation: 120 seconds for thorough analysis
2. Emotional Agent
- Focus: Intuition and emotional intelligence
- Approach: Gut reactions and feelings
- Capabilities:
- Quick intuitive responses (30-second snapshots)
- Visceral reactions without justification
- Emotional pattern recognition
- Time allocation: 30 seconds (quick reaction mode)
3. Critical Agent
- Focus: Risk assessment and problem identification
- Approach: Logical scrutiny and devil's advocate
- Capabilities:
- Research counterexamples and failures (via ExaTools)
- Identify logical flaws and risks
- Challenge assumptions constructively
- Time allocation: 120 seconds for deep analysis
4. Optimistic Agent
- Focus: Benefits, opportunities, and value
- Approach: Positive exploration with realistic grounding
- Capabilities:
- Research success stories (via ExaTools)
- Identify feasible opportunities
- Explore best-case scenarios logically
- Time allocation: 120 seconds for balanced optimism
5. Creative Agent
- Focus: Innovation and alternative solutions
- Approach: Lateral thinking and idea generation
- Capabilities:
- Cross-industry innovation research (via ExaTools)
- Divergent thinking techniques
- Multiple solution generation
- Time allocation: 240 seconds (creativity needs time)
6. Synthesis Agent
- Focus: Integration and metacognitive orchestration
- Approach: Holistic synthesis and final answer generation
- Capabilities:
- Integrate all perspectives into coherent response
- Answer the original question directly
- Provide actionable, user-friendly insights
- Time allocation: 60 seconds for synthesis
- Note: Uses enhanced model, does NOT include ExaTools (focuses on integration)
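The six agent descriptions above can be summarized as a small lookup table. The sketch below is illustrative only: `AgentSpec`, `AGENTS`, and both helper functions are hypothetical names, not the project's own classes, and it assumes each time allocation acts as a hard ceiling, which the descriptions do not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentSpec:
    """Illustrative record for one thinking agent (hypothetical, not project code)."""
    name: str
    time_budget_s: int   # time allocation quoted in the agent list
    uses_research: bool  # True when the agent is described as using ExaTools


AGENTS = [
    AgentSpec("factual", 120, True),
    AgentSpec("emotional", 30, False),
    AgentSpec("critical", 120, True),
    AgentSpec("optimistic", 120, True),
    AgentSpec("creative", 240, True),
    AgentSpec("synthesis", 60, False),
]


def research_agents(agents):
    """Names of the agents that are described as research-capable."""
    return [a.name for a in agents if a.uses_research]


def specialist_budget(agents, parallel=True):
    """Upper bound on specialist time, ASSUMING budgets are hard ceilings.

    Run concurrently, the slowest specialist dominates; run one after
    another, the budgets add up.
    """
    budgets = [a.time_budget_s for a in agents if a.name != "synthesis"]
    return max(budgets) if parallel else sum(budgets)


print(research_agents(AGENTS))
print(specialist_budget(AGENTS), specialist_budget(AGENTS, parallel=False))
```

Under that assumption, the five non-synthesis agents are bounded by the Creative agent's 240 seconds when run concurrently, versus 630 seconds back to back.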
AI-Powered Intelligent Routing
The system runs AI-driven complexity analysis on every thought, but the thinking sequence itself is fixed:
Processing Strategy:
- Single fixed strategy: full_exploration is mandatory for all requests
- No legacy modes: single/double/triple routing paths are removed
- Complexity analysis retained: metrics are still generated for observability
The AI analyzer still evaluates:
- Problem complexity and semantic depth
- Primary problem type (factual, emotional, creative, philosophical, etc.)
- Required thinking modes for observability and diagnostics
- Model behavior metadata (Enhanced vs Standard usage)
AI Routing Flow Diagram
flowchart TD
A[Input Thought] --> B[AI Complexity Analyzer]
B --> C[Complexity Metadata Stored]
C --> D[Fixed Strategy: full_exploration]
D --> E[Step 1: Initial Synthesis]
E --> F[Step 2: Parallel Specialist Agents]
F --> G[Step 3: Final Synthesis]
G --> H[Unified Response]
Key Insights:
- Deterministic behavior: every request runs the same full multi-step path
- Parallel execution: non-synthesis agents still run simultaneously
- Synthesis integration: orchestration and final answer are both synthesis-driven
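The three-step flow described above maps naturally onto `asyncio`. The following is a minimal sketch, not the server's actual code: `run_agent` is a hypothetical stand-in for a real model call, and `full_exploration` here only borrows the strategy's name to show the shape of the pipeline (synthesis, then concurrent specialists, then synthesis again).

```python
import asyncio

SPECIALISTS = ["factual", "emotional", "critical", "optimistic", "creative"]


async def run_agent(name: str, thought: str) -> str:
    """Hypothetical stand-in for a model call; returns a labelled placeholder."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return f"[{name}] {thought}"


async def full_exploration(thought: str) -> dict:
    # Step 1: initial synthesis frames the problem.
    framing = await run_agent("synthesis", thought)
    # Step 2: the non-synthesis agents run concurrently on the same thought.
    views = await asyncio.gather(*(run_agent(n, thought) for n in SPECIALISTS))
    # Step 3: final synthesis integrates every perspective into one response.
    response = await run_agent("synthesis", " | ".join(views))
    return {
        "framing": framing,
        "perspectives": dict(zip(SPECIALISTS, views)),
        "response": response,
    }


result = asyncio.run(full_exploration("Should we migrate to async I/O?"))
print(result["response"])
```

`asyncio.gather` returns results in the order the coroutines were passed, so zipping them back against `SPECIALISTS` is safe.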
Research Capabilities (ExaTools Integration)
4 out of 6 agents are equipped with web research capabilities via ExaTools:
- Factual Agent: Search for current facts, statistics, verified data
- Critical Agent: Find counterexamples, failed cases, regulatory issues
- Optimistic Agent: Research success stories, positive case studies
- Creative Agent: Discover innovations across different industries
- Emotional & Synthesis Agents: No ExaTools (focused on internal processing)
Research is optional and requires the EXA_API_KEY environment variable. Without it, the system still works, relying on pure reasoning.
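One way to picture the opt-in behaviour is a small gate that checks both the agent and the environment. This is a sketch under assumptions: `research_enabled_for` and `RESEARCH_AGENTS` are hypothetical names, and the real server may decide this differently.

```python
import os

# Agents listed above as research-capable.
RESEARCH_AGENTS = ("factual", "critical", "optimistic", "creative")


def research_enabled_for(agent_name, env=None):
    """Return True only if the agent is research-capable AND a key is set.

    `env` defaults to the process environment; passing a dict makes the
    function easy to exercise without touching real settings.
    """
    env = os.environ if env is None else env
    has_key = bool(env.get("EXA_API_KEY", "").strip())
    return has_key and agent_name in RESEARCH_AGENTS


print(research_enabled_for("factual", {"EXA_API_KEY": "example-key"}))
print(research_enabled_for("synthesis", {"EXA_API_KEY": "example-key"}))
print(research_enabled_for("factual", {}))
```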
Model Intelligence
Dual Model Strategy:
- Enhanced Model: Used for Synthesis agent (complex integration tasks)
- Standard Model: Used for individual thinking agents
- AI Selection: System automatically chooses the right model based on task complexity
Supported Providers:
- DeepSeek (default) - High performance, cost-effective
- Groq - Ultra-fast inference
- OpenRouter - Access to multiple models
- GitHub Models - OpenAI models via GitHub API
- Anthropic - Claude models with prompt caching
- Ollama - Local model execution
Key Differences from Original Version (TypeScript)
This Python/Agno implementation marks a fundamental shift from the original TypeScript version:
| Feature/Aspect | Python/Agno Version (Current) | TypeScript Version (Original) |
|---|---|---|
| Architecture | Multi-Agent System (MAS); Active processing by a team of agents. | Single Class State Tracker; Simple logging/storing. |
| Intelligence | Distributed Agent Logic; Embedded in specialized agents & Coordinator. | External LLM Only; No internal intelligence. |
| Processing | Active Analysis & Synthesis; Agents act on the thought. | Passive Logging; Merely recorded the thought. |
| Frameworks | Agno (MAS) + FastMCP (Server); Uses dedicated MAS library. | MCP SDK only. |
| Coordination | Explicit Team Coordination Logic (Team in coordinate mode). | None; No coordination concept. |
| Validation | Pydantic Schema Validation; Robust data validation. | Basic Type Checks; Less reliable. |
| External Tools | Integrated (Exa via Researcher); Can perform research tasks. | None. |
| Logging | Structured Python Logging (File + Console); Configurable. | Console Logging with Chalk; Basic. |
| Language & Ecosystem | Python; Leverages Python AI/ML ecosystem. | TypeScript/Node.js. |
In essence, the system evolved from a passive thought recorder to an active thought processor powered by a collaborative team of AI agents.
How it Works (Multi-Dimensional Processing)
- Initiation: An external LLM uses the sequentialthinking tool to define the problem and initiate the process.
- Tool Call: The LLM calls the sequentialthinking tool with the current thought, structured according to the ThoughtData model.
- AI Complexity Analysis: The system still performs AI-powered analysis to capture complexity metadata and diagnostic signals.
- Fixed Strategy Execution: The system always runs the mandatory full_exploration multi-step sequence.
- Parallel Processing: Multiple thinking agents process the thought simultaneously from their specialized perspectives:
- Factual agents gather objective data (with optional web research)
- Critical agents identify risks and problems
- Optimistic agents explore opportunities and benefits
- Creative agents generate innovative solutions
- Emotional agents provide intuitive insights
- Research Integration: Agents equipped with ExaTools conduct targeted web research to enhance their analysis.
- Synthesis & Integration: The Synthesis agent integrates all perspectives into a coherent, actionable response using enhanced models.
- Response Generation: The system returns a comprehensive analysis with guidance for next steps.
- Iteration: The calling LLM uses the synthesized response to formulate the next thinking step or conclude the process.
Token Consumption Warning
High Token Usage: Due to the Multi-Agent System architecture, this tool consumes significantly more tokens than single-agent alternatives or the previous TypeScript version. Each sequentialthinking call invokes multiple specialized agents simultaneously, which can mean roughly 5-10x the token usage of simpler sequential approaches, in exchange for deeper and more comprehensive analysis.
MCP Tool: sequentialthinking
The server exposes a single MCP tool that processes sequential thoughts:
Parameters:
{
thought: string, // One focused reasoning step
thoughtNumber: number, // 1-based step index; increment each call
totalThoughts: number, // Planned number of steps
nextThoughtNeeded: boolean, // true for intermediate steps, false on final step
isRevision: boolean, // true only when revising earlier conclusions
branchFromThought?: number, // Set with branchId to branch from a prior step
branchId?: string, // Branch identifier (required when branching)
needsMoreThoughts: boolean // true only when extending beyond totalThoughts
}
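The parameter comments above imply a few consistency rules. The sketch below encodes them with a standard-library dataclass so it runs without extra dependencies. It is not the project's ThoughtData model (which the comparison table says is validated with Pydantic); `ThoughtInput` is a hypothetical name, and the rule that a branch must start from an earlier step is an assumption read from "branch from a prior step".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ThoughtInput:
    """Hypothetical mirror of the tool parameters, with basic checks."""
    thought: str
    thoughtNumber: int
    totalThoughts: int
    nextThoughtNeeded: bool
    isRevision: bool
    needsMoreThoughts: bool
    branchFromThought: Optional[int] = None
    branchId: Optional[str] = None

    def __post_init__(self):
        if not self.thought.strip():
            raise ValueError("thought must not be empty")
        if self.thoughtNumber < 1 or self.totalThoughts < 1:
            raise ValueError("thoughtNumber and totalThoughts are 1-based")
        # branchFromThought and branchId travel together.
        if (self.branchFromThought is None) != (self.branchId is None):
            raise ValueError("set branchFromThought and branchId together")
        # Assumption: a branch can only start from an earlier step.
        if self.branchFromThought is not None and not (
            1 <= self.branchFromThought < self.thoughtNumber
        ):
            raise ValueError("branchFromThought must reference a prior step")


step = ThoughtInput(
    thought="Compare the two caching strategies on write-heavy load.",
    thoughtNumber=2,
    totalThoughts=4,
    nextThoughtNeeded=True,
    isRevision=False,
    needsMoreThoughts=False,
    branchFromThought=1,
    branchId="alt-cache",
)
print(step.branchId)
```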
Response:
The tool returns both:
- content: human-readable synthesis text
- structuredContent: machine-readable loop control fields
{
should_continue: boolean, // Canonical continuation signal
next_thought_number: number?, // Recommended next thoughtNumber
stop_reason: string, // Why to continue/stop/retry
current_thought_number: number,
total_thoughts: number,
next_call_arguments?: { // Suggested next-call arguments when applicable
thoughtNumber: number,
totalThoughts: number,
nextThoughtNeeded: boolean,
needsMoreThoughts: boolean
},
parameter_usage: Record<string, string>
}
Call Contract (Important)
- Use this tool as a multi-step loop, not a one-shot call.
- After every response, read structuredContent.should_continue.
- Continue calling sequentialthinking until should_continue is false.
- Actively use reflection: when a step is weak or incorrect, send a revision step with isRevision=true.
- Prefer structuredContent.next_thought_number and next_call_arguments when building the next request.
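The contract above can be sketched as a driver loop. Everything here is illustrative: `run_thinking_loop`, `call_tool`, and `next_thought` are hypothetical names. In practice `call_tool` would wrap your MCP client's tool invocation, `next_thought` would be the calling LLM writing its next step, and `max_calls` is an extra safety cap that is not part of the contract.

```python
from typing import Callable


def run_thinking_loop(
    call_tool: Callable[[dict], dict],
    first_thought: str,
    total_thoughts: int,
    next_thought: Callable[[dict], str],
    max_calls: int = 20,
) -> list:
    """Call the tool repeatedly until should_continue is false."""
    args = {
        "thought": first_thought,
        "thoughtNumber": 1,
        "totalThoughts": total_thoughts,
        "nextThoughtNeeded": total_thoughts > 1,
        "isRevision": False,
        "needsMoreThoughts": False,
    }
    history = []
    for _ in range(max_calls):  # safety cap against a loop that never stops
        result = call_tool(args)
        history.append(result)
        control = result["structuredContent"]
        if not control["should_continue"]:
            break
        suggested = control.get("next_call_arguments") or {}
        args = {
            **args,
            # Prefer the server's recommendation over local counting.
            "thoughtNumber": control.get("next_thought_number") or args["thoughtNumber"] + 1,
            **suggested,
            "thought": next_thought(result),
            "isRevision": False,
        }
    return history
```

A revision step would follow the same shape, with `isRevision` set to true and the thought text restating the corrected conclusion.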
Installation
Prerequisites
- Python 3.10+
- LLM API access (choose one):
- DeepSeek: DEEPSEEK_API_KEY (default, recommended)
- Groq: GROQ_API_KEY
- OpenRouter: OPENROUTER_API_KEY
- GitHub Models: GITHUB_TOKEN
- Anthropic: ANTHROPIC_API_KEY
- Ollama: Local installation (no API key)
- Optional: EXA_API_KEY for web research capabilities
- uv package manager (recommended) or pip
Quick Start
1. Install via Smithery (Recommended)
npx -y @smithery/cli install @FradSer/mcp-server-mas-sequential-thinking --client claude
2. Manual Installation
# Clone the repository
git clone https://github.com/FradSer/mcp-server-mas-sequential-thinking.git
cd mcp-server-mas-sequential-thinking
# Install with uv (recommended)
uv pip install .
# Or with pip
pip install .
Configuration
For MCP Clients (Claude Desktop, etc.)
Add to your MCP client configuration:
{
"mcpServers": {
"sequential-thinking": {
"command": "mcp-server-mas-sequential-thinking",
"env": {
"LLM_PROVIDER": "deepseek",
"DEEPSEEK_API_KEY": "your_api_key",
"EXA_API_KEY": "your_exa_key_optional"
}
}
}
}
Environment Variables
Create a .env file or set these variables:
# LLM Provider (required)
LLM_PROVIDER="deepseek" # deepseek, groq, openrouter, github, anthropic, ollama
DEEPSEEK_API_KEY="sk-..."
# Optional: Enhanced/Standard Model Selection
# DEEPSEEK_ENHANCED_MODEL_ID="deepseek-chat" # For synthesis
# DEEPSEEK_STANDARD_MODEL_ID="deepseek-chat" # For other agents
# Optional: Web Research (enables ExaTools)
# EXA_API_KEY="your_exa_api_key"
# Optional: Custom endpoint
# LLM_BASE_URL="https://custom-endpoint.com"
Model Configuration Examples
# Groq with different models
GROQ_ENHANCED_MODEL_ID="openai/gpt-oss-120b"
GROQ_STANDARD_MODEL_ID="openai/gpt-oss-20b"
# Anthropic with Claude models
ANTHROPIC_ENHANCED_MODEL_ID="claude-3-5-sonnet-20241022"
ANTHROPIC_STANDARD_MODEL_ID="claude-3-5-haiku-20241022"
# GitHub Models
GITHUB_ENHANCED_MODEL_ID="gpt-4o"
GITHUB_STANDARD_MODEL_ID="gpt-4o-mini"
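The variables above follow a visible pattern: one credential per provider, plus `<PROVIDER>_ENHANCED_MODEL_ID` and `<PROVIDER>_STANDARD_MODEL_ID`. The sketch below reads them into a plain dictionary. `resolve_provider` and `API_KEY_VARS` are hypothetical names, the fallback to deepseek when LLM_PROVIDER is unset is an assumption, and the server's real configuration code may behave differently.

```python
import os

# Credential variable per provider, as listed under Prerequisites.
API_KEY_VARS = {
    "deepseek": "DEEPSEEK_API_KEY",
    "groq": "GROQ_API_KEY",
    "openrouter": "OPENROUTER_API_KEY",
    "github": "GITHUB_TOKEN",
    "anthropic": "ANTHROPIC_API_KEY",
    "ollama": None,  # local execution, no API key
}


def resolve_provider(env=None):
    """Collect provider settings from environment-style key/value pairs."""
    env = os.environ if env is None else env
    # Assumption: fall back to deepseek when LLM_PROVIDER is not set.
    provider = env.get("LLM_PROVIDER", "deepseek").strip().lower()
    if provider not in API_KEY_VARS:
        raise ValueError(f"unknown LLM_PROVIDER: {provider!r}")
    key_var = API_KEY_VARS[provider]
    if key_var and not env.get(key_var):
        raise ValueError(f"{key_var} is required for provider {provider!r}")
    prefix = provider.upper()
    return {
        "provider": provider,
        "enhanced_model": env.get(f"{prefix}_ENHANCED_MODEL_ID"),  # synthesis
        "standard_model": env.get(f"{prefix}_STANDARD_MODEL_ID"),  # other agents
        "base_url": env.get("LLM_BASE_URL"),
    }


print(resolve_provider({
    "LLM_PROVIDER": "groq",
    "GROQ_API_KEY": "example-key",
    "GROQ_ENHANCED_MODEL_ID": "openai/gpt-oss-120b",
}))
```

A `None` model value in the result simply means the variable was not set; which default model the server then picks is not covered by this sketch.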
Usage
As MCP Server
Once installed and configured in your MCP client:
- The sequentialthinking tool becomes available
- Your LLM can use it to process complex thoughts
- The system runs each thought through the fixed full_exploration agent sequence
- Results are synthesized and returned to your LLM
Direct Execution
Run the server manually for testing:
# Using installed script
mcp-server-mas-sequential-thinking
# Using uv
uv run mcp-server-mas-sequential-thinking
# Using Python
python src/mcp_server_mas_sequential_thinking/main.py
Development
Setup
# Clone repository
git clone https://github.com/FradSer/mcp-server-mas-sequential-thinking.git
cd mcp-server-mas-sequential-thinking
# Create virtual environment
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
# Install with dev dependencies
uv pip install -e ".[dev]"
Code Quality
# Lint, format, and type-check
uv run ruff check . --fix
uv run ruff format .
uv run mypy .
# Run tests (when available)
uv run pytest
Testing with MCP Inspector
npx @modelcontextprotocol/inspector uv run mcp-server-mas-sequential-thinking
Open http://127.0.0.1:6274/ and test the sequentialthinking tool.
System Characteristics
Strengths:
- Multi-perspective analysis: 6 different cognitive approaches
- AI-powered analysis: Complexity metrics for observability
- Research capabilities: 4 agents with web search (optional)
- Deterministic processing: Fixed full multi-step sequence
- Model optimization: Enhanced/Standard model selection
- Provider agnostic: Works with multiple LLM providers
Considerations:
- Token usage: Multi-agent processing uses more tokens than single-agent
- Processing time: Complex sequences take longer but provide deeper insights
- API costs: Research capabilities require separate Exa API subscription
- Model selection: Enhanced models cost more but provide better synthesis
Project Structure
mcp-server-mas-sequential-thinking/
├── src/mcp_server_mas_sequential_thinking/
│ ├── main.py # MCP server entry point
│ ├── processors/
│ │ ├── multi_thinking_core.py # 6 thinking agents definition
│ │ └── multi_thinking_processor.py # Sequential processing logic
│ ├── routing/
│ │ ├── ai_complexity_analyzer.py # AI-powered analysis
│ │ └── multi_thinking_router.py # Intelligent routing
│ ├── services/
│ │ ├── server_core.py # ThoughtProcessor implementation
│ │ ├── workflow_executor.py
│ │ └── context_builder.py
│ └── config/
│ ├── modernized_config.py # Provider strategies
│ └── constants.py # System constants
├── pyproject.toml # Project configuration
└── README.md # This file
Changelog
See CHANGELOG.md for version history.
Contributing
Contributions are welcome! Please ensure:
- Code follows project style (ruff, mypy)
- Commit messages use conventional commits format
- All tests pass before submitting PR
- Documentation is updated as needed
License
This project is licensed under the MIT License - see the LICENSE file for details.
Acknowledgments
- Built with Agno v2.0+ framework
- Model Context Protocol by Anthropic
- Research capabilities powered by Exa (optional)
- Multi-dimensional thinking inspired by Edward de Bono's work
Support
- GitHub Issues: Report bugs or request features
- Documentation: Check CLAUDE.md for detailed implementation notes
- MCP Protocol: Official MCP Documentation
Note: This is an MCP server, designed to work with MCP-compatible clients like Claude Desktop. It is not a standalone chat application.
