Decompose
Decompose text into classified semantic units — authority, risk, attention, entities. No LLM. Deterministic.
Stop prompting. Start decomposing.
Deterministic text classification for AI agents. Decompose turns any text into classified, structured semantic units — instantly. No LLM. No setup. One function call.
Before: your agent reads this
The contractor shall provide all materials per ASTM C150-20. Maximum load
shall not exceed 500 psf per ASCE 7-22. Notice to proceed within 14 calendar
days of contract execution. Retainage of 10% applies to all payments.
For general background, the project is located in Denver, CO...
After: your agent reads this
[
  {
    "text": "The contractor shall provide all materials per ASTM C150-20.",
    "authority": "mandatory",
    "risk": "compliance",
    "type": "requirement",
    "irreducible": true,
    "attention": 8.0,
    "entities": ["ASTM C150-20"]
  },
  {
    "text": "Maximum load shall not exceed 500 psf per ASCE 7-22.",
    "authority": "prohibitive",
    "risk": "safety_critical",
    "type": "constraint",
    "irreducible": true,
    "attention": 10.0,
    "entities": ["ASCE 7-22"]
  }
]
Every unit classified. Every standard extracted. Every risk scored. Your agent knows what matters.
Install
pip install decompose-mcp
Use as MCP Server
Add to your agent's MCP config (Claude Code, Cursor, Windsurf, etc.):
{
  "mcpServers": {
    "decompose": {
      "command": "uvx",
      "args": ["decompose-mcp", "--serve"]
    }
  }
}
Your agent gets two tools:
- decompose_text: decompose any text
- decompose_url: fetch a URL and decompose its content
OpenClaw
Install the skill from ClawHub or configure directly:
{
  "mcpServers": {
    "decompose": {
      "command": "python3",
      "args": ["-m", "decompose", "--serve"]
    }
  }
}
Or install the skill: clawdhub install decompose-mcp
Use as CLI
# Pipe text
cat spec.txt | decompose --pretty
# Inline
decompose --text "The contractor shall provide all materials per ASTM C150-20."
# Compact output (smaller JSON)
cat document.md | decompose --compact
Use as Library
from decompose import decompose_text, filter_for_llm
result = decompose_text("The contractor shall provide all materials per ASTM C150-20.")
for unit in result["units"]:
    print(f"[{unit['authority']}] [{unit['risk']}] {unit['text'][:60]}...")
# Pre-filter for LLM context — keep only high-value units
filtered = filter_for_llm(result, max_tokens=4000)
print(f"{filtered['meta']['reduction_pct']}% token reduction")
llm_input = filtered["text"] # Ready for your LLM
What Each Field Means
| Field | Values | What It Tells Your Agent |
|---|---|---|
| authority | mandatory, prohibitive, directive, permissive, conditional, informational | Is this a hard requirement or background? |
| risk | safety_critical, security, compliance, financial, contractual, advisory, informational | How much does this matter? |
| type | requirement, definition, reference, constraint, narrative, data | What kind of content is this? |
| irreducible | true/false | Must this be preserved verbatim? |
| attention | 0.0 - 10.0 | How much compute should the agent spend here? |
| entities | standards, codes, regulations | What formal references are cited? |
| actionable | true/false | Does someone need to do something? |
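The field table above can be turned into a small schema check on the consuming side. The following is a minimal sketch, not part of Decompose itself: the `Unit` type and `validate_unit` helper are hypothetical names introduced here for illustration, and the allowed values are simply copied from the table.

```python
from typing import List, TypedDict

# Allowed values copied from the field table; not read from the library.
AUTHORITY = {"mandatory", "prohibitive", "directive",
             "permissive", "conditional", "informational"}
RISK = {"safety_critical", "security", "compliance", "financial",
        "contractual", "advisory", "informational"}
TYPE = {"requirement", "definition", "reference",
        "constraint", "narrative", "data"}


class Unit(TypedDict, total=False):
    """Hypothetical shape of one semantic unit, mirroring the table."""
    text: str
    authority: str
    risk: str
    type: str
    irreducible: bool
    attention: float
    entities: List[str]
    actionable: bool


def validate_unit(unit: Unit) -> List[str]:
    """Return a list of problems found in one unit (empty means OK)."""
    problems = []
    if unit.get("authority") not in AUTHORITY:
        problems.append(f"unknown authority: {unit.get('authority')!r}")
    if unit.get("risk") not in RISK:
        problems.append(f"unknown risk: {unit.get('risk')!r}")
    if unit.get("type") not in TYPE:
        problems.append(f"unknown type: {unit.get('type')!r}")
    attention = unit.get("attention", 0.0)
    if not 0.0 <= attention <= 10.0:
        problems.append(f"attention out of range: {attention}")
    return problems


sample: Unit = {
    "text": "Maximum load shall not exceed 500 psf per ASCE 7-22.",
    "authority": "prohibitive",
    "risk": "safety_critical",
    "type": "constraint",
    "irreducible": True,
    "attention": 10.0,
    "entities": ["ASCE 7-22"],
}
print(validate_unit(sample))  # prints []
```

A check like this is mainly useful at the boundary where units are handed to another system, so that an unexpected label fails loudly instead of being silently routed.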
What to Build With This
Decompose is not the destination. It's the step before the LLM that most developers skip — not because it's hard, but because nobody showed them it exists. Documents have structure. That structure is classifiable. And classification should happen before reasoning.
Without: document → chunk → embed → retrieve → LLM → answer (100% of tokens)
With: document → decompose → filter/route → LLM → answer (20-40% of tokens)
Filter: built-in LLM pre-filter
filter_for_llm() keeps mandatory, safety-critical, financial, and compliance units — drops boilerplate before it reaches your LLM or vector store.
from decompose import decompose_text, filter_for_llm
result = decompose_text(open("contract.md").read())
filtered = filter_for_llm(result, max_tokens=4000)
# filtered["text"] = high-value units only, ready for LLM
# filtered["meta"]["reduction_pct"] = how much was dropped (typically 60-80%)
# Or use the units directly for embedding
for unit in filtered["units"]:
    embed_and_store(unit["text"], metadata={
        "authority": unit["authority"],
        "risk": unit["risk"],
        "attention": unit["attention"],
    })
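To make the selection rule concrete, here is a rough sketch of a filter in the same spirit: keep mandatory, safety-critical, financial, and compliance units, and stop when a token budget is reached. This is an illustration only, not the implementation of `filter_for_llm()`. The `select_units` and `estimate_tokens` names are hypothetical, and the four-characters-per-token estimate is an assumption.

```python
from typing import Dict, List

# Risk labels treated as high value, following the description above.
HIGH_VALUE_RISKS = {"safety_critical", "financial", "compliance"}


def estimate_tokens(text: str) -> int:
    """Crude estimate: roughly four characters per token (an assumption)."""
    return max(1, len(text) // 4)


def select_units(units: List[Dict], max_tokens: int) -> List[Dict]:
    """Keep high-value units within a token budget, in document order."""
    candidates = [
        (index, unit)
        for index, unit in enumerate(units)
        if unit["authority"] == "mandatory" or unit["risk"] in HIGH_VALUE_RISKS
    ]
    # Spend the budget on the highest-attention units first.
    candidates.sort(key=lambda pair: pair[1]["attention"], reverse=True)

    kept = []
    used = 0
    for index, unit in candidates:
        cost = estimate_tokens(unit["text"])
        if used + cost > max_tokens:
            continue  # Too large for the remaining budget; try smaller units.
        kept.append((index, unit))
        used += cost

    # Restore the original document order before returning.
    kept.sort(key=lambda pair: pair[0])
    return [unit for _, unit in kept]
```

Sorting by attention before applying the budget means that, when the budget is tight, the units dropped first are the ones scored as least important.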
Route: risk-based processing
Safety-critical content goes to one chain. Financial content goes to another. Boilerplate gets skipped.
from decompose import decompose_text
result = decompose_text(spec_text)
for unit in result["units"]:
    if unit["risk"] == "safety_critical":
        safety_chain.process(unit)  # Full analysis + human review
    elif unit["risk"] == "financial":
        audit_chain.process(unit)  # Flag for finance team
    elif unit["attention"] < 0.5:
        pass  # Skip boilerplate
    else:
        general_chain.process(unit)  # Standard LLM analysis
Measure: token cost reduction
from decompose import decompose_text
result = decompose_text(spec_text)
total = len(result["units"])
high = [u for u in result["units"] if u["attention"] >= 1.0]
print(f"{len(high)}/{total} units need LLM analysis")
# Approximates token reduction by unit count; max() guards against empty input
print(f"{100 - len(high) * 100 // max(total, 1)}% token reduction")
See examples/ for runnable scripts.
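The measurement snippet above counts units, which treats a one-line clause and a long paragraph as equal. A variant that weights each unit by its text length may track token cost more closely. This is a sketch under that assumption: `reduction_pct` is a hypothetical helper, not a Decompose function, and character count is only a proxy for tokens.

```python
from typing import Dict, List


def reduction_pct(units: List[Dict], threshold: float = 1.0) -> float:
    """Percent of text (by characters) that falls below the attention threshold."""
    total_chars = sum(len(unit["text"]) for unit in units)
    if total_chars == 0:
        return 0.0  # Nothing to reduce; also avoids division by zero.
    kept_chars = sum(
        len(unit["text"]) for unit in units if unit["attention"] >= threshold
    )
    return round(100 * (1 - kept_chars / total_chars), 1)


# Hypothetical units: one short high-attention clause, one long low-attention one.
example_units = [
    {"text": "a" * 10, "attention": 5.0},
    {"text": "b" * 30, "attention": 0.2},
]
print(reduction_pct(example_units))  # prints 75.0
```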
Why No LLM?
Decompose runs on pure regex and heuristics. No Ollama, no API key, no GPU, no inference cost.
This is intentional:
- Fast: <500ms for a 50-page spec
- Deterministic: Same input always produces same output
- Offline: Works air-gapped, on a plane, on CI
- Composable: Your agent's LLM reasons over the structured output — decompose handles the preprocessing
The LLM is what your agent uses. Decompose makes whatever model you're running work better.
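To give a feel for what "regex and heuristics" can look like, here is a toy classifier for the authority field plus a standards extractor. The patterns are illustrative assumptions written for this example, not Decompose's actual rules, and `classify_authority` and `extract_standards` are hypothetical names.

```python
import re
from typing import List

# Illustrative rules only. Order matters: prohibitions are checked before
# obligations so that "shall not" is not mistaken for "shall".
AUTHORITY_RULES = [
    ("prohibitive", re.compile(r"\b(shall not|must not|may not|prohibited)\b", re.I)),
    ("mandatory", re.compile(r"\b(shall|must|is required to)\b", re.I)),
    ("conditional", re.compile(r"\b(if|unless|provided that)\b", re.I)),
    ("permissive", re.compile(r"\b(may|can|is permitted to)\b", re.I)),
    ("directive", re.compile(r"\b(should|recommended)\b", re.I)),
]

# Matches references such as "ASTM C150-20" or "ASCE 7-22" (small sample list).
STANDARD = re.compile(r"\b(?:ASTM|ASCE|ACI|ISO)\s+[A-Z]?\d+(?:-\d+)?\b")


def classify_authority(sentence: str) -> str:
    """Return the first matching authority label, else 'informational'."""
    for label, pattern in AUTHORITY_RULES:
        if pattern.search(sentence):
            return label
    return "informational"


def extract_standards(sentence: str) -> List[str]:
    """Return the standard references cited in the sentence."""
    return STANDARD.findall(sentence)


sentence = "Maximum load shall not exceed 500 psf per ASCE 7-22."
print(classify_authority(sentence))  # prints prohibitive
print(extract_standards(sentence))   # prints ['ASCE 7-22']
```

Because the rules are plain pattern matches evaluated in a fixed order, the same sentence always yields the same label, which is the property the list above calls deterministic.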
Built by Echology
Decompose is built by Echology and extracted from AECai, a document intelligence platform for Architecture, Engineering, and Construction firms. The classification patterns, entity extraction, and irreducibility detection are battle-tested against thousands of real AEC documents — specs, contracts, RFIs, inspection reports, pay applications.
Decompose earned its independence — it started as AECai's text classification module, proved general enough to work across domains (insurance, trading, regulatory), and was released standalone. Free, MIT-licensed.
Case Study: Open Scripture Intelligence
The same chunking and entity extraction patterns that classify engineering specs also structure the Bible. Open Scripture Intelligence uses Decompose's Markdown-aware chunker and regex entity extraction to transform 31,100 verses into a knowledge graph with 344,799 cross-reference edges and semantic embeddings — proving the methodology is domain-agnostic.
Blog
- When Regex Beats an LLM — Decompose classifies the MCP spec in 3.78ms
- Why Your Agent Needs a Cognitive Primitive — attention scoring, irreducibility, and routing
- What "Simulation-Aware" Actually Means — the architecture behind AECai
License: MIT — Copyright (c) 2025-2026 Echology, Inc.