SwissArmyHammer

Manage AI prompts as local markdown files.

Agent-driven engineering

AI coding agents are powerful -- but without structure, they wander. They lose track of the plan. They skip tests. They write code that works but isn't reviewed. They forget what they were doing halfway through.

SwissArmyHammer fixes this. It gives your agent real engineering discipline: a kanban board to track work, code review that creates actionable findings, test coverage analysis that finds the gaps, and workflows that chain it all together. One install, and your agent goes from "smart intern" to "senior engineer who follows the process."

Works with Claude Code, Cursor, Windsurf, or any MCP-compatible agent.

Get Started in 30 Seconds

brew install swissarmyhammer/tap/swissarmyhammer-cli
sah init

That's it. Your agent now has skills, tools, and workflows.

Two Ways to Plan

The /plan skill is the front door to SwissArmyHammer. It works two ways:

Hand it a spec

Write your requirements in a markdown file and point the agent at it:

> /plan my-feature-spec.md

The agent reads your spec, explores the codebase to understand what exists, then creates a kanban board with ordered tasks, subtasks, acceptance criteria, and test requirements. Each card has enough context that the agent (or a teammate) can pick it up and implement it without re-reading the spec.

Your spec can be as simple or detailed as you want -- a few bullet points, a full PRD, or anything in between. The agent fills in the implementation details by reading your actual code.

Or just talk it through

You don't need a spec file. Start a conversation and plan interactively:

> I want to add OAuth2 support to the API

The agent enters planning mode, asks clarifying questions, explores your codebase, and builds the kanban board incrementally as you discuss. You can steer the plan in real time:

> Split that auth card into separate cards for Google and GitHub providers
> Add a card for the token refresh flow -- we'll need that too
> Actually, let's do GitHub first and Google in a follow-up PR

The plan evolves through conversation. Cards get added, split, merged, and reordered based on your feedback. When you're happy, say "go" and the agent starts implementing.

Then execute

Either way, once the plan is on the board:

> /implement                     # Execute every task, autonomously
> /review                        # Code review -- findings become new cards
> /test                          # Run tests, fix failures
> /commit                        # Clean conventional commit

What You Get

The Problem With Agents Today

You ask an agent to build a feature. It writes some code, maybe runs some tests. But:

It doesn't plan first -- it just starts coding
It doesn't track what's done vs. what's left
It doesn't review its own work
It doesn't check coverage or find gaps
Each session starts from scratch

The SwissArmyHammer Approach

Every stage of the software development lifecycle gets the same treatment: a tool for the raw capability, a skill that defines the workflow, and a specialized agent that executes it autonomously.

| Stage | What happens | Command |
| --- | --- | --- |
| Plan | Read your spec, explore the codebase, create a kanban board with ordered tasks | `/plan` |
| Implement | Pick up tasks one-by-one, write code, run tests, mark complete | `/implement` |
| Test | Run the full suite, analyze failures, fix them | `/test` |
| Coverage | Find untested code, create cards for the gaps | `/coverage` |
| Review | Structured code review -- findings become kanban cards | `/review` |
| Commit | Stage changes, write a conventional commit message | `/commit` |
| Explore | Semantic code search and symbol lookup across 25+ languages | `/code-context` |
| Deduplicate | Find near-duplicate code and refactor it | `/deduplicate` |

These stages connect into pipelines. /plan creates tasks. /implement works through them. /review finds issues and creates new cards. /implement picks those up too. /coverage finds untested code and creates test cards. It's a closed loop.
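The closed loop can be pictured as a work queue that review is allowed to refill. The following is a minimal sketch of that idea only, not SwissArmyHammer's implementation: `run_pipeline`, `toy_review`, and the card titles are hypothetical names chosen for illustration.

```python
from collections import deque


def run_pipeline(planned_cards, review):
    """Work through cards until the board is empty; review may add more."""
    board = deque(planned_cards)
    done = []
    while board:
        card = board.popleft()
        done.append(card)            # stand-in for the implement step
        board.extend(review(card))   # review findings become new cards
    return done


def toy_review(card):
    """Hypothetical reviewer: flags one follow-up for the OAuth card only."""
    if card == "Add OAuth2 login":
        return ["Fix: handle expired refresh token"]
    return []


completed = run_pipeline(["Add OAuth2 login", "Write docs"], toy_review)
print(completed)
```

The loop ends only when review stops producing findings, which is the property the pipeline description relies on.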

Context Management -- The Hidden Superpower

The biggest bottleneck for AI agents isn't intelligence -- it's context. Long test output blows the context window. The agent can't find the function it needs. It re-reads files it already scanned. SwissArmyHammer solves this at every level.

Smart Shell (not just `bash -c`)

The built-in shell isn't a thin wrapper around subprocess exec. It's a virtual shell with persistent history, process management, and searchable output:

Every command's output is stored and indexed -- even if the response was truncated to save tokens
Semantic search across all output -- ask "find the authentication error" and it matches "403 forbidden" and "login denied"
Regex grep across history -- `error\[E\d+\]` finds every coded Rust compiler error in the output of every command you've run
Line-range retrieval -- output was truncated? Fetch lines 450-500 of command #3 without re-running it
Configurable output limits -- return 50 lines, 200, or zero (fire-and-forget). Full output is always saved for later

This means your agent can run cargo test with 10,000 lines of output, get a 50-line summary, and then surgically search for the failure -- without burning context tokens on scrollback.
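To make the idea concrete, here is a minimal sketch of an output store that returns a truncated view while keeping the full text for later regex search and line-range retrieval. It is an illustration under stated assumptions: `OutputStore` and its methods are hypothetical names, the store is in memory only, and semantic search is left out.

```python
import re
from dataclasses import dataclass, field


@dataclass
class OutputStore:
    """Hypothetical in-memory store; not the real sah shell."""
    outputs: dict[int, list[str]] = field(default_factory=dict)
    next_id: int = 1

    def record(self, text: str, max_lines: int = 50) -> tuple[int, list[str]]:
        """Save the full output and return (command id, truncated view)."""
        lines = text.splitlines()
        command_id = self.next_id
        self.next_id += 1
        self.outputs[command_id] = lines
        return command_id, lines[:max_lines]

    def get_lines(self, command_id: int, start: int, end: int) -> list[str]:
        """Return 1-indexed, inclusive lines start..end of a stored output."""
        return self.outputs[command_id][start - 1:end]

    def grep(self, pattern: str) -> list[tuple[int, int, str]]:
        """Regex search across all stored output: (command id, line number, line)."""
        regex = re.compile(pattern)
        return [
            (command_id, number, line)
            for command_id, lines in self.outputs.items()
            for number, line in enumerate(lines, start=1)
            if regex.search(line)
        ]


store = OutputStore()
# Simulated noisy test run: 999 passing lines followed by one compiler error.
noisy = "\n".join(f"test_{i} ... ok" for i in range(1, 1000))
noisy += "\nerror[E0308]: mismatched types"
command_id, summary = store.record(noisy, max_lines=50)
matches = store.grep(r"error\[E\d+\]")
tail = store.get_lines(command_id, 999, 1000)
```

The caller sees only the 50-line `summary`, yet `grep` and `get_lines` still reach line 1000 without re-running the command.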

Automatic Code Intelligence (tree-sitter + LSP)

SwissArmyHammer automatically indexes your codebase using tree-sitter and LSP. No configuration, no manual setup -- open a project and it starts parsing in the background.

Symbol lookup -- jump to any definition with fuzzy matching (MyStruct::new, process_req, partial names)
Call graph traversal -- who calls this function? What does it call? Trace execution flow across files
Blast radius analysis -- before you change validate_token, see every file and function transitively affected
Semantic diffs -- git diff shows line changes; sah shows entity-level changes (Added, Modified, Deleted, Moved, Renamed)
25+ languages -- Rust, Python, TypeScript, Go, Java, C/C++, Ruby, Swift, Kotlin, and more

This is what lets /plan actually understand your codebase before creating tasks, and what lets /review catch real architectural issues instead of just style nits.
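Blast radius analysis is, at its core, a reachability query over the inverted call graph. A minimal sketch follows, assuming the call graph is already available as a plain dictionary; the graph contents and the `blast_radius` function are hypothetical and do not reflect how sah stores its index.

```python
from collections import deque

# Hypothetical call graph: each function maps to the functions it calls.
CALLS = {
    "handle_login": ["validate_token", "load_user"],
    "refresh_session": ["validate_token"],
    "api_middleware": ["handle_login"],
    "load_user": [],
    "validate_token": [],
}


def blast_radius(calls: dict[str, list[str]], target: str) -> set[str]:
    """Return every function that directly or transitively calls target."""
    # Invert the graph so we can walk from a callee back to its callers.
    callers: dict[str, set[str]] = {}
    for caller, callees in calls.items():
        for callee in callees:
            callers.setdefault(callee, set()).add(caller)

    affected: set[str] = set()
    queue = deque([target])
    while queue:
        current = queue.popleft()
        for caller in callers.get(current, ()):
            if caller not in affected:
                affected.add(caller)
                queue.append(caller)
    return affected


print(sorted(blast_radius(CALLS, "validate_token")))
```

Here `api_middleware` never calls `validate_token` directly, but it is still reported because it reaches it through `handle_login`.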

The Suite

SwissArmyHammer is three tools that work together:

sah -- Skills and Tools for Any Agent

The core. An MCP server that gives your agent everything it needs:

Tools -- the building blocks:

| Tool | What it does |
| --- | --- |
| Files | Read, write, edit, glob, grep -- with .gitignore support |
| Git | Branch, commit, diff, status, PR workflows |
| Shell | Safe command execution with security hardening |
| Kanban | File-backed task boards -- cards, subtasks, dependencies, tags |
| Code Search | Tree-sitter powered semantic search across 25+ languages |
| Web | Fetch pages and convert to markdown, search the web |
| JavaScript | Embedded QuickJS expression evaluation |
| Questions | Elicitation-based Q&A for capturing decisions |

Skills -- the workflows that use those tools:

Skills are markdown files. They teach your agent how to do things, not just what to do. Each skill defines a step-by-step process, and a specialized agent type executes it. This is what turns a generic LLM into a focused engineer.

You can write your own skills too -- drop a SKILL.md in .swissarmyhammer/skills/my-skill/ and your agent picks it up automatically.
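As a rough illustration, the script below scaffolds a skill at that path. Only the directory layout comes from this README; the frontmatter keys `name` and `description`, the body text, and the `scaffold_skill` helper are assumptions made for the example, so check the skill format before relying on them.

```python
import tempfile
from pathlib import Path

# Assumed skeleton: YAML frontmatter followed by markdown instructions.
SKILL_TEMPLATE = """\
---
name: {name}
description: {description}
---

# {name}

1. Describe the first step the agent should take.
2. Describe how the agent verifies the result.
"""


def scaffold_skill(project_root: Path, name: str, description: str) -> Path:
    """Create .swissarmyhammer/skills/<name>/SKILL.md under project_root."""
    skill_dir = project_root / ".swissarmyhammer" / "skills" / name
    skill_dir.mkdir(parents=True, exist_ok=True)
    skill_file = skill_dir / "SKILL.md"
    skill_file.write_text(SKILL_TEMPLATE.format(name=name, description=description))
    return skill_file


# Demo in a throwaway directory so nothing in a real repo is touched.
with tempfile.TemporaryDirectory() as tmp:
    path = scaffold_skill(Path(tmp), "my-skill", "Example skill")
    relative = path.relative_to(tmp).as_posix()
    content = path.read_text()

print(relative)
```

In a real project you would pass the repository root instead of a temporary directory.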

avp -- Agent Validator Protocol

Guardrails. avp hooks into your agent as a pre/post validator. Every time the agent writes or edits a file, validators run automatically and block bad changes before they land.

code-quality -- cognitive complexity, function length, naming, magic numbers, dead code
security-rules -- no secrets in code, input validation
command-safety -- safe shell command checks
test-integrity -- catches test cheating (skipping tests, weakening assertions, hardcoding expected values)

avp init        # Install validators into your project
avp list        # See what's active
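The gate idea, checking a proposed file write and blocking it on failure, can be sketched in a few lines. This is a conceptual illustration only: per the Architecture section, avp validators are markdown files, not Python functions, and the `Verdict` type, `validate_write` function, and the two patterns below are hypothetical.

```python
import re
from dataclasses import dataclass

# Illustrative patterns only; a real secret scanner would need many more.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),  # shape of an AWS access key ID
    re.compile(r"-----BEGIN (?:RSA |EC )?PRIVATE KEY-----"),
]


@dataclass
class Verdict:
    allowed: bool
    reason: str = ""


def validate_write(path: str, new_content: str) -> Verdict:
    """Pre-write check: block the change if it appears to contain a secret."""
    for number, line in enumerate(new_content.splitlines(), start=1):
        if any(pattern.search(line) for pattern in SECRET_PATTERNS):
            return Verdict(False, f"{path}:{number}: possible secret")
    return Verdict(True)


ok = validate_write("src/config.py", "REGION = 'us-east-1'\n")
blocked = validate_write(
    "src/config.py",
    "REGION = 'us-east-1'\nKEY = 'AKIAABCDEFGHIJKLMNOP'\n",
)
print(ok.allowed, blocked.allowed, blocked.reason)
```

Returning a reason alongside the verdict gives the agent something actionable to fix rather than a bare rejection.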

mirdan -- Package Manager for Agent Skills

npm for AI agents. Install skills and validators from a shared registry. Publish your own.

mirdan search code-quality     # Find packages
mirdan install code-quality    # Install one
mirdan agents                  # See your agents
mirdan publish                 # Share your own

Architecture

Everything is markdown. Skills, validators, workflows, prompts -- all markdown with YAML frontmatter and Liquid templating. No proprietary formats, no databases, no cloud lock-in. Everything lives in your repo or your home directory, fully version-controllable.

~/.swissarmyhammer/
  skills/           # Installed skills (markdown)
  validators/       # Installed validators (markdown)
  prompts/          # Custom prompts (markdown + Liquid)
  workflows/        # State machine workflows (markdown + Mermaid)

Project-level overrides go in .swissarmyhammer/ in your repo. Project settings win over user settings.
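The precedence rule amounts to checking the project directory before the user directory. A minimal sketch, assuming skills use the same `skills/<name>/SKILL.md` layout at both levels; `resolve_skill` is a hypothetical helper, not part of sah.

```python
import tempfile
from pathlib import Path
from typing import Optional


def resolve_skill(name: str, project_root: Path, home: Path) -> Optional[Path]:
    """Return the project-level skill if present, else the user-level one."""
    for base in (project_root, home):  # project is checked first, so it wins
        candidate = base / ".swissarmyhammer" / "skills" / name / "SKILL.md"
        if candidate.is_file():
            return candidate
    return None


# Demo with throwaway directories standing in for the repo and the home dir.
with tempfile.TemporaryDirectory() as tmp:
    home = Path(tmp) / "home"
    project = Path(tmp) / "repo"
    for root in (home, project):
        skill = root / ".swissarmyhammer" / "skills" / "review" / "SKILL.md"
        skill.parent.mkdir(parents=True)
        skill.write_text(f"skill from {root.name}\n")
    only_home = home / ".swissarmyhammer" / "skills" / "commit" / "SKILL.md"
    only_home.parent.mkdir(parents=True)
    only_home.write_text("skill from home\n")

    overridden = resolve_skill("review", project, home).read_text()
    fallback = resolve_skill("commit", project, home).read_text()
    missing = resolve_skill("nope", project, home)

print(overridden.strip(), "|", fallback.strip(), "|", missing)
```

A skill defined in both places resolves to the repo copy, while one defined only in the home directory is still found.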

The MCP server itself is a single Rust binary -- fast startup, no runtime dependencies, no Docker, no cloud services. It runs locally alongside your agent.

Why SwissArmyHammer?

For individual developers: Your agent becomes dramatically more capable. Instead of babysitting it through each step, you hand it a spec and walk away. It plans, implements, tests, reviews, and commits -- following the same engineering process you would.

For teams: Consistent engineering process across every developer's agent. The same skills, the same validators, the same quality gates. Install once via mirdan, and every team member's agent works the same way.

For tool builders: The skill/validator format is just markdown. Build a skill in minutes, share it with mirdan publish, and it works in any MCP-compatible agent. No SDK to learn, no API to integrate.

License

MIT OR Apache-2.0
