improve-codebase-architecture

Find deepening opportunities in a codebase, informed by the domain language in CONTEXT.md and the decisions in docs/adr/. Use it when the user wants to improve architecture, find refactoring opportunities, consolidate tightly coupled modules, or make a codebase more testable and AI-navigable.

npx skills add https://github.com/mattpocock/skills --skill improve-codebase-architecture

Improve Codebase Architecture

Surface architectural friction and propose deepening opportunities — refactors that turn shallow modules into deep ones. The aim is testability and AI-navigability.

Glossary

Use these terms exactly in every suggestion. Consistent language is the point — don't drift into "component," "service," "API," or "boundary." Full definitions in LANGUAGE.md.

  • Module — anything with an interface and an implementation (function, class, package, slice).
  • Interface — everything a caller must know to use the module: types, invariants, error modes, ordering, config. Not just the type signature.
  • Implementation — the code inside.
  • Depth — leverage at the interface: a lot of behaviour behind a small interface. Deep = high leverage. Shallow = interface nearly as complex as the implementation.
  • Seam — where an interface lives; a place behaviour can be altered without editing in place. (Use this, not "boundary.")
  • Adapter — a concrete thing satisfying an interface at a seam.
  • Leverage — what callers get from depth.
  • Locality — what maintainers get from depth: change, bugs, knowledge concentrated in one place.
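
To make "depth" concrete, here is a minimal sketch contrasting a shallow module with a deep one. The names (`apply_rate`, `Line`, `order_total`) and the 20% rate are hypothetical, chosen only for illustration; they do not come from LANGUAGE.md.

```python
from dataclasses import dataclass
from decimal import Decimal, ROUND_HALF_UP


# Shallow module: the caller must already know the rate, the rounding rule,
# and when to apply it, so the interface is about as complex as the body.
def apply_rate(amount: Decimal, rate: Decimal) -> Decimal:
    return amount * (Decimal(1) + rate)


@dataclass(frozen=True)
class Line:
    unit_price: Decimal
    quantity: int
    taxable: bool = True


_TAX_RATE = Decimal("0.20")  # hypothetical rate, for illustration only


# Deep module: one small entry point. Rate lookup, per-line rules, and
# rounding sit behind it (locality) and every caller gets them (leverage).
def order_total(lines: list[Line]) -> Decimal:
    total = Decimal(0)
    for line in lines:
        subtotal = line.unit_price * line.quantity
        if line.taxable:
            subtotal = apply_rate(subtotal, _TAX_RATE)
        total += subtotal
    # Round once, at the end, so callers never have to think about it.
    return total.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)
```

A caller of `order_total` needs to know only "lines in, total out"; a caller of `apply_rate` has to carry the rate and rounding knowledge itself.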

Key principles (see LANGUAGE.md for the full list):

  • Deletion test: imagine deleting the module. If complexity vanishes, it was a pass-through. If complexity reappears across N callers, it was earning its keep.
  • The interface is the test surface.
  • One adapter = hypothetical seam. Two adapters = real seam.
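
The last two principles can be sketched with a small seam and two adapters. Everything here (`Clock`, `SystemClock`, `FixedClock`, `is_expired`) is a hypothetical example, not code from the skill. The second adapter is a fixed clock used in tests; the source does not say whether a test-only adapter is enough to call the seam real, so treat that as an assumption.

```python
import time
from typing import Protocol


class Clock(Protocol):
    """Interface at the seam: callers only need `now()` in epoch seconds."""

    def now(self) -> float: ...


class SystemClock:
    """Adapter 1: reads the real wall clock."""

    def now(self) -> float:
        return time.time()


class FixedClock:
    """Adapter 2: returns a preset instant, which makes tests deterministic."""

    def __init__(self, instant: float) -> None:
        self._instant = instant

    def now(self) -> float:
        return self._instant


def is_expired(issued_at: float, ttl_seconds: float, clock: Clock) -> bool:
    # Behaviour is altered by swapping the adapter, not by editing this body.
    return clock.now() - issued_at >= ttl_seconds
```

Tests exercise `is_expired` through the same interface production code uses, which is what "the interface is the test surface" asks for.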

This skill is informed by the project's domain model. The domain language gives names to good seams; ADRs record decisions the skill should not re-litigate.

Process

1. Explore

Read the project's domain glossary and any ADRs in the area you're touching first.

Then use the Agent tool with subagent_type=Explore to walk the codebase. Don't follow rigid heuristics — explore organically and note where you experience friction:

  • Where does understanding one concept require bouncing between many small modules?
  • Where are modules shallow — interface nearly as complex as the implementation?
  • Where have pure functions been extracted just for testability, but the real bugs hide in how they're called (no locality)?
  • Where do tightly-coupled modules leak across their seams?
  • Which parts of the codebase are untested, or hard to test through their current interface?

Apply the deletion test to anything you suspect is shallow: would deleting it concentrate complexity (it was a pass-through), or just move that complexity into its callers (it was earning its keep)? A "yes, concentrates" is the signal you want.

2. Present candidates as an HTML report

Write a self-contained HTML file to the OS temp directory so nothing lands in the repo. Resolve the temp dir from $TMPDIR, falling back to /tmp (or %TEMP% on Windows), and write to <tmpdir>/architecture-review-<timestamp>.html so each run gets a fresh file. Open it for the user — xdg-open <path> on Linux, open <path> on macOS, start <path> on Windows — and tell them the absolute path.
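
The path resolution and open step could look roughly like the sketch below. It is one possible approach, not the skill's own implementation. The function names and the timestamp format are assumptions; the source only asks for `<tmpdir>/architecture-review-<timestamp>.html`. Python's `tempfile.gettempdir()` already checks `TMPDIR`, `TEMP`, and `TMP` before falling back to platform defaults such as `/tmp`.

```python
import sys
import tempfile
import time
from pathlib import Path
from typing import Optional


def report_path(now: Optional[float] = None) -> Path:
    """Build <tmpdir>/architecture-review-<timestamp>.html."""
    # Second-level resolution is an assumption; two runs in the same second
    # would collide, so add more precision if that matters.
    stamp = time.strftime("%Y%m%d-%H%M%S", time.localtime(now))
    return Path(tempfile.gettempdir()) / f"architecture-review-{stamp}.html"


def open_command(path: Path, platform: str = sys.platform) -> list[str]:
    """Return the argv that opens `path` in the default viewer."""
    if platform.startswith("win"):
        # `start` is a cmd.exe builtin; the empty string is its title argument.
        return ["cmd", "/c", "start", "", str(path)]
    if platform == "darwin":
        return ["open", str(path)]
    return ["xdg-open", str(path)]


def write_report(html: str) -> Path:
    """Write the report outside the repo and return its absolute path."""
    path = report_path()
    path.write_text(html, encoding="utf-8")
    return path.resolve()
```

Passing the result of `open_command(path)` to `subprocess.run` opens the file; printing `path` afterwards covers the "tell them the absolute path" step.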

The report uses Tailwind via CDN for layout and styling, and Mermaid via CDN for diagrams where a graph/flow/sequence reliably communicates the structure. Mix Mermaid with hand-crafted CSS/SVG visuals — use Mermaid when relationships are graph-shaped (call graphs, dependencies, sequences), and hand-built divs/SVG when you want something more editorial (mass diagrams, cross-sections, collapse animations). Each candidate gets a before/after visualisation. Be visual.
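
As a rough illustration of the page shell, the sketch below loads Tailwind's Play CDN script and Mermaid's ESM build from jsDelivr. It is not the scaffold from HTML-REPORT.md; the helper names, the pinned Mermaid major version, and the body classes are assumptions.

```python
from html import escape


def page_shell(title: str, body_html: str) -> str:
    """Wrap pre-rendered body HTML in a self-contained page."""
    return f"""<!doctype html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>{escape(title)}</title>
  <script src="https://cdn.tailwindcss.com"></script>
  <script type="module">
    import mermaid from "https://cdn.jsdelivr.net/npm/mermaid@10/dist/mermaid.esm.min.mjs";
    mermaid.initialize({{ startOnLoad: true }});
  </script>
</head>
<body class="bg-slate-50 text-slate-900 p-8">
{body_html}
</body>
</html>"""


def mermaid_block(source: str) -> str:
    """Emit a diagram container that Mermaid renders on page load."""
    # Escaped so `<` and `>` in the diagram source cannot break the HTML.
    return f'<pre class="mermaid">{escape(source)}</pre>'
```

Hand-built div/SVG visuals can be passed in as part of `body_html` alongside any `mermaid_block` output.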

Render each candidate as a card with these fields:

  • Files — which files/modules are involved
  • Problem — why the current architecture is causing friction
  • Solution — plain English description of what would change
  • Benefits — explained in terms of locality and leverage, and how tests would improve
  • Before / After diagram — side-by-side, custom-drawn, illustrating the shallowness and the deepening
  • Recommendation strength — one of Strong, Worth exploring, Speculative, rendered as a badge
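
One way to turn the fields above into a card is sketched here. The `Candidate` dataclass, its `title` field, and the Tailwind badge colours are assumptions made for illustration; HTML-REPORT.md remains the reference for the real layout.

```python
from dataclasses import dataclass
from html import escape

# Hypothetical badge colours, one per allowed recommendation strength.
BADGE_CLASSES = {
    "Strong": "bg-emerald-100 text-emerald-800",
    "Worth exploring": "bg-amber-100 text-amber-800",
    "Speculative": "bg-slate-200 text-slate-700",
}


@dataclass
class Candidate:
    title: str  # assumed heading for the card, not part of the field list
    files: list[str]
    problem: str
    solution: str
    benefits: str
    before_html: str  # pre-rendered diagram markup, inserted as-is
    after_html: str
    strength: str


def render_card(c: Candidate) -> str:
    if c.strength not in BADGE_CLASSES:
        raise ValueError(f"unknown recommendation strength: {c.strength!r}")
    files = "".join(f"<li><code>{escape(path)}</code></li>" for path in c.files)
    return f"""<section class="rounded-xl border bg-white p-6 shadow-sm">
  <header class="flex items-center justify-between">
    <h2 class="text-xl font-semibold">{escape(c.title)}</h2>
    <span class="rounded-full px-3 py-1 text-sm {BADGE_CLASSES[c.strength]}">{escape(c.strength)}</span>
  </header>
  <h3 class="mt-4 font-medium">Files</h3>
  <ul class="list-disc pl-6">{files}</ul>
  <h3 class="mt-4 font-medium">Problem</h3>
  <p>{escape(c.problem)}</p>
  <h3 class="mt-4 font-medium">Solution</h3>
  <p>{escape(c.solution)}</p>
  <h3 class="mt-4 font-medium">Benefits</h3>
  <p>{escape(c.benefits)}</p>
  <div class="mt-4 grid grid-cols-2 gap-4">
    <div>{c.before_html}</div>
    <div>{c.after_html}</div>
  </div>
</section>"""
```

Rejecting unknown strengths keeps the badge limited to the three allowed values.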

End the report with a Top recommendation section: which candidate you'd tackle first and why.

Use CONTEXT.md vocabulary for the domain, and LANGUAGE.md vocabulary for the architecture. If CONTEXT.md defines "Order," talk about "the Order intake module" — not "the FooBarHandler," and not "the Order service."

ADR conflicts: if a candidate contradicts an existing ADR, only surface it when the friction is real enough to warrant revisiting the ADR. Mark it clearly in the card (e.g. a warning callout: "contradicts ADR-0007 — but worth reopening because…"). Don't list every theoretical refactor an ADR forbids.

See HTML-REPORT.md for the full HTML scaffold, diagram patterns, and styling guidance.

Do NOT propose interfaces yet. After the file is written, ask the user: "Which of these would you like to explore?"

3. Grilling loop

Once the user picks a candidate, drop into a grilling conversation. Walk the design tree with them — constraints, dependencies, the shape of the deepened module, what sits behind the seam, what tests survive.

Side effects happen inline as decisions crystallize:

  • Naming a deepened module after a concept not in CONTEXT.md? Add the term to CONTEXT.md — same discipline as /grill-with-docs (see CONTEXT-FORMAT.md). Create the file lazily if it doesn't exist.
  • Sharpening a fuzzy term during the conversation? Update CONTEXT.md right there.
  • User rejects the candidate with a load-bearing reason? Offer an ADR, framed as: "Want me to record this as an ADR so future architecture reviews don't re-suggest it?" Only offer when the reason would actually be needed by a future explorer to avoid re-suggesting the same thing — skip ephemeral reasons ("not worth it right now") and self-evident ones. See ADR-FORMAT.md.
  • Want to explore alternative interfaces for the deepened module? See INTERFACE-DESIGN.md.
