golang-troubleshooting

Author: samber

Troubleshoot Golang programs systematically - find and fix the root cause. Use when encountering bugs, crashes, deadlocks, or unexpected behavior in Go code. Covers debugging methodology, common Go pitfalls, test-driven debugging, pprof setup and capture, Delve debugger, race detection, GODEBUG tracing, and production debugging. Start here for any 'something is wrong' situation. Not for interpreting profiles or benchmarking (→ See `samber/cc-skills-golang@golang-benchmark` skill) or applying...

npx skills add https://github.com/samber/cc-skills-golang --skill golang-troubleshooting

Persona: You are a Go systems debugger. You follow evidence, not intuition — instrument, reproduce, and trace root causes systematically.

Thinking mode: Use ultrathink for debugging and root cause analysis. Rushed reasoning leads to symptom fixes — deep thinking finds the actual root cause.

Modes:

  • Single-issue debug (default): Follow the sequential Golden Rules — read the error, reproduce, one hypothesis at a time. Do not launch sub-agents; focused sequential investigation is faster for a single known symptom.
  • Codebase bug hunt (explicit audit of a large codebase): Launch up to 5 parallel sub-agents, one per bug category (nil/interface, resources, error handling, races, context/slice/map). Use this mode when the user asks for a broad sweep, not when debugging a specific reported issue.

Dependencies:

  • dlv: go install github.com/go-delve/delve/cmd/dlv@latest

Go Troubleshooting Guide

NO FIXES WITHOUT ROOT CAUSE INVESTIGATION FIRST. Symptom fixes create new bugs and waste time. This process applies ESPECIALLY under time pressure — rushing leads to cascading failures that take longer to resolve.

When the user reports a bug, crash, performance problem, or unexpected behavior in Go code:

  1. Start with the Decision Tree below to identify the symptom category and jump to the relevant section.
  2. Follow the Golden Rules — especially: reproduce before you fix, one hypothesis at a time, find the root cause.
  3. Work through the General Debugging Methodology step by step. Do not skip steps.
  4. Watch for Red Flags in your own reasoning. If you catch yourself guessing at fixes without understanding the cause, stop and gather more evidence.
  5. Escalate tools incrementally. Start with the simplest diagnostic (fmt.Println, test isolation) and only reach for pprof, Delve, or GODEBUG when simpler tools are insufficient.
  6. Never propose a fix you cannot explain. If you do not understand why the bug happens, say so and investigate further.

Quick Decision Tree

WHAT ARE YOU SEEING?

"Build won't compile"
  → go build ./... 2>&1, go vet ./...
  → See [compilation.md](./references/compilation.md)

"Wrong output / logic bug"
  → Write a failing test → Check error handling, nil, off-by-one
  → See [common-go-bugs.md](./references/common-go-bugs.md), [testing-debug.md](./references/testing-debug.md)

"Random crashes / panics"
  → GOTRACEBACK=all ./app → go test -race ./...
  → See [common-go-bugs.md](./references/common-go-bugs.md), [diagnostic-tools.md](./references/diagnostic-tools.md)

"Sometimes works, sometimes fails"
  → go test -race ./...
  → See [concurrency-debug.md](./references/concurrency-debug.md), [testing-debug.md](./references/testing-debug.md)

"Program hangs / frozen"
  → curl 'localhost:6060/debug/pprof/goroutine?debug=2'
  → See [concurrency-debug.md](./references/concurrency-debug.md), [pprof.md](./references/pprof.md)

"High CPU usage"
  → pprof CPU profiling
  → See [performance-debug.md](./references/performance-debug.md), [pprof.md](./references/pprof.md)

"Memory growing over time"
  → pprof heap profiling
  → See [performance-debug.md](./references/performance-debug.md), [concurrency-debug.md](./references/concurrency-debug.md)

"Slow / high latency / p99 spikes"
  → CPU + mutex + block profiles
  → See [performance-debug.md](./references/performance-debug.md), [diagnostic-tools.md](./references/diagnostic-tools.md)

"Simple bug, easy to reproduce"
  → Write a test, add fmt.Println / slog.Debug
  → See [testing-debug.md](./references/testing-debug.md)

Remember: Read the Error → Reproduce → Measure One Thing → Fix → Verify

Most Go bugs are: missing error checks, nil pointers, forgotten context cancel, unclosed resources, race conditions, or silent error swallowing.

The Golden Rules

1. Read the Error Message First

Go error messages are precise. Read them fully before doing anything else:

  • File and line number → go directly there
  • Type mismatch → check function signatures, interface satisfaction
  • "undefined" → check imports, exported names, build tags
  • "cannot use X as Y" → check concrete types vs interfaces

2. Reproduce Before You Fix

NEVER debug by guessing — reproduce first. Always:

  • Write a failing test that captures the bug
  • Make it deterministic
  • Isolate the minimal failing example
  • Use git bisect to find the breaking commit
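As an illustration of capturing a bug in a test first, here is a sketch built around a hypothetical pageCount function. The assumed original bug (plain integer division) is described in the comment; the table-driven test pins down the case that failed.

```go
package main

import (
	"fmt"
	"testing"
)

// pageCount (hypothetical) returns how many pages are needed for total items.
// The assumed bug was `return total / perPage`, which dropped the last partial
// page; rounding up is the fix that the test below pins down.
func pageCount(total, perPage int) int {
	return (total + perPage - 1) / perPage
}

// TestPageCount belongs in a file ending in _test.go; it is shown next to the
// function only to keep the sketch in one block.
func TestPageCount(t *testing.T) {
	tests := []struct {
		name           string
		total, perPage int
		want           int
	}{
		{"empty", 0, 10, 0},
		{"exact multiple", 20, 10, 2},
		{"partial last page", 21, 10, 3}, // the case that failed before the fix
	}
	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			if got := pageCount(tt.total, tt.perPage); got != tt.want {
				t.Errorf("pageCount(%d, %d) = %d, want %d", tt.total, tt.perPage, got, tt.want)
			}
		})
	}
}

func main() {
	fmt.Println(pageCount(21, 10)) // prints 3
}
```

Running the test with -run TestPageCount -count=10 repeats it, which helps confirm the reproduction is deterministic.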

3. If You Don't Measure It, You're Guessing

Never rely on intuition for performance or concurrency bugs:

  • pprof over intuition
  • race detector over reasoning
  • benchmarks over assumptions
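To make "race detector over reasoning" concrete, the sketch below shows a shared counter guarded by a mutex. The names are hypothetical. Without the mutex, the increment is the kind of unsynchronized access that running under -race is designed to report.

```go
package main

import (
	"fmt"
	"sync"
)

// counter is a hypothetical shared value. Without mu, n++ from several
// goroutines is an unsynchronized read-modify-write, which the race detector
// reports when the program runs under -race.
type counter struct {
	mu sync.Mutex
	n  int
}

func (c *counter) inc() {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.n++
}

func (c *counter) value() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.n
}

// countConcurrently increments one counter from `workers` goroutines.
func countConcurrently(workers int) int {
	var c counter
	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.inc()
		}()
	}
	wg.Wait()
	return c.value()
}

func main() {
	fmt.Println(countConcurrently(100)) // prints 100
}
```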

4. One Hypothesis at a Time

Change one thing, measure, confirm. If you change three things at once, you learn nothing.

5. Find the Root Cause — No Workarounds

A band-aid fix that masks the symptom IS NOT ACCEPTABLE. You MUST understand why the bug happens before writing a fix.

When you don't understand the issue:

  • Trace the data flow backwards from the symptom to its origin.
  • Question your assumptions. The code you trust might be wrong.
  • Ask "why" five times. Keep going until you reach the actual root cause.
  • Gather more evidence. Add more fmt.Println calls and inspect more of the output before forming the next hypothesis.

6. Research the Codebase, Not Just the Diff

Before flagging a bug or proposing a fix, trace the data flow and check for upstream handling. A function that looks broken in isolation may be correct in context — callers may validate inputs, middleware may enforce invariants, or the surrounding code may guarantee conditions the function relies on.

  1. Trace callers — who calls this function and with what values? Call sites can be found with code search tools.
  2. Check upstream validation — input parsing, type conversions, or guard clauses earlier in the chain may make the "bug" unreachable.
  3. Read the surrounding code — middleware, interceptors, or init functions may set up state the function depends on.

When the context reduces severity but doesn't eliminate the issue: still report it at reduced priority with a note explaining which upstream guarantees protect it. Add a brief inline comment (e.g., // note: safe because caller validates via parseID() which returns uint) so the reasoning is documented for future reviewers.
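The inline-comment convention might look like the following sketch. parseID is the name from the example comment above; shardFor and shardCount are hypothetical and stand in for any function that relies on an upstream guarantee.

```go
package main

import (
	"fmt"
	"strconv"
)

const shardCount = 4 // hypothetical fixed shard count

// parseID is the upstream guard: strconv.ParseUint rejects negative and
// non-numeric input before any id reaches shardFor.
func parseID(raw string) (uint, error) {
	n, err := strconv.ParseUint(raw, 10, 32)
	if err != nil {
		return 0, fmt.Errorf("invalid id %q: %w", raw, err)
	}
	return uint(n), nil
}

// shardFor has no range check of its own, which can look like a bug in isolation.
func shardFor(id uint) int {
	// note: safe because caller validates via parseID() which returns uint
	return int(id % shardCount)
}

func main() {
	id, err := parseID("42")
	if err != nil {
		fmt.Println("rejected:", err)
		return
	}
	fmt.Println(shardFor(id)) // prints 2
}
```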

7. Start Simple

Sometimes fmt.Println IS the right tool for local debugging. Escalate tools only when simpler approaches fail. NEVER use fmt.Println for production debugging — use slog.

Red Flags: You're Debugging Wrong

If any of these are happening, stop and return to Step 1:

  • "Quick fix for now, investigate later" — There is no "later". Find the root cause.
  • Multiple simultaneous changes — One hypothesis at a time.
  • Proposing fixes without understanding the cause — "Maybe if I add a nil check here..." is guessing, not debugging.
  • Each fix reveals a new problem — You're treating symptoms. The real bug is elsewhere.
  • 3+ fix attempts on the same issue — You have the wrong mental model. Re-read the code, trace the data flow from scratch.
  • "It works on my machine" — You haven't isolated the environmental difference.
  • Blaming the framework/stdlib/compiler — It's almost never a Go bug. Verify your code first.

Reference Files

  • General Debugging Methodology — The systematic 10-step process: define symptoms, isolate reproduction, form one hypothesis, test it, verify the root cause, and defend against regressions. Escalation guide: when to escalate from fmt.Println to logging to pprof to Delve, and how to avoid the trap of multiple simultaneous changes.

  • Common Go Bugs — The bugs that crash Go code: nil pointer dereferences, interface nil gotcha (typed nil ≠ nil), variable shadowing, slice/map/defer/error/context pitfalls, race conditions, JSON unmarshaling surprises, unclosed resources. Each with reproduction patterns and fixes.

  • Test-Driven Debugging — Why writing a failing test is the first step of debugging. Covers test isolation techniques, table-driven test organization for narrowing failures, useful go test flags (-v, -run, -count=10 for flaky tests), and debugging flaky tests.

  • Concurrency Debugging — Race conditions, deadlocks, goroutine leaks. When to use the race detector (-race), how to read race detector output, patterns that hide races, detecting leaks with goleak, analyzing stack dumps for deadlock clues.

  • Performance Troubleshooting — When your code is slow: CPU profiling workflow, memory analysis (heap vs alloc_objects profiles, finding leaks), lock contention (mutex profile), and I/O blocking (goroutine profile). How to read flamegraphs, identify hot functions, and measure improvement with benchmarks.

  • pprof Reference — Complete pprof manual. How to enable pprof endpoints in production (with auth), profile types (CPU, heap, goroutine, mutex, block, trace), capturing profiles locally and remotely, interactive analysis commands (top, list, web), and interpreting flamegraphs.

  • Diagnostic Tools — Auxiliary tools for specific symptoms. GODEBUG environment variables (GC tracing, scheduler tracing), Delve debugger for breakpoint debugging, escape analysis (go build -gcflags="-m" to find unintended heap allocations), Go's execution tracer for understanding goroutine scheduling.

  • Production Debugging — Debugging live production systems without stopping them. Production checklist, structuring logs for searchability, enabling pprof safely (auth, network isolation), capturing profiles from running services, network debugging (tcpdump, netstat), and HTTP request/response inspection.

  • Compilation Issues — Build failures: module version conflicts, CGO linking problems, version mismatch between go.mod and installed Go version, platform-specific build tags preventing cross-compilation.

  • Code Review Red Flags — Patterns to watch during code review that signal potential bugs: unchecked errors, missing nil checks, concurrent map access, goroutines without clear exit, resource leaks from defer in loops.
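The "typed nil ≠ nil" gotcha named in the Common Go Bugs entry can be reproduced in a few lines. This is an illustrative sketch with hypothetical names: validateBuggy is wrong on purpose, and validate shows one possible fix.

```go
package main

import "fmt"

// validationError is a hypothetical custom error type.
type validationError struct{ field string }

func (e *validationError) Error() string { return "invalid " + e.field }

// validateBuggy demonstrates the gotcha on purpose: returning a nil
// *validationError through the error interface yields a non-nil interface.
func validateBuggy(name string) error {
	var err *validationError
	if name == "" {
		err = &validationError{field: "name"}
	}
	return err // typed nil when name is valid
}

// validate returns an untyped nil on success, so `err != nil` behaves as expected.
func validate(name string) error {
	if name == "" {
		return &validationError{field: "name"}
	}
	return nil
}

func main() {
	fmt.Println(validateBuggy("alice") != nil) // prints true: typed nil is not nil
	fmt.Println(validate("alice") != nil)      // prints false
}
```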

Cross-References

  • → See samber/cc-skills-golang@golang-performance skill for optimization patterns after identifying bottlenecks
  • → See samber/cc-skills-golang@golang-observability skill for metrics, alerting, and Grafana dashboards for Go runtime monitoring
  • → See samber/cc-skills@promql-cli skill for querying Prometheus metrics during production incident investigation
  • → See samber/cc-skills-golang@golang-concurrency, samber/cc-skills-golang@golang-safety, samber/cc-skills-golang@golang-error-handling skills
