golang-performance

Author: samber

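Golang performance optimization patterns and methodology: if X bottleneck, then apply Y. Covers allocation reduction, CPU efficiency, memory layout, GC tuning, pooling, caching, and hot-path optimization. Use when profiling or benchmarks have identified a bottleneck and you need the right optimization pattern to fix it. Also use when performing performance code reviews, to suggest improvements or benchmarks that help identify quick performance gains. Not for measurement methodology (→...)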

npx skills add https://github.com/samber/cc-skills-golang --skill golang-performance

Persona: You are a Go performance engineer. You never optimize without profiling first — measure, hypothesize, change one thing, re-measure.

Thinking mode: Use ultrathink for performance optimization. Shallow analysis misidentifies bottlenecks — deep reasoning ensures the right optimization is applied to the right problem.

Modes:

  • Review mode (architecture) — broad scan of a package or service for structural anti-patterns (missing connection pools, unbounded goroutines, wrong data structures). Use up to 3 parallel sub-agents split by concern: (1) allocation and memory layout, (2) I/O and concurrency, (3) algorithmic complexity and caching.
  • Review mode (hot path) — focused analysis of a single function or tight loop identified by the caller. Work sequentially; one sub-agent is sufficient.
  • Optimize mode — a bottleneck has been identified by profiling. Follow the iterative cycle (define metric → baseline → diagnose → improve → compare) sequentially — one change at a time is the discipline.

Dependencies:

  • benchstat: go install golang.org/x/perf/cmd/benchstat@latest

Go Performance Optimization

Core Philosophy

  1. Profile before optimizing — intuition about bottlenecks is wrong ~80% of the time. Use pprof to find actual hot spots (→ See samber/cc-skills-golang@golang-troubleshooting skill)
  2. Allocation reduction yields the biggest ROI — Go's GC is fast but not free. Reducing allocations per request often matters more than micro-optimizing CPU
  3. Document optimizations — add code comments explaining why a pattern is faster, with benchmark numbers when available. Future readers need context to avoid reverting an "unnecessary" optimization
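Principles 2 and 3 can be illustrated together. The following is a minimal sketch, not a prescribed pattern: squaresGrow, squaresPrealloc, and allocs are hypothetical names introduced for illustration. It counts heap allocations with testing.AllocsPerRun and shows the kind of explanatory comment an optimization might carry.

```go
package main

import (
	"fmt"
	"testing"
)

// sink is package-level so results escape to the heap and the compiler
// cannot optimize the measured allocations away.
var sink []int

// squaresGrow appends into a nil slice, so the backing array is
// reallocated and copied several times as it grows.
func squaresGrow(n int) []int {
	var out []int
	for i := 0; i < n; i++ {
		out = append(out, i*i)
	}
	return out
}

// squaresPrealloc sizes the backing array once up front.
//
// perf: preallocating avoids the repeated grow-and-copy cycles of append.
// Paste your own benchstat output here once you have it, so that nobody
// "simplifies" this back to squaresGrow.
func squaresPrealloc(n int) []int {
	out := make([]int, 0, n)
	for i := 0; i < n; i++ {
		out = append(out, i*i)
	}
	return out
}

// allocs reports the average number of heap allocations per call of f(n).
func allocs(f func(int) []int, n int) float64 {
	return testing.AllocsPerRun(100, func() { sink = f(n) })
}

func main() {
	const n = 1000
	fmt.Printf("grow:     %.0f allocs/run\n", allocs(squaresGrow, n))
	fmt.Printf("prealloc: %.0f allocs/run\n", allocs(squaresPrealloc, n))
}
```

Allocation counts are only a proxy; whether the change matters for a given service still has to be confirmed with a profile and a benchmark comparison.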

Rule Out External Bottlenecks First

Before optimizing Go code, verify the bottleneck is in your process — if 90% of latency is a slow DB query or API call, reducing allocations won't help.

Diagnose:

  1. fgprof: captures on-CPU and off-CPU (I/O wait) time; if off-CPU dominates, the bottleneck is external
  2. go tool pprof (goroutine profile): many goroutines blocked in net.(*conn).Read or database/sql = external wait
  3. Distributed tracing (OpenTelemetry): span breakdown shows which upstream is slow
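For the goroutine-profile check, one possible sketch using only the standard library's runtime/pprof is shown below. The goroutineDump helper and the three parked goroutines are hypothetical stand-ins for a real service waiting on a slow dependency; fgprof is a third-party package and is not shown here.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime/pprof"
	"time"
)

// goroutineDump returns the goroutine profile as text. With debug=1,
// goroutines that share the same stack are grouped and counted, which
// makes "many goroutines parked in the same read" easy to spot.
func goroutineDump() (string, error) {
	var buf bytes.Buffer
	if err := pprof.Lookup("goroutine").WriteTo(&buf, 1); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	// Hypothetical stand-in for goroutines blocked on an external call.
	release := make(chan struct{})
	for i := 0; i < 3; i++ {
		go func() { <-release }()
	}
	time.Sleep(50 * time.Millisecond) // give them time to park

	out, err := goroutineDump()
	if err != nil {
		fmt.Println("goroutine profile failed:", err)
		return
	}
	fmt.Print(out)
	close(release)
}
```

In a long-running service the same profile is usually fetched over HTTP via net/http/pprof rather than dumped from main.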

When external: optimize that component instead — query tuning, caching, connection pools, circuit breakers (→ See samber/cc-skills-golang@golang-database skill, Caching Patterns).

Iterative Optimization Methodology

The cycle: Define Goals → Benchmark → Diagnose → Improve → Benchmark

  1. Define your metric — latency, throughput, memory, or CPU? Without a target, optimizations are random
  2. Write an atomic benchmark — isolate one function per benchmark to avoid result contamination (→ See samber/cc-skills-golang@golang-benchmark skill)
  3. Measure baseline: go test -bench=BenchmarkMyFunc -benchmem -count=6 ./pkg/... | tee /tmp/report-1.txt
  4. Diagnose — use the Diagnose lines in each deep-dive section to pick the right tool
  5. Improve — apply ONE optimization at a time with an explanatory comment
  6. Compare: run benchstat /tmp/report-1.txt /tmp/report-2.txt to confirm statistical significance
  7. Commit — paste the benchstat output in the commit body so reviewers and future readers see the exact improvement; follow the perf(scope): summary commit type
  8. Repeat — increment report number, tackle next bottleneck

Refer to library documentation for known patterns before inventing custom solutions. Keep all /tmp/report-*.txt files as an audit trail.
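Step 2 of the cycle asks for an atomic benchmark. A minimal sketch of what that could look like follows; joinIDs is a hypothetical function under test, not something from this skill. A benchmark like this would normally live in a _test.go file and be run with the go test command from step 3; here main calls testing.Benchmark only so the example runs on its own.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
	"testing"
)

// result is package-level so the compiler cannot discard the call under test.
var result string

// joinIDs is a hypothetical function under test.
func joinIDs(ids []int) string {
	var sb strings.Builder
	for i, id := range ids {
		if i > 0 {
			sb.WriteByte(',')
		}
		sb.WriteString(strconv.Itoa(id))
	}
	return sb.String()
}

// BenchmarkJoinIDs measures joinIDs and nothing else: the input is built
// before the timer is reset, so setup cost does not contaminate the result.
func BenchmarkJoinIDs(b *testing.B) {
	ids := make([]int, 1000)
	for i := range ids {
		ids[i] = i
	}
	b.ReportAllocs()
	b.ResetTimer()
	for i := 0; i < b.N; i++ {
		result = joinIDs(ids)
	}
}

func main() {
	// testing.Init is needed when calling Benchmark outside "go test".
	testing.Init()
	r := testing.Benchmark(BenchmarkJoinIDs)
	fmt.Println(r.String(), r.MemString())
}
```

The loop uses b.N so the sketch also compiles on Go versions before 1.24; on newer toolchains b.Loop() is an alternative.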

Decision Tree: Where Is Time Spent?

Bottleneck | Signal (from pprof) | Action
Too many allocations | alloc_objects high in heap profile | Memory optimization
CPU-bound hot loop | function dominates CPU profile | CPU optimization
GC pauses / OOM | high GC%, container limits | Runtime tuning
Network / I/O latency | goroutines blocked on I/O | I/O & networking
Repeated expensive work | same computation/fetch multiple times | Caching patterns
Wrong algorithm | O(n²) where O(n) exists | Algorithmic complexity
Lock contention | mutex/block profile hot | → See samber/cc-skills-golang@golang-concurrency skill
Slow queries | DB time dominates traces | → See samber/cc-skills-golang@golang-database skill
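As an illustration of the "Wrong algorithm" row, here is a small sketch of the same duplicate check written in O(n²) and in O(n). Both function names are hypothetical and chosen only for this example.

```go
package main

import "fmt"

// hasDuplicateQuadratic compares every pair of elements: O(n²) time,
// no extra memory.
func hasDuplicateQuadratic(xs []string) bool {
	for i := 0; i < len(xs); i++ {
		for j := i + 1; j < len(xs); j++ {
			if xs[i] == xs[j] {
				return true
			}
		}
	}
	return false
}

// hasDuplicateLinear remembers what it has seen in a set: O(n) expected
// time, at the cost of O(n) extra memory for the map.
func hasDuplicateLinear(xs []string) bool {
	seen := make(map[string]struct{}, len(xs))
	for _, x := range xs {
		if _, ok := seen[x]; ok {
			return true
		}
		seen[x] = struct{}{}
	}
	return false
}

func main() {
	names := []string{"a", "b", "c", "b"}
	fmt.Println(hasDuplicateQuadratic(names), hasDuplicateLinear(names))
}
```

For very small inputs the quadratic version can still win because it allocates nothing, which is one more reason to benchmark with realistic sizes before switching.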

Common Mistakes

Mistake | Fix
Optimizing without profiling | Profile with pprof first; intuition is wrong ~80% of the time
Default http.Client without Transport | MaxIdleConnsPerHost defaults to 2; set to match your concurrency level
Logging in hot loops | Log calls prevent inlining and allocate even when the level is disabled. Use slog.LogAttrs
panic/recover as control flow | panic allocates a stack trace and unwinds the stack; use error returns
unsafe without benchmark proof | Only justified when profiling shows >10% improvement in a verified hot path
No GC tuning in containers | Set GOMEMLIMIT to 80-90% of container memory to prevent OOM kills
reflect.DeepEqual in production | 50-200x slower than typed comparison; use slices.Equal, maps.Equal, bytes.Equal
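The http.Client fix above might look like the following sketch. newTransport, newHTTPClient, and the maxConcurrency parameter are hypothetical names, and the 10-second timeout is an arbitrary illustrative value rather than a recommendation from this skill.

```go
package main

import (
	"fmt"
	"net/http"
	"time"
)

// newTransport clones the default transport and raises the idle-connection
// limits. maxConcurrency is a hypothetical parameter: the number of
// concurrent requests you expect to send to a single host.
func newTransport(maxConcurrency int) *http.Transport {
	t := http.DefaultTransport.(*http.Transport).Clone()
	// The per-host default (http.DefaultMaxIdleConnsPerHost) is 2, so
	// connections above that are closed instead of being reused.
	t.MaxIdleConnsPerHost = maxConcurrency
	// Keep the global idle limit at least as high as the per-host one.
	if t.MaxIdleConns != 0 && t.MaxIdleConns < maxConcurrency {
		t.MaxIdleConns = maxConcurrency
	}
	return t
}

// newHTTPClient builds a client meant to be created once and shared.
func newHTTPClient(maxConcurrency int) *http.Client {
	return &http.Client{
		Transport: newTransport(maxConcurrency),
		Timeout:   10 * time.Second, // illustrative value
	}
}

func main() {
	client := newHTTPClient(64)
	t := client.Transport.(*http.Transport)
	fmt.Println("idle conns per host:", t.MaxIdleConnsPerHost)
}
```

The client should be reused across requests; building a new one per call would defeat the connection reuse that the setting is meant to enable.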

Deep Dives

  • Memory Optimization — allocation patterns, backing array leaks, sync.Pool, struct alignment
  • CPU Optimization — inlining, cache locality, false sharing, ILP, reflection avoidance
  • I/O & Networking — HTTP transport config, streaming, JSON performance, cgo, batch operations
  • Runtime Tuning — GOGC, GOMEMLIMIT, GC diagnostics, GOMAXPROCS, PGO
  • Caching Patterns — algorithmic complexity, compiled patterns, singleflight, work avoidance
  • Production Observability — Prometheus metrics, PromQL queries, continuous profiling, alerting rules
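For the GOMEMLIMIT item under Runtime Tuning, one way to apply the 80-90% guidance from Common Mistakes in code is sketched below. It assumes the container limit is already known; memoryLimitFor and the 1 GiB figure are hypothetical, and reading the real limit (for example from cgroup files) is deliberately left out. Setting the GOMEMLIMIT environment variable achieves the same effect without code.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// memoryLimitFor returns the given percentage of the container memory
// limit, using integer math to avoid floating-point rounding.
func memoryLimitFor(containerBytes int64, percent int64) int64 {
	return containerBytes / 100 * percent
}

func main() {
	// Hypothetical container limit of 1 GiB; a real service would read
	// this from its environment instead of hard-coding it.
	const containerBytes int64 = 1 << 30

	limit := memoryLimitFor(containerBytes, 90)
	// SetMemoryLimit sets a soft limit for the Go runtime (Go 1.19+) and
	// returns the previous one. The GC works harder as the heap nears it.
	prev := debug.SetMemoryLimit(limit)
	fmt.Printf("memory limit: %d bytes (previous: %d)\n", limit, prev)
}
```

The limit is soft: it influences GC pacing but does not cap memory that the Go runtime does not manage, such as cgo allocations.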

CI Regression Detection

Automate benchmark comparison in CI to catch regressions before they reach production. → See samber/cc-skills-golang@golang-benchmark skill for benchdiff and cob setup.

Cross-References

  • → See samber/cc-skills-golang@golang-benchmark skill for benchmarking methodology, benchstat, and b.Loop() (Go 1.24+)
  • → See samber/cc-skills-golang@golang-troubleshooting skill for pprof workflow, escape analysis diagnostics, and performance debugging
  • → See samber/cc-skills-golang@golang-data-structures skill for slice/map preallocation and strings.Builder
  • → See samber/cc-skills-golang@golang-concurrency skill for worker pools, sync.Pool API, goroutine lifecycle, and lock contention
  • → See samber/cc-skills-golang@golang-safety skill for defer in loops, slice backing array aliasing
  • → See samber/cc-skills-golang@golang-database skill for connection pool tuning and batch processing
  • → See samber/cc-skills-golang@golang-observability skill for continuous profiling in production
