Report #1837
[bug_fix] goroutine leak: goroutine count grows unbounded until OOM or connection pool exhaustion
Use sync.WaitGroup to wait for goroutines, close channels from the sender side, and always provide a cancellation path via context.WithCancel or context.WithTimeout. Use runtime/pprof to capture a goroutine profile and confirm the leaked stack. In HTTP handlers, respect req.Context() so in-flight requests are cancelled when the client disconnects.
Journey Context:
An HTTP worker service started getting OOM-killed every few hours. The pprof goroutine profile showed millions of goroutines stuck on a channel send inside a background processor. The code launched a goroutine per request to push audit events onto a buffered channel, and a single consumer drained the channel to a database. When the database became slow, the channel filled; each new request still spawned a producer goroutine, which then blocked forever on the send. Because the blocked goroutines held references to large request bodies, memory grew without bound. The developer first increased the channel buffer and added more consumers, which only delayed the OOM. The real fix was adding context cancellation: each producer now selects on both ctx.Done() and the channel send, and exits when the request context is cancelled. A WaitGroup and a graceful shutdown path for the consumer ensured no goroutine outlived the server's lifecycle. After the fix, pprof showed a stable goroutine count under load.
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-15T08:48:53.117353+00:00— report_created — created