Somewhere around my fifth or sixth year of writing Go I realized I had no mental model for where my variables lived. I’d vaguely heard “small things on the stack, large things on the heap, but Go decides,” and I’d left it at that. Actually reading Go’s escape analysis output taught me more about generated code than almost anything else.

In Go, every variable lives either on the goroutine’s stack or on the heap. The stack is cheap — allocation is just bumping a pointer, deallocation is free when the function returns. The heap is expensive — allocations go through the allocator, and the garbage collector has to trace them. A function with all-stack allocations can be basically zero-cost. A function that allocates a small thing on the heap in a tight loop can dominate your CPU profile.

The rule Go’s compiler uses is “escape analysis.” It asks: does this variable need to outlive the function that created it? If yes, it goes on the heap. If no, it stays on the stack. To see what the compiler decided, use -gcflags=-m:

$ go build -gcflags=-m ./...
./main.go:10:6: can inline f
./main.go:14:9: &x escapes to heap
./main.go:14:9: moved to heap: x

Two messages to look for:

  • escapes to heap — the value flows somewhere that outlives the function that created it
  • moved to heap: x — a named local had to be relocated because its address escapes

Common reasons things escape:

Returning a pointer to a local.

func makeThing() *Thing {
    t := Thing{} // escapes to heap
    return &t
}

The caller keeps the pointer; the local must outlive the function; heap it goes.

Storing a pointer in an interface.

func log(a any) {
    fmt.Println(a)
}

log(42) // the int 42 escapes, boxed into an any

The any parameter is an interface, and storing the int into it typically allocates. (The runtime keeps a preallocated table for small integers, 0 through 255, so those particular conversions skip the allocation, but you can't count on that in general.)

Slices, maps, channels. If you make([]int, n) with a compile-time-unknown n, it goes to heap. If n is a known constant and the slice doesn’t escape, it can stay on stack. make(chan int) is almost always heap.
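A minimal sketch of the constant-size vs. runtime-size distinction (the function names are mine). Build it with go build -gcflags=-m and compare what the compiler reports for the two make calls; the exact messages vary by Go version:

```go
package main

import "fmt"

const bufSize = 64

// sumFixed uses a constant-size slice that never leaves the function,
// so the backing array is eligible for stack allocation.
func sumFixed() int {
	buf := make([]int, bufSize)
	total := 0
	for i := range buf {
		buf[i] = i
		total += buf[i]
	}
	return total
}

// sumDynamic's size is only known at run time, so the compiler
// reports this make as escaping to the heap.
func sumDynamic(n int) int {
	buf := make([]int, n)
	total := 0
	for i := range buf {
		buf[i] = i
		total += buf[i]
	}
	return total
}

func main() {
	fmt.Println(sumFixed(), sumDynamic(64))
}
```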

Closures that outlive their creator.

func delayed() func() {
    x := 42
    return func() { fmt.Println(x) } // x escapes, captured by returned closure
}

The closure captures x, and the closure outlives delayed, so x goes to heap.

Taking the address of a local struct and passing it to another function that might retain it.

type Counter struct { n int }
func (c *Counter) Inc() { c.n++ }

func bar(c *Counter) {
    c.Inc()
}

func foo() {
    c := Counter{}
    bar(&c)   // may or may not escape, depending on inlining
}

If bar is inlined, c can stay on stack. If not, the outcome depends on what escape analysis recorded about bar: the compiler summarizes each function's pointer parameters (the "leaking param" annotations you'll see in -m output), and those summaries do survive across package boundaries. A pointer parameter that is only read doesn't force an escape; one that might be stored somewhere that outlives the call does. And when the compiler can't see the callee at all, as with an indirect call through a function value or an interface method, it assumes the worst and c escapes.
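To watch the summary effect directly, compare a callee that only reads through the pointer with one that stores it, both marked //go:noinline so inlining can't rescue the caller. A sketch (sink, readOnly, and retains are names I made up):

```go
package main

import "fmt"

type Counter struct{ n int }

var sink *Counter // package-level, so anything stored here escapes

// readOnly only dereferences c. Its escape summary records that the
// parameter does not leak, so a caller's Counter can stay on the
// stack even though the call is not inlined.
//
//go:noinline
func readOnly(c *Counter) { c.n++ }

// retains stores c where it outlives the call. The summary records a
// leaking parameter, so every caller's Counter is moved to the heap.
//
//go:noinline
func retains(c *Counter) { sink = c }

func main() {
	a := Counter{}
	readOnly(&a) // -gcflags=-m: a does not escape
	b := Counter{}
	retains(&b) // -gcflags=-m: moved to heap: b
	fmt.Println(a.n, sink.n)
}
```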

The fmt trap.

fmt.Sprintf("hello %d", x)

fmt.Sprintf is variadic over any. Each argument gets boxed into an interface, which for most types and values allocates, and the formatted result string is another allocation on top. A Sprintf call in a hot loop therefore costs at least one allocation per boxed argument. For simple integer formatting, strconv.AppendInt into a reused []byte avoids the boxing entirely.
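Here's what that alternative looks like (a sketch; formatHello is a made-up helper). AppendInt never converts the int to an interface, so the only allocation left is the buffer itself, which you pay for once:

```go
package main

import (
	"fmt"
	"strconv"
)

// formatHello builds "hello <n>" without fmt. AppendInt writes the
// digits directly into buf's backing array, so reusing buf across
// calls makes the formatting itself allocation-free.
func formatHello(buf []byte, n int) []byte {
	buf = buf[:0] // keep capacity, drop old contents
	buf = append(buf, "hello "...)
	return strconv.AppendInt(buf, int64(n), 10)
}

func main() {
	buf := make([]byte, 0, 32) // allocated once, reused
	fmt.Println(string(formatHello(buf, 42)))
}
```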

A concrete example from a service I was optimizing. A hot function looked like:

func (s *Service) processEvent(ev Event) Result {
    result := Result{}
    result.ID = ev.ID
    result.Processed = s.process(&ev)
    return result
}

-gcflags=-m said:

./handler.go:14:13: moved to heap: ev

The ev moves to heap because &ev is passed to s.process, and the compiler can’t prove s.process doesn’t retain it. In fact, it didn’t retain it — s.process only read fields. But the summary the compiler had for s.process was conservative: it treated the parameter as leaking, so every caller paid for a heap allocation.

The fix was to make s.process take ev by value:

func (s *Service) process(ev Event) bool { ... }

or move the logic inline. Either way, ev stopped escaping, and the hot loop dropped from ~150 allocations per call to ~20.

Things I’ve learned to keep in mind:

Small structs are cheap by value. Go compiles func f(e Event) reasonably: the struct is copied on the stack. For a 64-byte struct, this is a handful of instructions. For a 4KB struct, copying is expensive — pass a pointer but be aware of escape.

sync.Pool helps when escape is fundamental. Some allocations HAVE to be heap (because they escape across goroutines, for example). sync.Pool reduces the GC pressure from those. See my post on sync.Pool.
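A minimal sync.Pool sketch (the pool and render are hypothetical). The buffers still live on the heap; the point is that recycling them keeps the GC from having to collect a fresh one per call:

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufPool hands out reusable buffers. Get returns a pooled buffer or
// calls New; Put makes it available for the next caller.
var bufPool = sync.Pool{
	New: func() any { return new(bytes.Buffer) },
}

func render(name string) string {
	buf := bufPool.Get().(*bytes.Buffer)
	defer bufPool.Put(buf)
	buf.Reset() // pooled buffers keep old contents; always reset
	buf.WriteString("hello, ")
	buf.WriteString(name)
	return buf.String()
}

func main() {
	fmt.Println(render("world"))
}
```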

new(T) vs &T{}. They’re equivalent; both escape under the same rules.

Interfaces are an escape hazard. Every time you convert a concrete type to an interface (returning an error, calling Sprintf, using an any parameter), the concrete value typically escapes. See my post on interfaces and allocation.

Benchmarks with -benchmem show allocations. If your allocations/op is nonzero for a supposedly pure function, run with -gcflags=-m to find out why.

I don’t chase every allocation. For non-hot code, allocations are fine — GC handles them well. But for request handlers, serialization loops, decoders, and anything in a tight loop, knowing where the stack-vs-heap line falls is worth the hour of reading -gcflags=-m output.

The deeper insight is that Go’s runtime is pay-as-you-go in a way I didn’t fully appreciate. The GC is reliable and fast, but it’s not free. Stack-only code runs at roughly the speed of C with none of the memory management headaches. That’s a great sweet spot — if you stay on it.