Go's escape analysis and why my function return worked

▲ Go's escape analysis and why my function return worked(bonniesimon.in)

42 points by bonniesimon 8 days ago | 70 comments

▲ tdfirth 2 days ago | parent | next [-]

I don’t think this is confusing to the vast majority of people writing Go.

In my experience, the average programmer isn’t even aware of the stack vs heap distinction these days. If you learned to write code in something like Python then coming at Go from “above” this will just work the way you expect.

If you come at Go from “below” then yeah it’s a bit weird.

▲ onionisafruit 2 days ago | parent | next [-]

Go has been my primary language for a few years now, and I’ve had to do extra work to make sure I’m avoiding the heap maybe five times. Stack and heap aren’t on my mind most of the time when designing and writing Go, even though I have a pretty good understanding of how it works. The same applies to the garbage collector. It just doesn’t matter most of the time.

That said, when it matters it matters a lot. In those times I wish it was more visible in Go code, but I would want it to not get in the way the rest of the time. But I’m ok with the status quo of hunting down my notes on escape analysis every few months and taking a few minutes to get reacquainted.

Side note: I love how you used “from above” and “from below”. It makes me feel angelic as somebody who came from above; even if Java and Ruby hardly seemed like heaven.

▲ carb 2 days ago | parent | next [-]

Why have you had to avoid the heap? Performance concerns?

▲ malkia 2 days ago | parent | next [-]

For me, avoiding the heap, or rather avoiding GC, came up when I was working on a backend and web server in Java. There was a default rule for our code that if GC took more than 1% (I don't remember the exact value), the server got restarted. Coming (back then) from C/C++ gamedev, I was puzzled; then I understood the mantra: it's better for the process to die fast than to be pegged by GC and not answer the client. Then we started looking at what made it use GC so much.

I guess it might be similar in Go. In the past I've seen some projects use a "balloon" to circumvent Go's GC heuristic: if you inflate a dummy balloon that takes half of your memory, the GC might not kick in so much. Something like this. Then again, it's obviously a bad solution long term.

▲ ignoramous 2 days ago | parent | prev [-]

Garbage collection. The content of the stack is (always?) known at compile time; it can also be thrown away wholesale when the function is done, making allocations on the stack relatively cheaper. These FOSDEM talks by Bryan Boreham & Sümer Cip talk about it a bit:

- Optimising performance through reducing memory allocations (2018), https://archive.fosdem.org/2018/schedule/event/faster/
- Writing GC-Friendly [Go] code (2025), https://archive.fosdem.org/2025/schedule/event/fosdem-2025-5...

Speaking of GC, Go 1.26 will default to a newer one viz. Green Tea: https://go.dev/blog/greenteagc

▲ tdfirth a day ago | parent | prev [-]

Ha! I had not intended to imply that one is better than the other, but I am glad that it made you feel good :).

I also came "from above".

▲ bostik 2 days ago | parent | prev | next [-]

As someone who writes both Python and Go (and I've been using Python professionally since 2005), I remember that the scoping behaviour has changed.

Back in Python 2.1 days, there was no guarantee that a locally scoped variable would continue to exist past the end of the method. It was not guaranteed to vanish or go fully out of scope, but you could not rely on it being available afterwards. I remember this changing from 2.3 onwards (because we relied on the behaviour at work) - from that point onwards you could reliably "catch" and reuse a variable after the scope it was declared in had ended, and the runtime would ensure that the "second use" maintained the reference count correctly. GC did not get in the way or concurrently disappear the variable from underneath you anymore.

Then from 2008 onwards the same stability was extended to more complex data types. Again, I remember this from having work code give me headaches for yanking supposedly out-of-scope variable into thin air, and the only difference being a .1 version difference between the work laptop (where things worked as you'd expect) and the target SoC device (where they didn't).

▲ compsciphd 2 days ago | parent | prev [-]

I don't see how this is coming at go "from below".

even in C, the concept of returning a pointer to a stack allocated variable is explicitly considered undefined behavior (not illegal, explicitly undefined by the standard, and yes that means unsafe to use). It would be one thing if the standard disallowed it.

but that's only because the memory location pointed to by the pointer will be unknown (even perhaps immediately). the returning of the variable's value itself worked fine. In fact, one can return a stack allocated struct just fine.

TLDR: I don't see what the difference between returning a stack allocated struct in C and a stack allocated slice in Go is to a C programmer. (my guess is that the C programmer thinks that a stack allocated slice in Go is a pointer to a slice, when it isn't, it's a "struct" that wraps a pointer)
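To make the "struct that wraps a pointer" point concrete, here is a minimal Go sketch. The names makeLogs and shareDemo are made up for illustration and are not from the article. It shows that assigning or returning a slice copies only the header, while both copies keep pointing at the same backing array.

```go
package main

import "fmt"

// makeLogs is a hypothetical stand-in for the article's function: it
// returns a slice by value, which copies the (pointer, len, cap) header.
func makeLogs() []int {
	logs := []int{1, 2}
	return logs
}

// shareDemo returns a[0], len(a) and len(b) after mutating through a copy.
func shareDemo() (int, int, int) {
	a := makeLogs()
	b := a           // copies the header only; both point at the same array
	b[0] = 99        // visible through a, because the array is shared
	b = append(b, 3) // changes b's header only; a's header is untouched
	return a[0], len(a), len(b)
}

func main() {
	first, lenA, lenB := shareDemo()
	fmt.Println(first, lenA, lenB) // 99 2 3
}
```

The write through b shows up in a, but the append does not change len(a), which is what you would expect from a small struct copied by value.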

▲ simiones 2 days ago | parent [-]

The confusion begins the moment you think Go variables get allocated on the stack, in the C sense. They don't, semantically. Stack allocation is an optimization that the Go compiler can sometimes do for you, with no semantics associated with it.

The following Go code also works perfectly well, where it would obviously be UB in C:

  func foo() *int {
    i := 7
    return &i
  }

  func main() {
    x := foo()
    fmt.Printf("The int was: %d", *x) //guaranteed to print 7
  }

▲ compsciphd 2 days ago | parent | next [-]

OK, I'd agree that in that example a Go programmer would expect it to work fine and a C programmer would not, but that's not the example the writer gave. I stand by my statement that a C programmer would expect the writer's example to work just fine.

▲ simiones 2 days ago | parent [-]

I think the writer had multiple relatively weird confusions, to be fair. It's most likely that "a little knowledge is a dangerous thing". They obviously knew something about escape analysis and Go's ability to put variables on the stack, and they likely knew as well that Go slices are essentially (fat) pointers to arrays.

As the author shows in their explanations, they thought that the backing array for the slice gets allocated on the stack, but then the slice (which contains/represents a pointer to the stack-allocated array) gets returned. This is a somewhat weird set of assumptions to make (especially given that the actual array is allocated in a different function that we don't get to see, ReadFromFile), but apparently this is how the author thought through the code.

▲ samdoesnothing a day ago | parent | prev [-]

Is that the case? I thought that it would be a copy instead of a heap allocation.

Of course the compiler could inline it or do something else but semantically it's a copy.

▲ masklinn 20 hours ago | parent [-]

A copy of what? It’s returning a pointer, so i has to be on the heap[0]. gc could create i on the stack then copy it to the heap, but if you plug that code into godbolt you can see that it is not that dumb: it creates a heap allocation then writes the literal directly into that.

[0] unless foo is inlined and the result does not escape the caller’s frame, then that can be done away with.
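For anyone who would rather measure than read assembly, here is a rough sketch using testing.AllocsPerRun from the standard library. The //go:noinline directive, the sink variable and the helper name allocsPerCall are assumptions added for this illustration; the reported count depends on the compiler version, so no expected number is given.

```go
package main

import (
	"fmt"
	"testing"
)

// foo mirrors the snippet discussed above. The noinline directive is an
// assumption added here so the caller cannot absorb the allocation.
//
//go:noinline
func foo() *int {
	i := 7
	return &i
}

// sink is a package-level variable that keeps each result reachable.
var sink *int

// allocsPerCall reports the average number of heap allocations per call to foo.
func allocsPerCall() float64 {
	return testing.AllocsPerRun(1000, func() {
		sink = foo()
	})
}

func main() {
	fmt.Println("value:", *foo())
	fmt.Println("allocs per call:", allocsPerCall())
}
```

Removing the directive and keeping the result local is one way to see whether the footnote's inlining case applies on your toolchain.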

▲ foldr 8 days ago | parent | prev | next [-]

This seems to be a persistent source of confusion. Escape analysis is just an optimization. You don't need to think about it to understand why your Go code behaves the way it does. Just imagine that everything is allocated on the heap and you won't have any surprises.

▲ Yokohiii 2 days ago | parent | next [-]

I am currently learning go and your comment made me sort some things out, but probably in a counterintuitive way.

Assuming that everything allocates on the heap will solve this specific confusion.

My understanding is that C will let you crash quite fast if the stack becomes too large, while Go will dynamically grow the stack as needed. So it's possible to think you're working on the heap when you are actually thrashing the runtime with expensive stack growth calls. Go certainly tries to be smart about it with various strategies, but a rapid stack growth rate will have its cost.

▲ foldr 2 days ago | parent [-]

Go won’t put large allocations on the stack even if escape analysis would permit it, so generally speaking this should only be a concern if you have very deep recursion (in which case you might have to worry about stack overflows anyway).

▲ masklinn 20 hours ago | parent | next [-]

> Go won’t put large allocations on the stack even if escape analysis would permit it

Depends what you mean by “large”. As of 1.24 Go will put slices several KB into the stack frame:

    make([]byte, 65536)

Goes on the stack if it does not escape (you can see Go request a large stack frame)

    make([]byte, 65537)

goes on the heap (Go calls runtime.makeslice).

Interestingly arrays have a different limit: they respect MaxStackVarSize, which was lowered from 10MB to 128 KB in 1.24.

If you use indexed slice literals gc does not even check and you can create megabyte-sized slices on the stack.
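A rough way to probe these thresholds on your own toolchain is to count heap allocations with testing.AllocsPerRun. This is only a sketch: the sizes 65536 and 65537 come from the comment above, the function names are invented here, and the numbers printed depend on the Go version and build flags, so none are claimed.

```go
package main

import (
	"fmt"
	"testing"
)

// sum is a package-level sink so the compiler cannot discard the work.
var sum byte

// allocsAtLimit measures a non-escaping slice of exactly 65536 bytes.
func allocsAtLimit() float64 {
	return testing.AllocsPerRun(100, func() {
		b := make([]byte, 65536)
		b[0] = 1
		sum += b[0]
	})
}

// allocsOverLimit measures a non-escaping slice that is one byte larger.
func allocsOverLimit() float64 {
	return testing.AllocsPerRun(100, func() {
		b := make([]byte, 65537)
		b[0] = 1
		sum += b[0]
	})
}

func main() {
	fmt.Println("65536 bytes, allocs per run:", allocsAtLimit())
	fmt.Println("65537 bytes, allocs per run:", allocsOverLimit())
}
```

If the two counts differ, the size limit described above is in effect for that compiler; if they match, the threshold has probably moved.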

▲ Yokohiii 16 hours ago | parent [-]

There is an option -smallframes that seems to be intended for conservative use cases. Below are the related configs and a test of the point at which they escape (+1).

    // -smallframes
    // ir.MaxStackVarSize = 64 * 1024
    // ir.MaxImplicitStackVarSize = 16 * 1024
    a := [64 * 1024 +1]byte{}
    b := make([]byte, 0, 16 * 1024 +1)
    // default
    // MaxStackVarSize = int64(128 * 1024)
    // MaxImplicitStackVarSize = int64(64 * 1024)
    c := [128 * 1024 +1]byte{}
    d := make([]byte, 0, 64 * 1024 +1)

Not sure how to verify this, but the assumption that you can allocate megabytes on the stack seems wrong. The output of the escape analysis for arrays is different than for the make statement:

    test/test.go:36:2: moved to heap: c

Maybe an oversight because it is a bit sneaky?

▲ Yokohiii 2 days ago | parent | prev [-]

Escape analysis accounts for size, so it wouldn't even permit it.

The initial stack size seems to be 2 KB, a bit more on a few systems. As far as I understand, you can allocate a large local, e.g. 8 KB, that doesn't escape, and the stack grows immediately. (Of course that adds up if you have a chain of calls with smaller allocs.) So recursion is certainly not the only concern.

▲ foldr 2 days ago | parent [-]

For that to be a problem you either have to have one function that allocates an enormous number of non-escaping objects below the size limit (if the Go compiler doesn't take the total size of all a function's non-escaping allocations into account – I don't know), or a very long series of nested function calls, which in practice is only likely to arise if there are recursive calls.

▲ Yokohiii a day ago | parent [-]

I think we mix things up here. But be aware of my newbie knowledge.

I am pretty sure the escape analysis doesn't affect the initial stack size. Escape analysis does determine where an allocation lives. So if your allocation is smaller than what escape analysis sends to the heap and bigger than the initial stack size, the stack needs to grow.

What I am certain about is that I have runtime.newstack calls accounting for +20% of my benchmark times (go testing). My code is quite shallow (3-4 calls deep), anything of size should be on the heap (global/preallocated), and the code has zero allocations. I don't use goroutines either. It might be that I'm still making a mistake, or it's the overhead from the testing benchmark. But this obviously doesn't seem to be anything super unusual.

▲ foldr a day ago | parent [-]

I don't know about your code, but in general, goroutine stacks are designed to start small and grow. There is nothing concerning about this. A call to runtime.newstack triggered by a large stack-allocated value would generally be cheaper than the corresponding heap allocation.

▲ Yokohiii a day ago | parent [-]

I found my issue: I was creating a 256-item fixed array of a 2*uint8 struct in my code. That was enough to cause newstack calls. It now went down from a varying 10% to roughly 1%. Oddly enough it didn't change the ns/op a bit. I guess some mix of workload-related irrelevancy and inaccurate reporting, or another oversight on my side.

▲ bonniesimon 7 days ago | parent | prev | next [-]

Makes sense. I need to rewire how I think about Go. I should see it how I see JS.

▲ 9rx 2 days ago | parent | prev [-]

> This seems to be a persistent source of confusion.

Why? It is the same as in C.

    #include <stdio.h>
    #include <stdlib.h>

    struct slice {
        int *data;
        size_t len;
        size_t cap;
    };

    struct slice readLogsFromPartition() {
        int *data = malloc(2 * sizeof *data); /* room for two ints, not two bytes */
        data[0] = 1;
        data[1] = 2;
        return (struct slice){ data, 2, 2 };
    }

    int main() {
        struct slice s = readLogsFromPartition();
        for (int i = 0; i < s.len; i++) {
            printf("%d\n", s.data[i]);
        }
        free(s.data);
    }

▲ simiones 2 days ago | parent | next [-]

The point the GP was making was that the following Go snippet:

  func foo() {
    x := []int { 1 }
    //SNIP 
  }

Could translate to C either as:

  void foo() { 
    int* x = malloc(1 * sizeof(int));
    x[0] = 1;
    //...
  }

Or as

  void foo() { 
    int data[1] = {1};
    int *x = data;
    //...
  }

Depending on the content of //SNIP. However, some people think that the semantics can also match the semantics of the second version in C - when in fact the semantics of the Go code always match the first version, even when the actual implementation is the second version.

▲ 9rx 2 days ago | parent [-]

The semantics are clearly defined as being the same as the C code I posted earlier. Why would one try to complicate the situation by thinking that it would somehow magically change sometimes?

▲ simiones 2 days ago | parent [-]

Because people hear that Go supports value types and so is more efficient than Java because it can allocate on the stack*, and so they start thinking that they need to manage the stack.

* Of course, in reality, Java also does escape analysis to allocate on the stack, though it's less likely to happen because of the lack of value types.

▲ 9rx 2 days ago | parent [-]

I don't see the difficulty here. The slice is to be thought of as a value type, as demonstrated in the C version. Just like in C, you can return it from a function without the heap because it is copied.

▲ foldr 2 days ago | parent | prev [-]

What confuses people is

    int *foo(void) {
        int x = 99;
        return &x; // bad idea
    }

vs.

    func foo() *int {
        x := 99
        return &x // fine
    }

They think that Go, like C, will allocate x on the stack, and that returning a pointer to the value will therefore be invalid.

(Pedants: I'm aware that the official distinction in C is between automatic and non-automatic storage.)

▲ knorker a day ago | parent [-]

Yes. That's escape analysis. But this is not what OP did.

What you wrote is not the same in C and Go, because GC and escape analysis. But 9rx is also correct that what OP wrote is the same in C and Go.

So OP almost learned about escape analysis, but their example didn't actually do it. So double confusion on their side.

▲ foldr a day ago | parent [-]

Well, my point is that escape analysis has nothing to do with it at the semantic level. So it's actually just 'because GC'. You don't need the concept of escape analysis at all to understand the behavior of the Go example.

▲ knorker 18 hours ago | parent [-]

Yeah. That's what I said.

▲ foldr 17 hours ago | parent [-]

I mean that escape analysis has nothing to do with my example either, in terms of understanding the semantics of the code (so I’m disagreeing with the ‘because GC and escape analysis’ part of your comment).

▲ knorker 16 hours ago | parent [-]

Your https://news.ycombinator.com/item?id=46234206 relies on escape analysis though, right?

Escape analysis is the reason your `x` is on the heap. Because it escaped. Otherwise it'd be on the stack.[1]

Now if by "semantics of the code" you mean "just pretend everything is on the heap, and you won't need to think about escape analysis", then sure.

Now in terms of what actually happens, your code triggers escape analysis, and OP does not.

[1] Well, another way to say this I guess is that without escape analysis, a language would be forced to never use the stack.

▲ foldr 15 hours ago | parent [-]

Escape analysis clearly isn’t part of the semantics of Go. For that to be the case, the language standard would have to specify exactly which values are guaranteed to be stack allocated. In reality, this depends on size thresholds which can vary from platform to platform or between different versions of the Go compiler. Is the following non-escaping array value stack allocated?

    func pointless() byte {
        var a [1024]byte
        a[0] = 1
        return a[0]
    }

That’s entirely up to the compiler, not something that’s determined by the language semantics. It could vary from platform to platform or compiler version to compiler version. So clearly you don’t need to think about the details of escape analysis to understand what your code does because in many cases you simply won’t know if your value is on the stack or not.

▲ knorker 14 hours ago | parent [-]

While you are of course 100% correct, in the context of discussing escape analysis I find it odd to say that it's essentially not "real". Like any optimization, it makes sense to talk about what "will" happen, even if a language (or a specific compiler) makes no specific promises.

Escape analysis enables an optimization. I think I understand you to be saying that "escape analysis" is not why returning a pointer to a local works in Go, but it's what allows some variables to be on the stack, despite the ability to return pointers to other "local" variables. Or similar to how the compiler can allow "a * 6" to never use a mul instruction, but just two shifts and an add. Which is probably a better way to think about it.

> So clearly you don’t need to think about the details of escape analysis to understand what your code does

Right. To circle back to the context: Yeah, OP thought this was due to escape analysis, and that's why it worked. No, it's just a detail about why other code does something else. (But not really, because OP returned the slice by value.)

So I suppose it's more correct to say that we were never discussing escape analysis at all. An escape analysis post would be talking about allocation counts and memory fragmentation, not "why does this work?". Claude (per OP's post) led them astray.

▲ jstanley 2 days ago | parent | prev | next [-]

It's not confusing that this works in Go. (In my opinion).

A straightforward reading of the code suggests that it should do what it does.

The confusion here is a property of C, not of Go. It's a property of C that you need to care about the difference between the stack and the heap, it's not a general fact about programming. I don't think Go is doing anything confusing.

▲ throwaway894345 2 days ago | parent [-]

I like Go a lot, but I often wish we could be more explicit about where allocations are. It’s often important for writing performant code, but instead of having semantics we have to check against the stack analyzer which has poor ergonomics and may break at any time.

But yeah, to your point, returning a slice in a GC language is not some exotic thing.

▲ onionisafruit 2 days ago | parent | next [-]

I think I would like a “stackvar” declaration that works the same as “var” except my code won’t compile if escape analysis shows it would wind up on the heap. I say that knowing I’m not a language designer and have never written a compiler. This may be an obviously bad idea to somebody experienced in either of those.

I commented elsewhere on this post that I rarely have to think about stacks and heaps when writing Go, so maybe this isn’t my issue to care about either.

▲ Scaevolus 2 days ago | parent | next [-]

This could probably be implemented as an expensive comment-driven lint during compilation.

▲ onionisafruit a day ago | parent [-]

I don’t think it can be a true linter because it depends on the compiler. But it’s not a bad idea anyway.

▲ Yokohiii 2 days ago | parent | prev [-]

Escape analysis sends large allocations to the heap. The information is there.

▲ Yokohiii 2 days ago | parent | prev [-]

Can you elaborate on the stack analyzer? All I could figure out was to look for runtime.morestack calls that affected the runtime, but as far as I remember the caller timings excluded the cost. Having a clearer view of stack growth rates would be really great.

▲ throwaway894345 2 days ago | parent [-]

I’m not sure what you mean? Are you asking for information about what it is or how to use it?

▲ Yokohiii a day ago | parent [-]

I never heard of "stack analyzer" and didn't get meaningful results for it, do you mean escape analysis?

▲ throwaway894345 a day ago | parent [-]

Sorry, yes, I meant “escape analyzer”. I’ve been jet lagged.

▲ Yokohiii a day ago | parent [-]

Ok no problem. Take a good sleep soon!

▲ nasretdinov 2 days ago | parent | prev | next [-]

If the functions get inlined (which they might if they're small enough), then the code won't even need to allocate on heap! That's a kind of optimisation that's not really possible without transparent escape analysis.

▲ jasonthorsness 2 days ago | parent | prev | next [-]

You can run the compiler with a flag that shows all the escapes, -gcflags "-m", and there’s also support in GoLand and VS Code to show the escapes as inline annotations in the editor. This sort of thing IMO is one of the useful things about IDEs: showing hints from later parts of the toolchain about how things are going to turn out.
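A small file to try that flag on might look like the sketch below. The file name escape.go and the function names are hypothetical; the compiler's messages are not reproduced here because their wording and line positions vary between Go versions.

```go
// escape.go is a hypothetical file name for trying the flag mentioned above:
//
//	go build -gcflags "-m" escape.go
//
// The compiler then prints its inlining and escape decisions per line.
package main

import "fmt"

// sumLocal keeps its slice local, so it is a candidate for stack allocation.
func sumLocal() int {
	nums := []int{1, 2, 3}
	total := 0
	for _, n := range nums {
		total += n
	}
	return total
}

// newCounter returns a pointer to a local, so the value must outlive the call.
func newCounter() *int {
	c := 10
	return &c
}

func main() {
	fmt.Println(sumLocal(), *newCounter()) // 6 10
}
```

Comparing what the compiler reports for nums and for c is a quick way to see the difference between a value that stays inside its function and one that leaves it.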

▲ mwsherman 2 days ago | parent | prev | next [-]

Shameless plug, if one wishes to track down allocations in Go, an allocations explorer for VS Code: https://marketplace.visualstudio.com/items?itemName=Clipperh...

▲ tgv 2 days ago | parent [-]

That looks nice. Going to give it a try.

▲ matthewaveryusa 2 days ago | parent | prev | next [-]

Nope, this analysis is wrong. Decompile your code and look at what's going on: https://godbolt.org/z/f1nx9ffYK

The thing being returned is a slice (a fat pointer) that has pointer, length, capacity. In the code linked you'll see the fat pointer being returned from the function as values. In C you'd get just AX (the pointer, without length and cap).

    command-line-arguments_readLogsFromPartition_pc122:
            MOVQ    BX, AX     // slice.ptr   -> AX (first result register)
            MOVQ    SI, BX     // slice.len   -> BX (second)
            MOVQ    DX, CX     // slice.cap   -> CX (third)

The garbage collection is happening in the FUNCDATA/PCDATA annotations, but I don't really know how that works.
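The three registers line up with the three words of a slice value, which can also be checked from Go itself. This is a minimal sketch: headerWords is a made-up helper, and the word count holds for the standard gc toolchain rather than being promised by the language spec.

```go
package main

import (
	"fmt"
	"unsafe"
)

// headerWords reports how many pointer-sized words a []int value occupies.
func headerWords() uintptr {
	var s []int
	return unsafe.Sizeof(s) / unsafe.Sizeof(uintptr(0))
}

func main() {
	s := make([]int, 2, 8)
	// With the standard gc toolchain this prints: 3 2 8
	fmt.Println(headerWords(), len(s), cap(s))
	// The pointer word addresses element 0 of the backing array.
	fmt.Println(unsafe.SliceData(s) == &s[0]) // true
}
```

unsafe.SliceData requires Go 1.20 or newer.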

▲ potato-peeler 2 days ago | parent | prev | next [-]

If the variable was defined in the calling function itself, and a pointer was passed, I guess the variable will still be in the heap?

▲ Yokohiii 2 days ago | parent [-]

Pointers escape to the heap by default.
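The case in the question can be tried directly. In the sketch below (increment and caller are hypothetical names), a local's address is only lent to a callee that dereferences it and does not store it. Where x ends up is the compiler's decision, so building with -gcflags "-m" is the way to see what your toolchain chose for it.

```go
package main

import "fmt"

// increment only dereferences p; it does not store the pointer anywhere.
func increment(p *int) {
	*p++
}

// caller declares x locally and lends its address to increment.
func caller() int {
	x := 41
	increment(&x)
	return x
}

func main() {
	fmt.Println(caller()) // 42
}
```

Storing p in a package-level variable inside increment, and rebuilding with the same flag, shows how the report changes once the pointer outlives the call.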

▲ metadat 2 days ago | parent | prev | next [-]

Emojis in code comments make them unreadable. Why is this a thing?

▲ knorker 2 days ago | parent | prev | next [-]

Are you sure this is what's happening? Looks to me like the slice object is returned by value, and the array was always on the heap. See https://go.dev/play/p/Bez0BgRny7G (the address of the slice object changed, so it's not the same object on the heap)

Sure, Go has escape analysis, but is that really what's happening here?

Isn't this a better example of escape analysis: https://go.dev/play/p/qX4aWnnwQV2 (the object retains its address, always on the heap, in both caller and callee)
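The same observation can be reproduced without the playground. In this sketch (build and compare are illustrative names, not the code behind the links), the callee hands back the address of its own slice variable and of element 0, and the caller compares them with its copy.

```go
package main

import "fmt"

// build returns a slice plus the addresses of its own header variable
// and of element 0, so the caller can compare them with its copy.
func build() (logs []string, header *[]string, elem *string) {
	logs = []string{"a", "b"}
	return logs, &logs, &logs[0]
}

// compare reports whether the caller's header and first element share
// addresses with the callee's.
func compare() (sameHeader, sameElem bool) {
	logs, calleeHeader, calleeElem := build()
	return &logs == calleeHeader, &logs[0] == calleeElem
}

func main() {
	fmt.Println(compare()) // false true
}
```

The header is a different variable in the caller, so its address differs, while element 0 lives in the shared backing array and keeps its address.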

▲ simiones 2 days ago | parent | next [-]

Depending on escape analysis, the array underlying the slice can get allocated on the stack as well, if it doesn't escape the function context. Of course, in this case, because we are returning a pointer to it via the slice, that optimization isn't applicable.

▲ knorker a day ago | parent [-]

Agreed that it could in principle. But I can't immediately get it to do so: https://go.dev/play/p/9hLHattS8cf

Both arrays in this example seem to be on the heap.

▲ simiones 17 hours ago | parent [-]

Taking the address of those variables makes them escape to heap. Even sending them to the Printf function makes them escape to heap.

If you want to confirm, you have to use the Go compiler directly. Take the following code:

  package main
  import (
    "fmt"
  )
  type LogEntry struct {
    s string
  }
  func readLogsFromPartition(partition int) []LogEntry {
    var logs []LogEntry // Creating an innocent slice
    logs = []LogEntry{{}}
    logs2 := []LogEntry{{}}
    fmt.Printf("%v %v\n", len(logs), len(logs2))
    return []LogEntry{{}}
  }
  func main() {
    logs := readLogsFromPartition(1)
    fmt.Printf("%p\n", &logs[0])
  }

And compile it with

  $ go build -gcflags '-m' main.go
  # command-line-arguments
  ./main.go:15:12: inlining call to fmt.Printf
  ./main.go:21:12: inlining call to fmt.Printf
  ./main.go:13:19: []LogEntry{...} does not escape
  ./main.go:14:21: []LogEntry{...} does not escape
  ./main.go:15:12: ... argument does not escape
  ./main.go:15:27: len(logs) escapes to heap
  ./main.go:15:38: len(logs2) escapes to heap
  ./main.go:16:19: []LogEntry{...} escapes to heap
  ./main.go:21:12: ... argument does not escape

However, if you return logs2, or if you take the address, or if you pass them to Printf with %v to print them, you'll see that they now escape.

An additional note: in your original code from your initial reply, everything you allocate escapes to heap as well. You can confirm in a similar way.

▲ masklinn 2 days ago | parent | prev | next [-]

That’s the one.

Since 1.17 it’s not impossible for escape analysis to come into play for slices but afaik that is only a consideration for slices with a statically known size under 64KiB.

▲ bonniesimon 2 days ago | parent | prev | next [-]

Interesting! This could be true. I'll play around with this in a bit.

▲ knorker 2 days ago | parent [-]

Yeah I think what you're describing (returning a slice, thus copying a reference to the same array but not copying the array, then destroying the callee slice without causing the array to be freed) is just basic garbage collection logic, not escape analysis.

▲ lenkite 2 days ago | parent | prev [-]

Yes, this is merely the slice fat pointer being copied and returned.

▲ samdoesnothing 2 days ago | parent | prev | next [-]

Go is returning a copy of the slice, in the same way that C would return a copy of an int or struct if you returned it. The danger of C behaviour in this instance is that a stack allocated array decays into a pointer which points to the deallocated memory. Otherwise the behaviour is pretty similar between the languages.

▲ debugnik 2 days ago | parent [-]

I first wrote an answer about how local variables can survive through a pointer, but deleted it because you're right that this Go code doesn't even address locals. It's a regular value copy.

▲ gethly 2 days ago | parent | prev [-]

> In C, you can't assign a value in a local function and then return it

I am so glad I have never taken up C. This sounds like a nightmare of a DX to me.

▲ kjeetgill 2 days ago | parent [-]

Depending on what you're working on, it's actually super nice to know very clearly what lives on the stack vs the heap for performance and compactness reasons. Basically anything that didn't come from malloc or a function calling malloc lives on the stack and doesn't live past the function it was allocated in.

And these days, if you're bothering with C you probably care about these things. Accidentally promoting from the stack to the heap would be annoying.