rfoo 2 days ago

> The only thing that I can say definitively is that there is overhead to doing the literal stack switch. There's a reason async I/O got us past the C10k problem so handily.

You can also say that not having to constantly allocate & deallocate stuff, and relying on a bump allocator (the stack) most of the time, more than compensates for the stack switch overhead. Depends on the workload of course :p

IMO it's more about memory, and nowadays it might just be path dependence. Back in the C10k days address spaces were 32-bit (ok, 31-bit really), and 2**31 / 10k ~= 210KiB per connection. That makes static-ish stack management really messy, so you really needed to extract the (minimal) state explicitly and pack it on the heap.
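
To make the arithmetic concrete, here's a rough sketch of the per-connection budget. It assumes the whole user address space is split evenly between connection stacks, which is an upper bound (code, heap and libraries need room too). The function name is made up for illustration, and the 47-bit figure is my assumption for a typical x86-64 user address space, just for contrast.

```python
KIB = 1024
GIB = 1024 ** 3


def stack_budget_bytes(address_space_bits: int, connections: int) -> float:
    """Upper bound on address space per connection, in bytes,
    assuming every byte of the address space went to stacks."""
    return (2 ** address_space_bits) / connections


# 31 usable bits, 10k connections: the C10k-era situation.
old = stack_budget_bytes(31, 10_000)
print(f"31-bit: {old / KIB:.1f} KiB per connection")

# 47 usable bits (assumed), same 10k connections.
new = stack_budget_bytes(47, 10_000)
print(f"47-bit: {new / GIB:.1f} GiB per connection")
```

A couple hundred KiB of address space per stack leaves no headroom for guard pages or growth, while on 64-bit you can reserve generously and only touch what you use.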

Now we happily run ASAN which allocates 1TiB (2**40) address space during startup for a bitmap of the entire AS (2**48) and nobody complains.