ambonvik 5 hours ago
Hi, it is hand-coded assembly. Pushing all necessary registers to the stack (including GS on Windows), swapping the stack pointer to/from memory, popping the registers, and off we go on the other stack. I save FPU flags, but not more FPU state than necessary (which again is a whole lot more on Windows than on Linux). Others have done this elsewhere, of course. There are links/references to several other examples in the code. I mention two in particular in the NOTICE file, not because I copied their code, but because I read it very closely and followed the outline of their examples. It would probably have taken me forever to figure out the Windows TIB on my own. What I think is pretty cool (biased as I am) in my implementation is the «trampoline» that launches the coroutine function and waits silently in case it returns. If it does, the return is intercepted and the proper coroutine exit() function gets called.
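For readers unfamiliar with the pattern, here is a minimal sketch of the trampoline idea. It is not the code discussed above: it assumes Linux/glibc and uses the `ucontext` functions in place of a hand-coded assembly switch, and every name in it (`coro`, `trampoline`, `coro_exit`, `run_demo`, and so on) is hypothetical. The point is only that the coroutine function never runs as the bottom frame of its stack; a small wrapper does, so a plain `return` lands in the wrapper and gets routed to the exit path.

```c
#include <assert.h>
#include <stdlib.h>
#include <ucontext.h>

#define STACK_SIZE (64 * 1024)

/* All names here are hypothetical; ucontext stands in for the assembly switch. */
typedef struct coro coro;
typedef void *(*coro_fn)(coro *self, void *arg);

struct coro {
    ucontext_t ctx;     /* saved context of the coroutine */
    ucontext_t caller;  /* saved context of whoever resumed it */
    coro_fn fn;
    void *arg;
    void *result;
    int done;
    char *stack;
};

/* Handoff slot: makecontext only passes int arguments portably. */
static coro *starting;

/* Exit path: record the result and switch away for the last time. */
static void coro_exit(coro *c, void *result) {
    c->result = result;
    c->done = 1;
    swapcontext(&c->ctx, &c->caller);
    abort();  /* a finished coroutine must never be resumed */
}

/* Bottom frame of every coroutine stack. If fn returns, we catch it here. */
static void trampoline(void) {
    coro *c = starting;
    void *r = c->fn(c, c->arg);
    coro_exit(c, r);
}

static coro *coro_create(coro_fn fn, void *arg) {
    coro *c = calloc(1, sizeof *c);
    assert(c != NULL);
    c->stack = malloc(STACK_SIZE);
    assert(c->stack != NULL);
    c->fn = fn;
    c->arg = arg;
    getcontext(&c->ctx);
    c->ctx.uc_stack.ss_sp = c->stack;
    c->ctx.uc_stack.ss_size = STACK_SIZE;
    c->ctx.uc_link = NULL;
    makecontext(&c->ctx, trampoline, 0);  /* entry point is the trampoline, not fn */
    return c;
}

static void coro_resume(coro *c) {
    starting = c;  /* only read on the first entry into trampoline */
    swapcontext(&c->caller, &c->ctx);
}

static void coro_yield(coro *c) {
    swapcontext(&c->ctx, &c->caller);
}

/* Example coroutine: yields three times, then simply returns. */
static void *count_to_three(coro *self, void *arg) {
    int *counter = arg;
    for (int i = 0; i < 3; i++) {
        (*counter)++;
        coro_yield(self);
    }
    return counter;  /* plain return, intercepted by the trampoline */
}

/* Returns the number of resumes needed to finish, or -1 on a wrong result. */
int run_demo(void) {
    int counter = 0;
    coro *c = coro_create(count_to_three, &counter);
    int resumes = 0;
    while (!c->done) {
        coro_resume(c);
        resumes++;
    }
    int ok = (c->result == &counter) && (counter == 3);
    free(c->stack);
    free(c);
    return ok ? resumes : -1;
}
```

In this sketch `run_demo()` needs four resumes: three that end in a yield, and a fourth in which the function returns and the trampoline hands control to `coro_exit`. A hand-written assembly version would presumably set up the initial stack so that the first switch "returns" into the trampoline, but that detail is an assumption here.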
anematode 5 hours ago
Interesting. How does the trampoline work? I'm wondering whether we could further decrease the overhead of the switch on GCC/clang by marking the push function with `__attribute__((preserve_none))`. Then among GPRs we only need to save the base and stack pointers, and the callers will only save what they need to.