| ▲ | jsheard 3 hours ago | |
> Toolchain support should also be improved, e.g. you could make all vregs take 512-bit on the stack, but have the codegen only utilize the lowee 128 bit, if you have 128-but vregs, 256-bit if you have 256-bit vregs and 512-bit if you have >=512-bit vregs. SVE theoretically supports hardware up to 2048-bit, so conservatively reserving the worst-case size at compile time would be pretty wasteful. That's 16x overhead in the base case of 128-bit hardware. | ||