dzaima 3 days ago
Another bad choice (perhaps more accurately called a bug, but they chose not to do anything about it): vmv1r.v & co (aka the whole-vector-register move instructions) depend on a valid vtype being set, despite not using any part of it. The only exception is the extreme edge case of an interrupt happening in the middle of the move and the hardware wanting to chop the operation in half instead of finishing it, which is entirely pointless for application-class CPUs, where VLEN isn't massive enough for that to be useful in any way; never mind that moves are O(1) with register renaming.

So to move one vector register to another, you need a preceding vsetvl; worse, with the standard calling convention you may get an illegal vtype after a function call! Even worse, the behavior of a move with an illegal vtype is actually left reserved, so hardware can (and some does) just allow it, thereby making it impossible to even test for on some hardware.

Oh, and that thing about being able to stop a vector instruction midway through? You might think that's to allow guaranteeing fast interrupts while keeping easy forward progress; but no, vector reductions cannot be restarted. And there's the extremely horrific vfredosum[1], which is an ordered float sum reduction, i.e. a linear chain of N float adds, i.e. a (fp add latency) * (element count in vector)-cycle op that must be started completely over again if interrupted.

[1]: https://dzaima.github.io/intrinsics-viewer/#0q1YqVbJSKsosTtY...
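To make the vfredosum point concrete, here is a rough Python sketch of what "ordered" implies. It is not a model of any real core: the function names are made up for illustration, the cost formula is just the (fp add latency) * (element count) estimate from above with all setup overhead ignored, and the latency and element count in the usage lines are hypothetical numbers.

```python
def ordered_sum(init, elems):
    """Strictly sequential float sum, element 0 first (illustrative helper)."""
    acc = init
    for x in elems:
        # Each add needs the previous result, so the adds form one
        # dependent chain and cannot be overlapped or reassociated.
        acc = acc + x
    return acc


def ordered_reduction_cycles(fp_add_latency, element_count):
    """Assumed cost model: one full add latency per element, nothing else."""
    return fp_add_latency * element_count


# Float addition is not associative, which is why the order is part of the
# result and the hardware cannot simply use a tree reduction instead.
in_order = ordered_sum(0.0, [1e16, 1.0, -1e16])   # the 1.0 is rounded away
reordered = ordered_sum(0.0, [1e16, -1e16, 1.0])  # the 1.0 survives
print(in_order, reordered)

# Hypothetical numbers: 4-cycle fp add, 8 elements in the vector.
print(ordered_reduction_cycles(4, 8))
```

Under that model, an interrupt arriving just before the last add throws away nearly the whole chain, since the reduction has to start over from element 0.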