camel-cdr 3 days ago

> Handling of misaligned loads/stores

Agreed, I think the problem is that RVI doesn't want to/can't mandate implementation details.

I hope the first few RVA23 cores will have proper misaligned load/store support, so that we can tell toolchains that RVA23 or Zicclsm means fast misaligned loads/stores. Future hardware that is stupid enough not to implement it will just have to suffer.

There is a silver lining, because you can transform N misaligned loads into N+1 aligned ones plus a few instructions to stitch together the result. Currently this needs to be done manually, but hopefully it will be an optimization in future compiler versions: https://github.com/llvm/llvm-project/issues/150263 (Edit: oh, I should've recognised your username, xd)
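For illustration, a minimal scalar C sketch of that N to N+1 transformation. The function name `load_misaligned_u64` is made up for this example, and the sketch assumes little-endian byte order (as on RISC-V) and that the one extra aligned word past the data is readable. Each result is stitched together from two neighbouring aligned words with a pair of shifts and an OR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: reads n 64-bit values that start shift_bytes (0..7)
 * bytes past the aligned pointer, using n + 1 aligned loads in total.
 * Assumes little-endian layout and that aligned[0..n] is readable. */
static void load_misaligned_u64(const uint64_t *aligned, unsigned shift_bytes,
                                uint64_t *out, size_t n)
{
    assert(shift_bytes < 8);
    unsigned lo = shift_bytes * 8;      /* bits to drop from the low word */
    uint64_t prev = aligned[0];         /* the one extra load */
    for (size_t i = 0; i < n; i++) {
        uint64_t next = aligned[i + 1]; /* one aligned load per result */
        /* A shift by 64 is undefined in C, so the aligned case is separate. */
        out[i] = lo ? (prev >> lo) | (next << (64 - lo)) : prev;
        prev = next;
    }
}
```

A vector version would do the same thing with slides or shifts on whole registers instead of scalar words.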

> The hardcoded page size.

There is Svnapot, which is supposed to allow other page sizes, but I don't know enough about it to be sure it actually solves the problem properly.

> You have to use a CSPRNG on top of it for any sensitive applications

Shouldn't you have to do that regardless, and also mix in other kinds of state at the OS level?

> Extensions do not form hierarchies

The mandatory extensions in the RVA profiles are a hierarchy.

> Detection of available extensions

I think this is being worked on with unified discovery, which should also cover other microarchitectural details.

There also is a neat toolchain solution with: https://github.com/riscv-non-isa/riscv-c-api-doc/blob/main/s...

> being unable to port tricky SIMD code to the V extension

Any NEON code is trivially ported to RVV, as is AVX-512 code that doesn't use GFNI, which is pretty much the only extension that doesn't have an RVV equivalent yet (though neither NEON nor SVE has one either).

Where the complaints come from is if you want to take full advantage of the native vector length in VLA code, which can sometimes be tricky, especially in existing projects that are built around the assumption of fixed vector lengths. But you can always fall back to using RVV as a fixed vector length ISA, with a much faster way of querying the vector length than CPUID.
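To make the VLA style concrete, here is a portable C model of the usual strip-mined loop. It is a sketch and not real RVV code: `model_vsetvl` is a hypothetical stand-in for the `vsetvli` instruction, and returning min(avl, vlmax) is only one of the results the spec permits. The inner loop stands in for a single vector add.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for vsetvli: how many elements the next
 * iteration handles. min(avl, vlmax) is one behaviour the spec allows. */
static size_t model_vsetvl(size_t avl, size_t vlmax)
{
    return avl < vlmax ? avl : vlmax;
}

/* Strip-mined a[i] += b[i]. The same loop works for any vlmax and needs
 * no scalar tail loop. Returns the number of "vector" iterations. */
static size_t vla_add(int32_t *a, const int32_t *b, size_t n, size_t vlmax)
{
    assert(vlmax > 0);
    size_t iterations = 0;
    while (n > 0) {
        size_t vl = model_vsetvl(n, vlmax);
        for (size_t i = 0; i < vl; i++) /* models one vector add of vl elements */
            a[i] += b[i];
        a += vl;
        b += vl;
        n -= vl;
        iterations++;
    }
    return iterations;
}
```

Code written for a fixed vector length would instead query the length once up front and pick a matching code path, which is the fallback described above.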

> P.S.: I would be interested to hear about other people gripes with RISC-V

I feel like encoding scalar fmacc with three sources and separate destinations and rounding modes was a huge waste of encoding space. I would trade that for a vpternlog equivalent, which is also an encoding hog, any day.

The vl=0 special case was a bad idea: now you have to know/predict vl!=0 to get rid of the vector destination as a read dependency, or have some mechanism to kill an instruction if vl=0.
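A small C model of why vl=0 forces that dependency, as a sketch under stated assumptions: `MODEL_VLMAX` and `model_vadd_ta` are made-up names, and the model picks the "overwrite the tail with all ones" behaviour that a tail-agnostic policy allows. With vl > 0 the result never depends on the old destination, but with vl=0 the old destination has to come through unchanged.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_VLMAX 4 /* hypothetical register capacity in elements */

/* Models a tail-agnostic vector add into vd on an implementation that
 * fills the tail with all ones (one of the allowed agnostic behaviours). */
static void model_vadd_ta(uint32_t vd[MODEL_VLMAX], const uint32_t *a,
                          const uint32_t *b, size_t vl)
{
    assert(vl <= MODEL_VLMAX);
    if (vl == 0)
        return; /* vl=0: no element is written, so the old vd must survive */
    for (size_t i = 0; i < vl; i++)
        vd[i] = a[i] + b[i];            /* body elements */
    for (size_t i = vl; i < MODEL_VLMAX; i++)
        vd[i] = UINT32_MAX;             /* agnostic tail, old value not needed */
}
```

In the vl > 0 path nothing reads the old `vd`, so a renamed register could be allocated without waiting for it. The early return is the case that requires either the old value or a way to cancel the instruction.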

There should've been restricted vrgather variants earlier, but I'm now (slowly) working on proposing them and a handful of other new vector instructions (mask add/sub, pext/pdep, bmatflip).

Overall though, I think RVV came out surprisingly well; everything works together very nicely.

zozbot234 3 days ago

> I feel like encoding scalar fmacc with three sources and separate destinations and rounding modes was a huge waste of encoding space

This might be easily solved by defining new, lighter varieties of the F/D/Q extensions (under new "Z" names) that just don't include the fmacc insn blocks and instead reserve them for future extensions. (Of course, these new extensions would themselves be incompatible with the full F/D/Q extensions, effectively deprecating those for general use and relegating them to special-purpose uses where the FMACC encodings are genuinely useful.) Something to think about if the 32-bit insn encoding space becomes excessively scarce.