newpavlov 6 hours ago

In some cases the RISC-V ISA spec is definitely the one to blame:

1) https://github.com/llvm/llvm-project/issues/150263

2) https://github.com/llvm/llvm-project/issues/141488

Another example is the hard-coded 4 KiB page size, which effectively kneecaps the ISA when compared against ARM.

weebull 5 hours ago | parent | next [-]

All of those things are solved with modern extensions. It's like comparing pre-MMX x86 code with modern x86. Misaligned loads and stores are Zicclsm, bit manipulation is Zb[abcs], atomic memory operations are made mandatory in Ziccamoa.

All of these extensions are mandatory in the RVA22 and RVA23 profiles and so will be implemented on any up to date RISC-V core. It's definitely worth setting your compiler target appropriately before making comparisons.

LeFantome 5 hours ago | parent | next [-]

Ubuntu being RVA23 is looking smarter and smarter.

The RISC-V ecosystem being handicapped by backwards compatibility does not make sense at this point.

Every new RISC-V board is going to be RVA23 capable. Now is the time to draw a line in the sand.

sidewndr46 2 hours ago | parent | prev | next [-]

You're correct but I guess my thoughts are if we're going to wind up with a mess of extensions, why not just use x86-64?

whaleofatw2022 2 hours ago | parent | next [-]

Because the ISA is not legally encumbered the way other ISAs are, and there are embedded use cases where the minimal profile is fine and the extensions are not worth the cost to implement.

computably an hour ago | parent | prev [-]

> why not just use x86-64?

Uh, because you can't? It's not open in any meaningful sense.

cmovq 2 hours ago | parent | prev | next [-]

But RISC-V is a _new_ ISA. Why did we start out with the wrong design that now needs a bunch of extensions? RISC-V should have taken the learnings from x86 and ARM but instead they seem to be committing the same mistakes.

wolvoleo an hour ago | parent | next [-]

It is a reduced instruction set computing ISA, of course. It shouldn't really have instructions for every edge case.

I only use it for microcontrollers and it's really nice there. But yeah I can imagine it doesn't perform well on bigger stuff. The idea of risc was to put the intelligence in the compiler though, not the silicon.

hun3 2 hours ago | parent | prev [-]

It was kind of an experiment from the start. Some ideas turned out to be good, so we keep them. Some ideas turned out not to be good, so we fix them with extensions.

edflsafoiewq 5 hours ago | parent | prev | next [-]

What about page size?

ori_b 3 hours ago | parent [-]

It's 4k on x86 as well. Doesn't seem to hurt so bad -- at least, not enough to explain the risc-v performance gap.

twoodfin 2 hours ago | parent [-]

Hmm? x86 has supported much larger “huge” page sizes for ages.

newpavlov 4 hours ago | parent | prev | next [-]

>Misaligned loads and stores are Zicclsm

Nope. See https://github.com/llvm/llvm-project/issues/110454 which was linked in the first issue. The spec authors have managed to make a mess even here.

Now they want to introduce yet another (sic!) extension Oilsm... It maaaaaay become part of RVA30, so in the best case scenario it will be decades before we will be able to rely on it widely (especially considering that RVA23 is likely to become heavily entrenched as "the default").

IMO the spec authors should've mandated that the base load/store instructions work only with aligned pointers and introduced misaligned instructions in a separate early extension. (After all, passing a misaligned pointer where your code does not expect it is a correctness issue.) But I would've been fine as well if they mandated that misaligned pointers should be always accepted. Instead we have to deal with the terrible middle ground.
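To make the "middle ground" concrete, here is a minimal Rust sketch of how portable code usually sidesteps the question: it never dereferences a possibly misaligned pointer and instead assembles the value from bytes. The function name `load_u32_le` is hypothetical. Whether the compiler turns the four byte reads into a single word load depends on what the selected target promises about misaligned access; that is an assumption about typical codegen, not something verified here.

```rust
/// Reads a little-endian u32 starting at any byte offset, aligned or not.
/// Returns None if the four bytes would run past the end of `buf`.
pub fn load_u32_le(buf: &[u8], offset: usize) -> Option<u32> {
    // checked_add guards against overflow for very large offsets.
    let end = offset.checked_add(4)?;
    // `get` performs the bounds check and yields exactly four bytes.
    let b = buf.get(offset..end)?;
    // Assemble the value from individual bytes, so no misaligned
    // pointer is ever dereferenced at the language level.
    Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}
```

For example, `load_u32_le(&data, 1)` reads bytes 1 through 4 of `data` no matter how the slice happens to be aligned in memory.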

>atomic memory operations are made mandatory in Ziccamoa

In other words, forget about potential performance advantages of load-link/store-conditional instructions. `compare_exchange` and `compare_exchange_weak` will always compile into the same instructions.
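For readers less familiar with the two calls, a minimal Rust sketch of where each is normally used. The function names `double_up_to` and `try_claim` are hypothetical. The weak form is allowed to fail spuriously, which is what lets an LL/SC target implement it as a single attempt inside the caller's own retry loop; the strong form only fails when the value really differs.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

/// Atomically doubles the stored value, capped at `limit`.
/// Returns the value that was stored before the update.
pub fn double_up_to(a: &AtomicU32, limit: u32) -> u32 {
    let mut cur = a.load(Ordering::Relaxed);
    loop {
        let next = cur.saturating_mul(2).min(limit);
        // The weak form may fail even when the value still equals `cur`;
        // the surrounding loop absorbs such spurious failures.
        match a.compare_exchange_weak(cur, next, Ordering::AcqRel, Ordering::Relaxed) {
            Ok(prev) => return prev,
            Err(actual) => cur = actual, // retry with the freshly observed value
        }
    }
}

/// One-shot claim of a flag: no retry loop, so the strong form is used,
/// because a spurious failure here would be reported as a lost claim.
pub fn try_claim(flag: &AtomicU32) -> bool {
    flag.compare_exchange(0, 1, Ordering::AcqRel, Ordering::Relaxed)
        .is_ok()
}
```

If both calls lower to the same instruction sequence on a given target, the loop in `double_up_to` still works; it just no longer gains anything from being allowed to fail spuriously.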

And I guess you are fine with the page size part. I know there are huge-page-like proposals, but they do not resolve the fundamental issue.

I have other minor performance-related nits, such as the `seed` CSR being allowed to produce poor quality entropy, which means that we have to bring a whole CSPRNG if we want to generate a cryptographic key or nonce on a low-powered micro-controller.

By no means do I consider myself a RISC-V expert; if anything, my familiarity with the ISA as a systems language programmer is quite shallow. But the number of accumulated disappointments, even from such shallow familiarity, has cooled my enthusiasm for RISC-V quite significantly.

adastra22 6 hours ago | parent | prev [-]

Also the bit manipulation extension wasn't part of the core. So things like bit rotation are slow for no good reason, if you want portable code. Why? Who knows.
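As a rough illustration of the cost being described, here is a Rust sketch of a rotate written with only the operations the base integer ISA provides (shifts and OR, no rotate). The function name `rotl32_portable` is hypothetical. With Zbb enabled, a compiler can use a single rotate instruction for `u32::rotate_left`; without it, it has to emit something shaped like this.

```rust
/// Rotate-left built from two shifts and an OR, which is the shape of
/// code needed when the target has no rotate instruction.
pub fn rotl32_portable(x: u32, n: u32) -> u32 {
    // The rotation amount is taken modulo the word width.
    let n = n & 31;
    // `(32 - n) & 31` keeps the right-shift amount in 0..=31, so the
    // n == 0 case does not turn into an out-of-range shift by 32.
    (x << n) | (x >> ((32 - n) & 31))
}
```

The result matches `x.rotate_left(n)` from the standard library for every `n`; the difference is only in how many instructions it costs on a core without Zbb.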

adgjlsfhk1 6 hours ago | parent | next [-]

> Also the bit manipulation extension wasn't part of the core.

This is mainly because the core is primarily a teaching ISA. One of the best parts about RISC-V is that you can teach a freshman level architecture class or a senior level chip building project with an ISA that is actually used. Anything powerful enough to run Linux (other than one you built from source manually) will support a profile that bundles all the commonly needed instructions to be fast.

jacquesm 6 hours ago | parent | next [-]

Bit manipulation instructions are part and parcel of any curriculum that teaches CPU architecture. They are the basic building blocks for many more complex instructions.

https://five-embeddev.com/riscv-bitmanip/1.0.0/bitmanip.html

I can see quite a few items on that list that imnsho should have been included in the core and for the life of me I can't see the rationale behind leaving them out. Even the most basic 8 bit CPU had various shifts and rolls baked in.

bmenrigh 35 minutes ago | parent | next [-]

Yeah I don’t get it. Shifts and rolls are among the simplest of all instructions to implement because they can be done with just wires, zero gates. Hard to imagine a justification for leaving them out.

rwmj 6 hours ago | parent | prev | next [-]

This is the reason behind the profiles like RVA23 which include bitmanip, vector and a large number of other extensions. Real chips coming very soon will all be RVA23.

jacquesm 6 hours ago | parent [-]

Neat. I can't wait to get my hands on a devboard.

NekkoDroid 5 hours ago | parent | next [-]

The earliest I know of coming is the SpaceMit K3, which Sipeed will have dev boards for.

statusfailed 4 hours ago | parent | prev [-]

The Milk-V Jupiter 2 (coming out in April) is RVA23 too.

jacquesm 4 hours ago | parent [-]

Nice board but very low on max RAM.

kevin_thibedeau 5 hours ago | parent | prev [-]

32-bit barrel shifters consume significant area and RISC-V was developed to support resource constrained low cost embedded hardware in a minimal ISA implementation.

pezezin 3 hours ago | parent | next [-]

The 32-bit ARM architecture included a barrel shifter as part of its basic design, as in every instruction had a shift field.

If a CPU built in 1985 with a grand total of 26 000 transistors could afford it, I am pretty sure that anything built in this century could afford it too.

snvzz 3 hours ago | parent [-]

26k is a lot of transistors for an embedded MCU.

You'd be excluding many small CPUs which exist within other chips running very specialized code.

As profiles mandate these instructions anyway, there's no good reason to complicate the most basic RISC-V possible.

RISC-V is the ISA for everything, from the smallest such CPUs to supercomputers.

wk_end 2 hours ago | parent [-]

What MCUs are you thinking of?

To the best of my knowledge (and Google-fu), 26K really isn't a lot of transistors for an embedded MCU - at least not a fully-featured 32-bit one comparable to a minimal RISC-V core. An ARM Cortex M0, which is pretty much the smallest thing out there, is around 10K gates => around 40K transistors. This is also around the same size as a minimal RISC-V core AFAICT.

The ARM core has a shifter, though.

snvzz 2 hours ago | parent [-]

There's a reason RV32E and RV64E, with half the registers, are a thing. RV32I/RV64I isn't small enough.

There are many chips in the market that do embed 8051s for janitorial tasks, because it is small and not legally encumbered. Some chips have several non-exposed tiny embedded CPUs within.

RISC-V is replacing many of these, bringing modern tooling. There's even open source designs like SERV that fit in a corner of an already small FPGA, leaving room for other purposes.

wk_end an hour ago | parent | next [-]

Per https://en.wikipedia.org/wiki/Transistor_count, even an 8051 has 50K transistors, which reinforces my claim that 26K really doesn't seem like a big ask for an MCU core. Whether that means a barrel shifter is worth it or not is a totally orthogonal question, of course.

(Although I do have to eat my words here - I didn't check that Wikipedia page, and it does actually list a ~6K RISC-V core! It's an experimental academic prototype "made from a two-dimensional material [...] crafted from molybdenum disulfide"; I don't know if that construction might allow for a more efficient transistor count and it's totally impractical - 1KHz clock speed, 1-bit ALU, etc. - for almost any purpose, but it is technically a RISC-V implementation significantly smaller than 26K)

adgjlsfhk1 an hour ago | parent | prev [-]

> There's a reason RV32E and RV64E, with half the registers, are a thing. RV32I/RV64I isn't small enough.

This is actually kind of counter to your point. The really tiny micro-controllers from the 80s only had 224 bits of registers. RV32E is at least twice that (16 registers*32 bits), and modern mcus generally use 2-4kbs of sram, so the overhead of a 32 bit barrel shifter is pretty minimal.

adgjlsfhk1 5 hours ago | parent | prev | next [-]

IIUC this is a lot less true in the modern era. Even with 24nm transistors (the cheapest transistor last time I checked), modern microcontrollers have a fairly big transistor budget for the core (since 80+% of the transistors are going to sram anyway).

jacquesm 5 hours ago | parent | prev [-]

You can save a lot of silicon by doing 8 or 16 bit shifters and then doing the rest at the code generation level. Not having any seems really anemic to me.
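One possible reading of this, sketched in Rust: the hardware shifter only handles a small shift amount per operation, and generated code chains several such operations for larger shifts. Everything here is an assumption made for illustration: the 0..=7 limit, the names `shl_step` and `shl32_composed`, and the idea that the compiler would emit a loop rather than an unrolled sequence.

```rust
/// Stand-in for a hypothetical narrow hardware shifter that can only
/// shift left by 0..=7 bit positions in one operation.
fn shl_step(x: u32, n: u32) -> u32 {
    assert!(n < 8, "narrow shifter handles 0..=7 only");
    x << n
}

/// Full 32-bit left shift composed from narrow steps, the way generated
/// code could do it when the hardware shifter is limited.
pub fn shl32_composed(x: u32, n: u32) -> u32 {
    let mut remaining = n & 31; // shift amount modulo the word width
    let mut acc = x;
    while remaining > 0 {
        // Take the largest step the narrow shifter supports.
        let step = remaining.min(7);
        acc = shl_step(acc, step);
        remaining -= step;
    }
    acc
}
```

A shift by 31 takes five narrow steps here (7, 7, 7, 7, 3), which is the trade being proposed: less silicon in exchange for more instructions on large shifts.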

hackyhacky 6 hours ago | parent | prev [-]

> One of the best parts about RiscV is that you can teach a freshman level architecture class or a senior level chip building project with an ISA that is actually used.

Same could be said of MIPS.

My understanding is that the RISC-V raison d'etre is rather the avoidance of patented/copyrighted designs.

adgjlsfhk1 6 hours ago | parent [-]

The avoidance of patent/copyright is critical for (legally) having students design their own chips. MIPS was pretty good (and widely used) for teaching assembly, but pretty bad for teaching a class where students design chips.

fidotron 6 hours ago | parent | prev [-]

The fact the Hazard3 designer ended up creating an extension to resolve related oddities was kind of astonishing.

Why did it fall to them to do it? Impressive that he did, but it shouldn't have been necessary.

rllj 6 hours ago | parent [-]

Which extension is that?

mjmas 6 hours ago | parent [-]

An extension he calls Xh3bextm. For extracting multiple bits from bitfields.
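For context, this is the generic shift-and-mask pattern that multi-bit field extraction replaces, as a Rust sketch. The function name `extract_bits` and its parameters are hypothetical; this shows the software idiom only and does not model the encoding or exact semantics of Xh3bextm.

```rust
/// Extracts `width` bits of `x` starting at bit position `lsb`
/// (bit 0 is the least significant bit).
pub fn extract_bits(x: u32, lsb: u32, width: u32) -> u32 {
    assert!(width >= 1 && width <= 32, "width must be 1..=32");
    assert!(lsb < 32 && lsb + width <= 32, "field must fit in 32 bits");
    // `1 << 32` would overflow, so the full-width mask is a special case.
    let mask = if width == 32 { u32::MAX } else { (1u32 << width) - 1 };
    // Move the field down to bit 0, then clear everything above it.
    (x >> lsb) & mask
}
```

Without a dedicated instruction this costs a shift plus a mask (and possibly an extra instruction to build the mask constant), which adds up in code that picks apart packed registers or protocol headers.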

https://wren.wtf/hazard3/doc/#extension-xh3bextm-section

There are also four other custom extensions implemented.