IshKebab 6 days ago

Neat, but it's not like assembling is really a bottleneck in any but the most extreme cases. LLVM and GAS are already very fast.

I feel like this might mostly be useful as a reference, because currently RISC-V assembly's specification is mostly "what do GCC/Clang do?"

drob518 6 days ago

Exactly. I don’t know too many assembly language programmers who are griping about slow tools, particularly on today’s hardware. Yeah, Orca/M on my old Apple II with 64k RAM and floppy drives was pretty slow, but since then, not so much. But sure, as a fun challenge to see how fast you can make it run, go for it.

CyberDildonics 6 days ago

ASM should compile at hundreds of MB/s. All the ASM you could write in your entire life will compile instantly. No one in decades has thought their assembler was too slow.

benreesman 6 days ago

ptxas comes to mind.

gdiamos 6 days ago

ptxas is a bit of a misnomer: it actually wraps the entire NVIDIA driver backend compiler.

PTX isn’t the assembly language; it is a virtual ISA, so you need a full backend compiler with tens to hundreds of passes to get to machine code.

benreesman 6 days ago

I appreciate that hitting sm_70 through sm_120 in one call isn't the same as hitting RISC-V in one call, but I do a lot of builds just for sm_120, which is closer to a fair comparison.

It's imperfect, but I take any excuse to point out how bad monopolies are for customers. All you have to do is build the driver to see that "low priority" is a pretty broad term on the allegedly elite trillion-dollar toolchain.

I'm not saying CUDA is unimpressive; it's a very, very, very hard problem. But if they were in an uncorrupted market, ptxas would be fast instead of devastating znver5 workstations with 6400 MT/s DDR5.