benreesman 6 days ago

I appreciate that hitting sm_70 through sm_120 in one call isn't the same as hitting RISC-V in one call, but I do a lot of builds just for sm_120, which is closer to a fair comparison.

It's imperfect, but I take any excuse to point out how bad monopolies are for customers. All you have to do is build the driver to see that "low priority" is a pretty broad term on the allegedly elite trillion-dollar toolchain.

I'm not saying CUDA is unimpressive; it's a very, very, very hard problem. But if they were in an uncorrupted market, ptxas would be fast instead of devastating znver5 workstations with 6400 MT/s DDR5.