Q6T46nT668w6i3m 5 days ago

I agree that “learning CUDA wasn’t particularly difficult to get started,” but there are Grand Canyon-sized chasms between CUDA and its alternatives when you try to crank up performance.

physicsguy 4 days ago

Well, I think to a degree that depends on what you're targeting.

Single socket 8 core CPU? Yes.

If you've spent some time trying to eke out performance on Xeon Phi, written NUMA-aware code for multi-socket boards, and optimised for the L1/L2/L3 memory hierarchy, then it really isn't that different.

j45 5 days ago

It will improve for sure but this shouldn’t be downplayed.