throwup238 21 hours ago

> I don’t suppose you know a good “for dummies” explanation of why CUDA is such an insurmountable moat for them?

Theoretically the moat isn’t insurmountable, and AMD has made some inroads thanks to the open source community, but in practice a generic CUDA layer requires a ton of R&D that AMD hasn’t been able to afford since the ATI acquisition. AMD has been fighting for its existence for most of that time and never had the money to invest in catching up to NVIDIA beyond the hardware. Even something as seemingly simple as porting BLAS to CUDA is a significant undertaking: you have to validate numerical codes while dealing with floating-point subtleties. The CPU versions of these libraries are so foundational and hard to get right that they’re still written in FORTRAN and haven’t changed much in decades. Everything built on top of those libraries then requires having customers who can help you test and profile real code in use. When people say that software isn’t a moat, they’re talking about basic CRUD over a business domain, where all it takes to replicate it is a competent developer and someone with experience in the industry. CUDA is about as far from that as you can get in software without stepping on Mentor Graphics’ or Dassault’s toes.
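
To make the floating-point subtlety concrete, here's a minimal sketch in Python (the data and the two summation orders are illustrative, not taken from any real BLAS port): addition isn't associative, so a parallel, GPU-style tree reduction and a sequential CPU loop can return different floats for the same mathematical sum, and every downstream result that quietly depended on the old bit pattern has to be re-checked.

```python
# Minimal sketch: floating-point addition is not associative, so changing
# the reduction order (as a GPU port inevitably does) changes the result.
import random

random.seed(0)
# Values spanning many orders of magnitude make the effect visible.
xs = [random.uniform(-1.0, 1.0) * 10.0 ** random.randint(-8, 8)
      for _ in range(100_000)]

# Sequential left-to-right sum, the order a simple CPU loop would use.
seq = 0.0
for x in xs:
    seq += x

# Pairwise (tree) reduction, the order a parallel GPU reduction typically uses.
def pairwise(v):
    if len(v) == 1:
        return v[0]
    mid = len(v) // 2
    return pairwise(v[:mid]) + pairwise(v[mid:])

par = pairwise(xs)

# Same mathematical sum, different floats.
print(seq, par, seq - par)
```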

There’s a second factor, which is that hardware companies tend to have horrible software cultures, especially when silicon is the center of gravity. The hardware guys in leadership discount the value of software, and that philosophy works its way down the hierarchy. In this respect NVIDIA is very much an outlier, and it shows in CUDA. Their moat isn’t just the software but the organization that allowed it to flourish in a hardware company, which predates their success in AI (NVIDIA has worked with game developers for decades to optimize individual games).

franktankbank 19 hours ago

Maybe nobody reputable has released non-Fortran versions, but they probably exist.

throwup238 16 hours ago

Lots of other versions exist, including reputable ones like Intel’s MKL. The hard part isn’t reimplementing it; it’s validating the output across a massive corpus of scientific work.
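
Here’s a rough sketch of what that validation looks like in miniature (Python/NumPy; `naive_gemm` is a hypothetical stand-in for a newly ported routine, not anyone’s actual kernel): the new implementation gets compared against whatever reference BLAS NumPy links, across many random shapes, to a tolerance someone has to argue for rather than bit-for-bit equality.

```python
# Rough sketch: validate a hypothetical reimplementation of one BLAS routine
# (a naive GEMM) against the installed reference BLAS via NumPy.
import numpy as np

def naive_gemm(a, b):
    """Hypothetical stand-in for a newly ported GEMM kernel."""
    m, k = a.shape
    k2, n = b.shape
    assert k == k2
    out = np.zeros((m, n), dtype=a.dtype)
    for i in range(m):
        for j in range(n):
            out[i, j] = np.sum(a[i, :] * b[:, j])
    return out

rng = np.random.default_rng(0)
for trial in range(20):
    m, k, n = rng.integers(1, 64, size=3)
    a = rng.standard_normal((m, k)).astype(np.float32)
    b = rng.standard_normal((k, n)).astype(np.float32)
    ref = a @ b                  # NumPy dispatches to the installed BLAS
    got = naive_gemm(a, b)
    # Bit-for-bit equality is the wrong test; the tolerances themselves have
    # to be justified to the people whose results depend on them.
    np.testing.assert_allclose(got, ref, rtol=1e-4, atol=1e-5)
print("all trials within tolerance")
```

Scale that harness up to every routine, every dtype, every hardware generation, and the real scientific codes customers actually run, and you get a sense of where the R&D budget goes.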

BLAS is just one example, though; it’s the tip of the iceberg.