archerx 4 days ago

What’s bad for Nvidia is good for everyone else. The cuda lock-in needs to die.

pjmlp 4 days ago | parent | next [-]

It is on AMD and Intel to deliver.

nabla9 4 days ago | parent | next [-]

And they continue to fumble it. AMD has had time to catch up, a decade in fact. They simply don't understand that robust software support requires a significant investment on their side. Providing small amounts of funding for academic research doesn't suffice.

Meanwhile, Nvidia keeps building more and more libraries.

DSingularity 4 days ago | parent [-]

It’s not AMD, it’s their board. Unless the board approves billions of dollars in stock awards to motivate good engineers, nobody is going to join.

It’s not rocket science. They can identify many key personnel at Nvidia and make them offers that would be significantly better for them. Cycle every three years and repeat. Two or three cycles and you will have replicated the most important parts.

teeklp 4 days ago | parent [-]

It wasn't me that missed the deadline, it was my brain.

ants_everywhere 4 days ago | parent | prev | next [-]

Or one of the cloud providers that doesn't want to pay lock-in prices when they'd rather pay commodity prices.

Twirrim 4 days ago | parent | next [-]

Not sure cloud providers will care; all the costs get passed on to the customers. There's already far more demand for GPUs than the supply chain can meet, too.

If they were sitting on excess stock, or struggling to sell, sure.

coredog64 4 days ago | parent | prev | next [-]

The cloud providers all have their own Nvidia alternatives. Having worked with more than one, I would rate them not much better than AMD when it comes to software.

topspin 4 days ago | parent | prev | next [-]

How feasible is this for a cloud operation? I imagine this work requires close collaboration with the architects and proprietary knowledge of the design.

ants_everywhere 4 days ago | parent [-]

It seems feasible; it's more a matter of how much of a priority it is.

I follow Google most closely. They design and manufacture their own accelerators. AWS I know manufactures its own CPUs, but I don't know if they're working on or already have an AI accelerator.

Several of the big players are working on OpenXLA, which is designed to abstract and commoditize the GPU layer: https://openxla.org/xla

OpenXLA mentions:

> Alibaba, Amazon Web Services, AMD, Apple, Arm, Google, Intel, Meta, and NVIDIA
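
To make concrete what abstracting the GPU layer looks like, here's a minimal sketch using JAX, one of the frameworks that compiles through XLA. The function and shapes are made up for illustration; nothing in it is specific to any one vendor's hardware:

    # Minimal sketch, assuming JAX is installed with an XLA backend for your
    # hardware. The same Python code is JIT-compiled by XLA and runs on
    # whatever accelerator is present (CPU, GPU, or TPU) with no
    # vendor-specific kernel code.
    import jax
    import jax.numpy as jnp

    @jax.jit  # XLA compiles this for the detected backend
    def affine(x, w, b):
        return jnp.dot(x, w) + b

    x = jnp.ones((8, 128))
    w = jnp.ones((128, 64))
    b = jnp.zeros((64,))

    print(jax.devices())          # e.g. [CpuDevice(id=0)] or [CudaDevice(id=0)]
    print(affine(x, w, b).shape)  # (8, 64)

As I understand it, the hardware-specific part lives in the XLA backend plugin, which is exactly the layer those vendors are collaborating on.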

mdaniel 4 days ago | parent [-]

> AWS I know manufactures its own CPUs, but I don't know if they're working on or already have an AI accelerator

I believe those are the Inferentia: https://aws.amazon.com/ai/machine-learning/inferentia/

> AWS Inferentia chips are designed by AWS to deliver high performance at the lowest cost in Amazon EC2 for your deep learning (DL) and generative AI inference applications

but I don't know this second if they're supported by the major frameworks, or what

I also hadn't remembered https://aws.amazon.com/ai/machine-learning/trainium/ until I was looking up that page, so it seems they're trying to have a competitor to the TPUs just naming them dumb, because AWS

> AWS Trainium chips are a family of AI chips purpose built by AWS for AI training and inference to deliver high performance while reducing costs.
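
For what it's worth, the documented path from PyTorch to these chips is AWS's Neuron SDK. Below is a rough, untested sketch based on my reading of the public docs, assuming a Trn1/Inf2 instance with the torch_neuronx package installed:

    # Rough sketch, assuming an AWS Trn1/Inf2 instance with the Neuron SDK
    # and the torch_neuronx package (AWS's PyTorch integration for
    # Trainium/Inferentia). Untested; based on the public docs.
    import torch
    import torch_neuronx

    model = torch.nn.Linear(128, 64).eval()
    example = torch.ones(1, 128)

    # Ahead-of-time compile for NeuronCores, analogous in spirit to
    # torch.jit.trace but going through the Neuron compiler.
    neuron_model = torch_neuronx.trace(model, example)

    print(neuron_model(example).shape)  # torch.Size([1, 64])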

ants_everywhere 4 days ago | parent [-]

Thanks, this is useful!

> have a competitor to the TPUs just naming them dumb, because AWS

I kind of like "Trainium", although "Inferentia" I could take or leave. At least it's nice that the names tell you the intended use case.

pjmlp 4 days ago | parent | prev [-]

With what software though?

gary_0 3 days ago | parent | prev [-]

It's my guess that they don't want to. A decade ago, when AMD's CPUs had trouble competing at the high end, they ceded that market segment almost entirely, and now they're doing the same with nVidia. And Intel is basically a dead company. Neither of them is going to risk the capital and internal shake-up necessary to actually compete with nVidia. And anyway, there's only so much TSMC capacity for high-end chips, and Apple and nVidia have already spent infinity dollars reserving most of it.

hgehjddfy 3 days ago | parent [-]

Ummm, are you forgetting that Epyc and Ryzen are now unmatched?

If AMD wants to, they can compete.

gary_0 3 days ago | parent [-]

Unmatched by Intel, who have been failing for a decade, so there was no competition? They mostly won by default. And Apple's chips are giving them a run for their money. If Apple sold plain CPUs that weren't locked to their software (they never will, but hypothetically) then AMD would let themselves slide into 2nd place again.

That really makes three companies that are happy to concede to nVidia, because Apple could definitely challenge nVidia if they wanted to.

Note: I'm not saying that AMD sucks, just that their corporate culture prevents them from being very ambitious.

AuthAuth 3 days ago | parent [-]

Apple's chips don't even come close. Their benchmarks compete in specific tasks and then measure by metrics like performance per watt. These are benchmarks AMD's CPUs aren't optimizing for, and yet they're still close. Once you take power consumption out of the tests and broaden them, AMD's CPUs come out ahead. Apple had something impressive with the M1, then within a year the other mobile CPU manufacturers came out with something on par. A year after that, they had surpassed Apple.

Apple's closest CPU competition is Qualcomm, and they don't win that.

dismalaf 3 days ago | parent | prev [-]

What's crazy to me is that ROCm and SYCL are open source, yet somehow they're harder to install and cover less of their own vendors' hardware than CUDA does for Nvidia...