gchadwick 4 days ago

I'd say there's a mix of 'Chinese GPUs are not that good after all' and 'Nvidia doesn't have any magical secret sauce, and China could easily catch up' going on. Nvidia GPUs are indeed remarkable devices with a complex software stack that offers all kinds of possibilities that you cannot replicate overnight (or over a year or two!).

However, they've also got a fair amount of generality: anything you might want to do that involves huge amounts of matmuls and vector maths you can probably map to a GPU and do a half-decent job of it. This is good for things like model research and exploration of training methods.

Once this is all developed, you can cherry-pick a few specific things to be good at and build your own GPU that concentrates on making those things work well (such as inference and training on Transformer architectures). You can then catch up to Nvidia on those aspects even if you cannot beat or match a GPU on every possible task, which doesn't matter because you only want to do some specific things well.

This is still hard, and model architectures and training approaches are continuously evolving. Simplify things too much and target some ultra-specific things and you end up with some pretty useless hardware that won't allow you to develop next year's models, nor run this year's particularly well. You can just develop and run last year's models. So you need to hit a sweet spot: enough flexibility to keep up with developments, but not so much that you have to totally replicate what Nvidia have done.

Ultimately the 'secret sauce' is just years of development producing a very capable architecture that offers huge flexibility across differing workloads. You can shortcut that development by reducing flexibility or not caring that your architecture is rubbish at certain things (hence no magical secret sauce). This is still hard and your first gen could suck quite a lot (hence not that good after all), but when you've got a strong desire for an alternative hardware source you can probably put up with a lot of short-term pain for the long-term payoff.

FooBarWidget 4 days ago | parent [-]

What does "are not good after all" even mean? I feel there are too many value judgements in that question's tone, and that blinds Western observers. I feel like the tone has the hidden implication of "this must be fake after all, they're only good at faking/stealing, nothing to see here, move along".

Are they as good as Nvidia? No. News reporters have a tendency to hype things up beyond reality. No surprises there.

Are they useless garbage? No.

Can the quality issues be overcome with time and R&D? Yes.

Is being "worse" a necessary interim step to become "good"? Yes.

Are they motivated to become "good"? Yes.

Do they have a market that is willing to wait for them to become "good"? Also yes. It used to be no, but the US created this market for them.

Also, comparing Chinese AI chips to Nvidia is a bit like comparing AWS with Azure. Overcoming compatibility problems is not trivial: you can't just lift and shift your workload to another public cloud; you are best off redesigning your entire infra for the capabilities of the target cloud.

rich_sasha 4 days ago | parent | next [-]

I think my question made it clear I'm not simply assuming China is somehow cheating here - either in the specs of their current product, or in stealing IP.

No, I just struggle to reconcile (but many answers here go some way to clarifying) Nvidia being the pinnacle of the R&D-driven tech industry - not according to me but to global investors - and China catching up seemingly easily.

FooBarWidget 4 days ago | parent [-]

Unfortunately I think global investors are quite dumb. For example, all the market analysts were very positive about ASML, Nvidia, etc., but they all assumed sales to China would continue according to projections that don't take US sanctions or Chinese competition into account. Every time a sanction landed or a Chinese competitor made a major step forward, it was surprise Pikachu, even though enthusiasts who follow news on this topic saw it coming years ago.

gchadwick 4 days ago | parent | prev [-]

To me at least, "not good after all" means their current latest hardware has issues that mean it cannot replace Nvidia GPUs yet. This is a hard problem, so not getting there yet doesn't imply bad engineering, just a reflection of the scale of the challenge! It also doesn't imply that, if this generation is a miss, following generations couldn't be a large win. Indeed, I think it would be very foolish to assume that Alibaba or other Chinese firms cannot build devices that can challenge Nvidia here on the basis of the current generation not being up to it yet. As you say, they have a large market that's willing to wait for them to become good.

Plus it may not be true; this new Alibaba chip could turn out to be brilliant.