I think it is in the interest of chip makers to make sure we all get local models.

qalmakka 8 hours ago | parent | next [-]

I think they're in a win-win situation. Big AI companies would love to see local computing die in favour of the cloud because they are well aware that the moment an open model that can run on non-ludicrous consumer hardware appears, they're screwed. In that situation Nvidia, AMD and the like would be the only ones profiting, even though I'm not convinced they'd prefer going back to fighting for B2C while B2B is so much simpler for them.

zozbot234 8 hours ago | parent | next [-]

If you want to run AI models at scale and with reasonably quick response, there aren't many alternatives to datacenter hardware. Consumer hardware is great for repurposing existing "free" compute (including gaming PCs, pro workstations etc. at the higher end) and as basic insurance against rug pulls from the big AI vendors, but increased scale will probably still bring very real benefits.

qalmakka 8 hours ago | parent [-]

Currently, yes. But I don't find it hard to imagine that in a while we could get reasonably light open models with a level of reasoning similar to current Opus, for instance. In such a scenario, how many people would opt to pay for a way more expensive cloud subscription? Especially since lots of people are already not that interested in paying for frontier models nowadays, even where it makes sense. Unless we keep on getting a constant, never-ending stream of improvements, we're basically bound to get to a point where, unless you really need it, you are OK with the basic, cheaper local alternative that you don't have to pay for monthly.

	zozbot234 7 hours ago | parent | next [-]
		I think average users are already okay with the reasoning level they'd get with current open models. But the big AI firms have pivoted their frontier models towards the enterprise: coding and research, as opposed to general chat. And scale is quite important for these uses; ordinary pro hardware is not enough.
	twoodfin 7 hours ago | parent | prev [-]
		This is really just a question of product design meeting the technology. Today, lots of integer compute happens on local devices for some purposes, and in the cloud for others. The same is already true for matmul: lots of FLOPS are being spent locally on photo and video processing, speech to text, … No obvious reason you wouldn’t want to specialize LLM tasks similarly, especially as long-running agents increasingly take over from chatbots as the dominant interaction architecture.

BobbyJo 8 hours ago | parent | prev [-]

At a consistent amount of usage, datacenters are at least an order of magnitude more hardware-efficient. I'm sure Nvidia and AMD would be fine fighting for B2C if it meant volume would be 10x or more.

Now, given they can't satisfy current volume, they are forced to settle for just having crazy margins.

qalmakka 7 hours ago | parent [-]

The problem with B2C is that you need to have leverage of some kind (more demanding applications, planned obsolescence, ...) in order to get people to keep on buying your product. The average consumer may simply be satisfied with the old product they already own and only replace it when it breaks down. By contrast, with the cloud you can keep people hooked on getting the latest product whether they need it or not, and get artificial demand from datacentres and such.

	try-working 16 minutes ago | parent | next [-]
		Future upgrade cycles on phones, laptops and PCs will be driven by SoCs that embed some type of ASIC that runs a specific model. Every 6 months there will be a new, better version to upgrade to, which will require a new device. This is how Apple will be able to reduce cycles from 3 years to 6-12 months.
	BobbyJo 6 hours ago | parent | prev [-]
		I think businesses running datacenters are much less likely to frivolously buy the latest GPUs with no functional incentive than general consumers are...

zozbot234 8 hours ago | parent | prev | next [-]

Definitely. Many big hardware firms are directly supporting HuggingFace for this very reason.

ninjahawk1 8 hours ago | parent | prev [-]

True, chip companies have the opposite mindset. Nvidia is releasing its own open-weight models, I believe.