UncleOxidant 6 hours ago

If we didn't have a RAM/GPU shortage right now they would be more nervous than they are. But as it is, very few people are going to be able to afford a rig that can run this model effectively. That's probably not going to change for several more years yet. I think if the Z.ai folks decide to come out with a flash version of GLM-5.2 specialized for coding that came in at about 80B params, then the US frontier labs would probably be more worried. Overall, the Chinese AI companies have been showing the way to do the same amount with less (sometimes much less), and as that trend continues it's going to make the frontier labs worried - but even the Chinese AI companies are going to want to protect their moat by not releasing capable models that are significantly smaller than their current flagship models. Alibaba Qwen seems to be there now - it's gotten mighty quiet from them lately - their latest 395B model is just too large for most folks to run at home and they don't seem to be making any noises about releasing smaller ones this time around.
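
For a rough sense of scale on the 80B vs. 395B point, here is a back-of-the-envelope sketch of how much memory the weights alone would take at a given quantization. The function name and the bit widths are illustrative assumptions, and the estimate ignores KV cache, activations, and runtime overhead, so real requirements are higher.

```python
def weight_memory_gb(num_params: float, bits_per_param: float) -> float:
    """Approximate memory for model weights only, in GB (1 GB = 1e9 bytes).

    Assumption: every parameter is stored at the same bit width.
    Ignores KV cache, activations, and framework overhead.
    """
    if num_params <= 0 or bits_per_param <= 0:
        raise ValueError("num_params and bits_per_param must be positive")
    bytes_total = num_params * bits_per_param / 8
    return bytes_total / 1e9


if __name__ == "__main__":
    # Parameter counts are the ones mentioned in the comment above;
    # the 4-bit and 16-bit widths are assumed for illustration.
    for label, params in [("80B", 80e9), ("395B", 395e9)]:
        for bits in (4, 16):
            gb = weight_memory_gb(params, bits)
            print(f"{label} at {bits}-bit: ~{gb:.1f} GB of weights")
```

Even under the assumed 4-bit quantization, the 395B weights come to roughly 200 GB, while an 80B model would need about 40 GB.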

gpm 5 hours ago | parent | next [-]

The ram/gpu shortage won't last forever though. Moreover we can be pretty confident that long-term the prices will obey Wright's law and come down significantly (from the pre-shortage prices) as we learn to produce them more efficiently.
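
For anyone unfamiliar, Wright's law says unit cost falls by a roughly constant percentage each time cumulative production doubles. A minimal sketch of that relationship follows. The function name is hypothetical, and the 20% learning rate in the usage lines is an assumed value for illustration, not a measured figure for RAM or GPUs.

```python
import math


def wright_unit_cost(first_unit_cost: float,
                     cumulative_units: float,
                     learning_rate: float) -> float:
    """Unit cost after `cumulative_units` have been produced.

    Wright's law: cost(n) = cost(1) * n ** log2(1 - learning_rate),
    so every doubling of cumulative output multiplies the cost by
    (1 - learning_rate).
    """
    if cumulative_units < 1:
        raise ValueError("cumulative_units must be >= 1")
    if not 0 <= learning_rate < 1:
        raise ValueError("learning_rate must be in [0, 1)")
    exponent = math.log2(1.0 - learning_rate)
    return first_unit_cost * cumulative_units ** exponent


if __name__ == "__main__":
    # Assumed 20% learning rate, starting from an arbitrary cost of 100.
    for doublings in range(5):
        units = 2 ** doublings
        cost = wright_unit_cost(100.0, units, 0.20)
        print(f"{units:>3} cumulative units: {cost:.1f}")
```

Note that the curve is driven by cumulative volume, not calendar time, so how fast prices fall depends on how fast production ramps.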

LLM companies are valued as if they're going to have some enduring monopoly that they can extract money from... GLM-5.2 and similar models make that valuation very very questionable.

UncleOxidant 5 hours ago | parent | next [-]

> The ram/gpu shortage won't last forever though.

No disagreement there, but it could easily last another 3 to 5 years, which is a long time in tech terms.

DougN7 2 hours ago | parent [-]

Long enough for them to IPO and all the execs to retire. I doubt they care beyond the IPO.

mannanj 5 hours ago | parent | prev [-]

> The ram/gpu shortage won't last forever though

Don't underestimate the market's ability to remain irrational.

colinsane 3 hours ago | parent [-]

the companies which have the power to alleviate these shortages are the same companies who are profiting most from the shortage. scarcity is an asset, it's not irrational that a concentrated market will produce more of that asset.

selectodude 3 hours ago | parent [-]

The solution for high prices is high prices.

If making RAM and SSDs is now cause for a 10-figure valuation, after enough time somebody will dive in.

elorant 5 hours ago | parent | prev | next [-]

Very few people, but quite a lot of companies, especially after per-token pricing took effect and companies see their invoices skyrocketing. You pay an upfront cost once and you’re done.

dannyw 2 hours ago | parent | prev | next [-]

When a large open-weight model is released, a lab, a startup, or a rich hobbyist can easily do logit-level distillation and create an XXB-param model or whatever, and in theory you should get a really good distill.
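
To make "logit-level distillation" concrete: the student is trained to match the teacher's full output distribution rather than just the sampled tokens. Below is a minimal pure-Python sketch of the per-token loss, assuming the common temperature-softened KL formulation. The function names, the default temperature, and the T² scaling are illustrative assumptions; a real setup would compute this over the whole vocabulary in a tensor framework.

```python
import math
from typing import List, Sequence


def log_softmax(logits: Sequence[float], temperature: float) -> List[float]:
    """Numerically stable log-softmax of logits / temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    log_z = peak + math.log(sum(math.exp(x - peak) for x in scaled))
    return [x - log_z for x in scaled]


def distillation_loss(teacher_logits: Sequence[float],
                      student_logits: Sequence[float],
                      temperature: float = 2.0) -> float:
    """KL(teacher || student) on temperature-softened distributions.

    Scaled by temperature**2, a common convention (assumed here) that
    keeps gradient magnitudes comparable across temperatures.
    """
    if len(teacher_logits) != len(student_logits):
        raise ValueError("teacher and student must share a vocabulary size")
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    log_p = log_softmax(teacher_logits, temperature)  # teacher
    log_q = log_softmax(student_logits, temperature)  # student
    kl = sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))
    return kl * temperature ** 2


if __name__ == "__main__":
    teacher = [4.0, 1.0, 0.5]   # made-up logits for a 3-token vocabulary
    student = [2.0, 2.0, 0.5]
    print(f"loss when mismatched: {distillation_loss(teacher, student):.4f}")
    print(f"loss when identical:  {distillation_loss(teacher, teacher):.4f}")
```

This only works when you have the teacher's logits, which is why open weights matter here: an API that returns just text gives you far less signal per token.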

verdverm 5 hours ago | parent | prev [-]

I suspect the time horizon is shorter because of software advances. We are getting more capability out of smaller models.

Alibaba released Qwen 3.6 "tiny" models not that long ago, they punch way above their weight(s)

UncleOxidant 3 hours ago | parent [-]

> Alibaba released Qwen 3.6 "tiny" models not that long ago, they punch way above their weight(s)

True, Qwen3.6-27B is amazing for its size. However, it seems likely that we're not going to see any more of these smaller models from Alibaba/Qwen since several key players exited that organization a few months back.

Infernal 3 hours ago | parent | next [-]

Do we know where those key players went?

verdverm 2 hours ago | parent | prev [-]

Good to know. I think the trend is clear based on the models coming out of China, and we'll see more capabilities in smaller, more efficient models.