intothemild 11 hours ago

That's already happening. Qwen3.6 and Gemma4.

Basically small and medium models that are crazy well trained for their sizes.

Then we have a lot of speculative decoding stuff like MTP (multi-token prediction) and other techniques coming to speed up responses, and finally better quantisation to use less memory.
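For anyone who hasn't looked at how speculative decoding works, here's a rough sketch of the idea (the function names and toy models below are mine, not from any particular library): a small draft model proposes a few tokens cheaply, the big target model verifies them in one go, and you keep the longest prefix they agree on.

    # Minimal sketch of speculative decoding (greedy-acceptance variant).
    # draft_next and target_next are hypothetical stand-ins for a small
    # draft model and the large target model returning their greedy token.
    def speculative_step(context, draft_next, target_next, k=4):
        # 1. the cheap draft model proposes k tokens autoregressively
        proposed = []
        for _ in range(k):
            proposed.append(draft_next(context + proposed))

        # 2. the expensive target model checks them; keep the agreeing prefix
        accepted = []
        for tok in proposed:
            if target_next(context + accepted) == tok:
                accepted.append(tok)   # free token: target agreed with draft
            else:
                break                  # first disagreement ends the run

        # 3. always emit one token from the target so decoding makes progress
        accepted.append(target_next(context + accepted))
        return accepted

    # e.g. with toy "models" that just echo the context length:
    tokens = speculative_step([1, 2, 3],
                              draft_next=lambda c: len(c),
                              target_next=lambda c: len(c))

The real schemes accept or reject drafted tokens probabilistically rather than requiring an exact greedy match, but the speedup comes from the same place: the target model verifies a batch of drafted tokens in one forward pass instead of generating them one at a time.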

Local LLMs are the future, and the larger labs know that the open models will eat their lunch once people realise the gap is only a few months. If the closed models from a couple of months ago were good enough for us, then today's open models are good enough now.

krupan 10 hours ago | parent [-]

And how were those models developed and trained?

lelanthran 10 hours ago | parent [-]

> And how were those models developed and trained?

That's irrelevant to my decision to use local or not.

krupan 10 hours ago | parent | next [-]

That's not what this thread is about? We're saying some new breakthrough is needed, someone said it has already happened, and I'm asking whether it really has. Has it? I don't think so; those models are not fundamentally different from other LLMs.

lelanthran 9 hours ago | parent [-]

> We're saying some new breakthrough is needed, someone said it already has happened, and I'm asking if it really has.

I didn't read "and how were those models trained" as "Are we there yet?"

intothemild 2 hours ago | parent [-]

There's a percentage of people who love to question how the open models were trained. They almost always end up arguing that distilling from the closed frontier models is some form of theft.

Just totally forgetting that the frontier models themselves stole an insane amount to get to where they are.

It's theft all the way across the board, and when someone tries to argue that the open models' theft is bad but Altman's or Amodei's theft is fine, they're revealing a lot about themselves.

10 hours ago | parent | prev [-]
[deleted]