The only thing that we are sure can't be highly compressed is knowledge, because you can only fit so much information in given entropy budget without losing fidelity.

The minimal size limits of reasoning abilities are not clear at all. It could be that you don't need all that many parameters. In which case the door is open for small focused models to converge to parity with larger models in reasoning ability.

If that happens we may end up with people using small local models most of the time, and only calling out to large models when they actually need the extra knowledge.

▲

idle_zealot 3 hours ago | parent | next [-]

> and only calling out to large models when they actually need the extra knowledge

When would you want lossy encoding of lots of data bundled together with your reasoning? If it is true that reasoning can be done efficiently with fewer parameters it seems like you would always want it operating normal data searching and retrieval tools to access knowledge rather than risk hallucination.

And re: this discussion of large data centers versus local models, do recall that we already know it's possible to make a pretty darn clever reasoning model that's small and portable and made out of meat.

	▲	dryarzeg 17 minutes ago \| parent \| next [-]
		> we already know it's possible to make a pretty darn clever reasoning model There's is a problem though: we know that it is possible, but we don't know how to (at least not yet and as far as I am aware). So we know the answer to "what?" question, but we don't know the answer to "how?" question.
	▲	adrianN 2 hours ago \| parent \| prev [-]
		I would call brains with the needed support infrastructure small.

▲

yorwba 2 hours ago | parent | prev [-]

I think you underestimate the amount of knowledge needed to deal with the complexities of language in general as opposed to specific applications. We had algorithms to do complex mathematical reasoning before we had LLMs, the drawback being that they require input in restricted formal languages. Removing that restriction is what LLMs brought to the table.

Once the difficult problem of figuring out what the input is supposed to mean was somewhat solved, bolting on reasoning was easy in comparison. It basically fell out with just a bit of prompting, "let's think step by step."

If you want to remove that knowledge to shrink the model, we're back to contorting our input into a restricted language to get the output we want, i.e. programming.