mdp2021 9 days ago

> I suspect ... still wins in general world knowledge due to bigger size

Encyclopedic knowledge matters relatively little in perspective, given the expectable future developments: even the more knowledgeable of us will use that knowledge for reasoning and intuition (and we will have absorbed the intellectual keys during our training), but under our professional hat we should in theory be ready to go "I stand corrected" and "more precisely" with the actual data at hand.

I.e.: for the encyclopedic knowledge needed, the /understander/ will have a RAG subsystem and a corpus of knowledge to inquire upon processing queries.

(Corroboration: we can't delirate, and neither can the machine...)

▲

bitexploder 9 days ago | parent | next [-]

Don't LLMs work on attention though? The closer you can land your problem, in their hyperdimensional space, to their inherent understanding, the better they are at understanding your problem domain. RAG loops can be very slow and agents may simply lack the knowledge to use them correctly.

▲

mdp2021 7 days ago | parent [-]

But, in short, the ability to manage information, to process it properly, is more important in this regard than just having the information. "Having" more knowledge is no guarantee of "using" it better.

And to improve reliability, if the machine can check, it will have to check. "Costly" cannot be an excuse.

▲

zigzag312 3 days ago | parent [-]

Understanding a specific problem space can be a prerequisite for forming a proper query (i.e. asking the correct question).

A model doesn't know what it doesn't know.

▲

mdp2021 2 days ago | parent [-]

Your suggestion is not clear: yes, we reason and define relevant details (maybe through further information retrieval) to better construct queries - that is what the Analytical school of thought taught and insisted on - and even more crucial is that the subsequent delegated steps, of constructing replies, imply reasoning and information retrieval.

Said abilities - intellectual strength - are immensely more important than notions. The relation between network size and intellectual strength, vs network size and notions (original topic in this branch), is presumably not yet that clear. Intelligent models may not necessarily be embedded with explicit information of everything, though they will have to have ways to reach that upon contingent necessity (to solve specific problems). Like us.

▲

zigzag312 2 days ago | parent [-]

I agree with what you said. I just wanted to add that intelligent models probably need to have some notions embedded (but not everything), as some information retrieval is not trivial. Too few embedded notions will hurt its ability to solve problems, but from some point onward you'll get diminishing returns (where it starts to make sense to rely just on information retrieval).

For example, suppose you instruct a model to create a decoder for some data type that users will upload to your website. An intelligent model without notions will retrieve information about that data type and build a working decoder, but it might miss from context that users uploading to a website means untrusted input, and thus won't even try to gather information about what needs to be done to securely handle such uploaded data.

Or suppose you give it a task to translate text to a language it didn't encounter during training. You can provide it with grammar rules and a dictionary for information retrieval, but I guess it won't perform as well as an intelligent model that already has some fundamental notions of that language and only needs a dictionary to expand its vocabulary.

GPT-4.1 only knows a lot of patterns, but doesn't have the reasoning intelligence that would help it properly use that knowledge. So, a small reasoning model can easily beat it at a lot of tasks. The question is how, 14 months from now, new small reasoning models will compare to current big reasoning models.

How much information needs to be embedded is not yet clear, but currently, bigger reasoning models are still better at complex tasks than small reasoning models. Either the sweet spot of embedded notions is higher than what current small models have, or information retrieval ability needs to improve.

▲

pu_pe 9 days ago | parent | prev | next [-]

I agree with you in general, but depending on the task I also find that a certain level of encyclopedic knowledge can be very valuable. For example, if you use it for coding, the model will likely not resort to search or RAG when deciding whether to use a particular package or stack.

▲

coldcity_again 9 days ago | parent | prev [-]

A great position to take. Strong opinions, weakly held.