| ▲ | andrei_says_ 4 hours ago |
| “Novel” to the person who has not consumed the training data. Otherwise, just training data combined in highly probable ways. Not quite autocomplete but not intelligence either. |
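A minimal sketch of what "training data combined in highly probable ways" looks like in the simplest possible case: a toy bigram sampler. The corpus and code below are illustrative assumptions, not anything from the thread. Every transition it emits was seen verbatim in the corpus, yet the full string it produces need not appear anywhere in it.

```python
# Toy bigram "language model": sample next words in proportion to how often
# they followed the current word in the training corpus.
import random
from collections import defaultdict, Counter

corpus = "the cat sat on the mat . the dog sat on the rug .".split()  # assumed toy corpus

# Count bigram successors.
successors = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    successors[prev][nxt] += 1

def sample_next(word: str) -> str:
    counts = successors[word]
    words, weights = zip(*counts.items())
    return random.choices(words, weights=weights)[0]

# Generate a short sequence; the whole sentence may be "novel", but every
# individual transition is lifted straight from the training data.
word = "the"
out = [word]
for _ in range(8):
    word = sample_next(word)
    out.append(word)
print(" ".join(out))
```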
|
| ▲ | pc86 4 hours ago | parent | next [-] |
What is the difference between "novel" and "novel to someone who hasn't consumed the entire corpus of training data, which is several orders of magnitude more than any human being could consume"?
| ▲ | adrian_b 3 hours ago | parent | next [-] | | The difference is that when you do not know how a problem can be solved, but you know that this kind of problem has been solved countless times before by various programmers, you know it is likely that asking an AI coding assistant for a solution will get you an acceptable one. On the other hand, if the problem you have to solve has never been solved before at a quality satisfactory for your purpose, it is futile to ask an AI coding assistant for a solution, because the proposed solution is almost certain to be unacceptable (unless the AI succeeds in duplicating the feat of the proverbial monkey that types out a Shakespearean text by hitting keys at random). | |
|
|
| ▲ | soulofmischief 4 hours ago | parent | prev [-] |
| Citation needed that grokked capabilities in a sufficiently advanced model cannot combinatorially lead to contextually novel output distributions, especially with a skilled guiding hand. |
| ▲ | arcanemachiner 4 hours ago | parent [-] | | Pretty sure burden of proof is on you, here. | | |
| ▲ | soulofmischief 4 hours ago | parent [-] | | It's not, because I haven't ruled out the possibility. I could share anecdata about how my discussions with LLMs have led to novel insights, but it's not necessary. I'm keeping my mind open, but you're asserting an unproven claim that is currently not community consensus. Therefore, the burden of proof is on you. | | |
| ▲ | adrian_b 2 hours ago | parent [-] | | I agree that after discussions with an LLM you may be led to novel insights. However, such novel insights are not novel due to the LLM, but due to you. The "novel" insights are either novel only to you, because they concern something you have not studied before, or they are genuinely new ideas that you yourself generated while trying to explain what you want to the LLM. It happens very often that someone reaches novel insights about something they believed they already understood well only after trying to explain it to another, ignorant human, and discovers in the process that the supposed understanding was actually incorrect or incomplete. | | |
| ▲ | soulofmischief 2 hours ago | parent [-] | | The point is that the combined knowledge/process of the LLM and a user (which could be another LLM!) led it to walk the manifold in a way that produced a novel distribution for a given domain. I talk with LLMs for hours out of the day, every single day. I'm deeply familiar with their strengths and shortcomings on both a technical and an intuitive level. I push them to their limits and have definitely witnessed novel output. The question remains: just how novel can this output be? Synthesis is a valid way to produce novel data. And beyond that, we are teaching these models general problem-solving skills through RL, and it's not absurd to consider the possibility that a good enough training regimen can impart deduction/induction skills powerful enough to produce novel information even by means other than direct synthesis of existing information. Especially when given affordances such as the ability to take notes and browse the web. | | |
| ▲ | irishcoffee an hour ago | parent [-] | | > I push them to their limits and have definitely witnessed novel output. I’m quite curious what these novel outputs are. I imagine the entire world would like to know of an LLM producing completely new outputs that no human has ever thought of before. Here is where I get completely hung up. Take 2+2. An LLM has never taken 2 groups of two items and reached the enlightenment of 2+2=4. It only knows that because it was told that. If enough people start putting 2+2=3 on the internet, who knows what the LLM will spit out. There was that example a while back where an LLM would happily suggest that all humans should eat 1 rock a day. Amusingly, even _that_ wasn’t a novel idea for the LLM; it simply regurgitated what it scraped from a website about humans eating rocks. Which leads to the crux: how much patently false information have LLMs scraped and internalized? | |
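A toy illustration of the claim being made here, hypothetical and far simpler than a real LLM: a "model" that answers "2+2=" by picking the most frequent completion in its corpus will answer with whatever the corpus says, not with arithmetic.

```python
# Hypothetical illustration (not how a real LLM works): answer "2+2=" purely
# by the most frequent completion seen in an assumed, skewed corpus.
from collections import Counter

corpus = ["2+2=4", "2+2=4", "2+2=3", "2+2=3", "2+2=3"]  # assumed, deliberately skewed

completions = Counter(line.split("=")[1] for line in corpus if line.startswith("2+2="))
print(completions.most_common(1)[0][0])  # -> '3': the corpus, not arithmetic, decides
```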
| ▲ | soulofmischief an hour ago | parent [-] | | This is not a correct approximation of what happens inside an LLM. They form probabilistic logical circuits which approximate the world they have learned through training. They are not simply recalling stored facts. They are exploiting organically-produced circuitry, walking a manifold, which leads to the ability to predict the next state in a staggering variety of contexts. As an example: https://arxiv.org/abs/2301.05217 It's not hard to imagine that a sufficiently developed manifold could theoretically allow LLMs to interpolate or even extrapolate information that was missing from the training data, but is logically or experimentally valid. | | |
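For context, the cited paper (arXiv:2301.05217) studies small transformers trained on modular addition. Below is a rough sketch of that experimental setup, with an assumed MLP stand-in and made-up hyperparameters rather than the paper's exact model: train on only a fraction of all (a + b) mod p pairs and track accuracy on pairs the network never saw. With weight decay and enough steps, held-out accuracy can keep rising after training accuracy saturates, which is the "grokking" behavior referenced above.

```python
# Sketch of a modular-addition generalization experiment (illustrative
# hyperparameters; not the cited paper's exact architecture or schedule).
import torch
import torch.nn as nn

torch.manual_seed(0)
p = 97                                           # modulus
pairs = torch.cartesian_prod(torch.arange(p), torch.arange(p))
labels = (pairs[:, 0] + pairs[:, 1]) % p

def encode(ab):
    # One-hot encode both operands and concatenate.
    return torch.cat([nn.functional.one_hot(ab[:, 0], p),
                      nn.functional.one_hot(ab[:, 1], p)], dim=1).float()

perm = torch.randperm(len(pairs))
split = int(0.3 * len(pairs))                    # train on 30% of all pairs
train_idx, test_idx = perm[:split], perm[split:]
x_train, y_train = encode(pairs[train_idx]), labels[train_idx]
x_test, y_test = encode(pairs[test_idx]), labels[test_idx]

model = nn.Sequential(nn.Linear(2 * p, 256), nn.ReLU(), nn.Linear(256, p))
opt = torch.optim.AdamW(model.parameters(), lr=1e-3, weight_decay=1.0)
loss_fn = nn.CrossEntropyLoss()

for step in range(20000):                        # long runs; shorten for a quick look
    opt.zero_grad()
    loss = loss_fn(model(x_train), y_train)
    loss.backward()
    opt.step()
    if step % 1000 == 0:
        with torch.no_grad():
            train_acc = (model(x_train).argmax(1) == y_train).float().mean().item()
            test_acc = (model(x_test).argmax(1) == y_test).float().mean().item()
        print(f"step {step:6d}  train acc {train_acc:.2f}  held-out acc {test_acc:.2f}")
```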
| ▲ | emp17344 31 minutes ago | parent [-] | | You could find a pre-print on Arxiv to validate practically any belief. Why should we care about this particular piece of research? Is this established science, or are you cherry-picking low-quality papers? | | |
| ▲ | soulofmischief 23 minutes ago | parent [-] | | I don't need to reach far to find preliminary evidence of circuits forming in machine learning models. Here's some research from OpenAI researchers exploring circuits in vision models: https://distill.pub/2020/circuits/ Are these enough to meet your arbitrary quality bar? Circuits are the basis for features. There is still a ton of open research on this subject. I don't care what you care about, the research is still being done and it's not a new concept. |
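The linked circuits work starts from low-level learned features such as the edge and curve detectors in early convolution layers. A small sketch, assuming torchvision is available, of how one might pull those first-layer filters out of a pretrained ResNet to inspect them:

```python
# Sketch (assumes torchvision with pretrained weights): extract and normalize
# the first-layer convolution filters of a pretrained ResNet-18. These filters
# are the kind of low-level features the circuits line of work builds on.
import torch
from torchvision.models import resnet18, ResNet18_Weights

model = resnet18(weights=ResNet18_Weights.DEFAULT)
filters = model.conv1.weight.detach()            # shape (64, 3, 7, 7)
print(filters.shape)

# Normalize each filter to [0, 1] so it can be saved or plotted as a tiny RGB image.
f = filters - filters.amin(dim=(1, 2, 3), keepdim=True)
f = f / f.amax(dim=(1, 2, 3), keepdim=True)
```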
|
|
|
|
|
|
|
|