| ▲ | bsaul a day ago |
| That's an interesting hypothesis: that LLMs are fundamentally unable to produce original code. Do you have papers to back this up? That was also my reaction when I saw some strikingly accurate comments on a vibe-coded piece of code, but I couldn't prove it, and thinking about it now I think my intuition was wrong (i.e., LLMs do produce original, complex code). |
|
| ▲ | jacquesm a day ago | parent | next [-] |
We can approach that question in an intuitive way: if human input is not what is driving the output, then it would be sufficient to present the model with a fraction of the current inputs, say everything up to 1970, and have it generate everything from 1970 onwards as output. If that does not work, then the moment you introduce AI you cap its capabilities unless humans continue to create original works to feed it. The conclusion - to me, at least - is that these pieces of software regurgitate their inputs and are effectively whitewashing plagiarism, or, alternatively, that their ability to generate new content is capped by some arbitrary limit relative to the inputs.
| |
▲ | measurablefunc a day ago | parent | next [-] | | This is known as the data processing inequality. Non-invertible functions cannot create more information than what is available in their inputs: https://blog.blackhc.net/2023/08/sdpi_fsvi/. Whatever arithmetic operations are involved in laundering the inputs by stripping original sources & references cannot lead to novelty that wasn't already available in some combination of the inputs. Neural networks can at best uncover latent correlations that were already available in the inputs. Expecting anything more is basically just wishful thinking. | | |
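For reference, the standard form of the inequality being invoked (the usual textbook statement, not taken from the linked post):

  % Data processing inequality: if X -> Y -> Z is a Markov chain
  % (Z depends on X only through Y), then no processing of Y can
  % recover more information about X than Y already carries:
  \[
    X \to Y \to Z \;\Longrightarrow\; I(X;Z) \le I(X;Y)
  \]
  % In particular, for any deterministic function f: I(X; f(Y)) \le I(X;Y).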
▲ | xyzzy123 a day ago | parent | next [-] | | Using this reasoning, would you argue that a new proof of a theorem adds no new information that was not present in the axioms, rules of inference and so on? If so, I'm not sure it's a useful framing. For novel writing, sure, I would not expect much truly interesting progress from LLMs without human input because fundamentally they are unable to have human experiences, and novels are a shadow or projection of that. But in math – and a lot of programming – the "world" is chiefly symbolic. The whole game is searching the space for new and useful arrangements. You don’t need to create new information in an information-theoretic sense for that. Even on the non-symbolic side of computing (say, diagnosing a network issue), AIs can interact with things almost as directly as we can by running commands, so they are not fundamentally disadvantaged in terms of "closing the loop" with reality or conducting experiments. | | |
▲ | measurablefunc a day ago | parent [-] | | Sound deductive rules of logic cannot create novelty that exceeds the inherent limits of their foundational axiomatic assumptions. You cannot expect novel results from neural networks that exceed the inherent information capacity of their training corpus & the inherent biases of the neural network (encoded by its architecture). So if the training corpus is semantically unsound & inconsistent, then there is no reason to expect that it will produce logically sound & semantically coherent outputs (i.e. garbage inputs → garbage outputs). | | |
▲ | xyzzy123 18 hours ago | parent [-] | | Maybe? But it also seems like you are not accounting for new information at inference time. Let's pretend I agree the LLM is a plagiarism machine that can produce no novelty in and of itself that didn't come from what it was trained on, and produces mostly garbage (I only half agree lol, and I think "novelty" is under-specified here). When I apply that machine (with its giant pool of pirated knowledge) _to my inputs and context_ I can get results applicable to my modestly novel situation, which is not in the training data. Perhaps the output is garbage. Naturally, if my situation is way out of distribution I cannot expect very good results. But I often don't care if the results are garbage some (or even most!) of the time if I have a way to ground-truth whether they are useful to me. This might be via running a compile, a test suite, a theorem prover, or the mk1 eyeball. Of course the name of the game is to get agents to do this themselves, and this is now fairly standard practice. | | |
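To make that "ground-truth" step concrete, a rough sketch of the verify-then-accept pattern being described; the file layout and the use of pytest are assumptions for illustration, not anything from the thread:

  import pathlib
  import subprocess
  import tempfile

  def accept_if_verified(candidate_source: str) -> bool:
      """Keep an LLM-proposed module only if an external check passes.

      The check here is a pytest run (an assumption for this sketch); a
      compiler, a theorem prover, or a human eyeball fills the same role.
      """
      workdir = pathlib.Path(tempfile.mkdtemp())
      (workdir / "candidate.py").write_text(candidate_source)
      # Exit code 0 means the test suite ran and passed.
      result = subprocess.run(["pytest", str(workdir)], capture_output=True)
      return result.returncode == 0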
| ▲ | measurablefunc 18 hours ago | parent [-] | | I'm not here to convince you whether Markov chains are helpful for your use cases or not. I know from personal experience that even in cases where I have a logically constrained query I will receive completely nonsensical responses¹. ¹https://chatgpt.com/share/69367c7a-8258-8009-877c-b44b267a35... | | |
▲ | jacquesm 17 hours ago | parent [-] | | > Here is a correct, standard correction: It does this all the time, but as often as not it then outputs nonsense again, just different nonsense, and if you keep it running long enough it starts repeating previous errors (presumably because some sliding window is exhausted). | | |
▲ | measurablefunc 16 hours ago | parent [-] | | That's been my general experience, and that was the most recent example. People keep forgetting that unless they can independently verify the outputs, they are essentially paying OpenAI for the privilege of being very confidently gaslighted. | | |
▲ | jacquesm 10 hours ago | parent [-] | | It would be a really nice exercise - for which I unfortunately do not have the time - to have a non-trivial conversation with the best models of the day and then to rigorously fact-check every bit of output to determine the output quality. Judging from my own experience (probably not a representative sample), it would be a very meager showing. I now use AI only as a means of last resort, and then mostly as a source of inspiration rather than as a direct tool aimed at solving an issue. Used like that it has been useful on occasion, but it has at least as often been a tremendous waste of time. |
|
|
|
|
|
| |
▲ | nl 13 hours ago | parent | prev | next [-] | | This is simply not true. Modern LLMs are trained by reinforcement learning, where they try to solve a coding problem and receive a reward if they succeed. The data processing inequality (from your link) isn't relevant here: the model is learning from the reinforcement signal, not from human-written code. | | |
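A deliberately tiny analogy of that training signal (the task and the candidate pool are invented here, and real systems update model weights with policy gradients rather than reweighting a fixed list):

  import random

  # The reward comes from running the candidate against a check,
  # not from comparing it to human-written reference code.
  CANDIDATES = [
      "def add(a, b): return a - b",
      "def add(a, b): return a + b",
      "def add(a, b): return a * b",
  ]

  def reward(source: str) -> float:
      """1.0 if the generated function passes the unit test, else 0.0."""
      scope = {}
      exec(source, scope)  # run the candidate program
      try:
          return 1.0 if scope["add"](2, 3) == 5 else 0.0
      except Exception:
          return 0.0

  # Crude bandit-style update: sample, score, reweight toward rewarded code.
  weights = [1.0] * len(CANDIDATES)
  for _ in range(200):
      i = random.choices(range(len(CANDIDATES)), weights=weights)[0]
      weights[i] += reward(CANDIDATES[i])

  best = max(zip(weights, CANDIDATES))[1]
  print(best)  # converges on the candidate that passes the test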
| ▲ | jacquesm 10 hours ago | parent [-] | | Ok, then we can leave the training data out of the input, everybody happy. |
| |
▲ | cornel_io 19 hours ago | parent | prev [-] | | Theoretical "proofs" of limitations like this are always unhelpful because they're too broad, and apply just as well to humans as they do to LLMs. The result is true, but it doesn't actually impose any limitation that matters. | | |
| ▲ | measurablefunc 19 hours ago | parent [-] | | You're confused about what applies to people & what applies to formal systems. You will continue to be confused as long as you keep thinking formal results can be applied in informal contexts. |
|
| |
▲ | andsoitis a day ago | parent | prev | next [-] | | I like your test. Should we also apply it to specific humans? We all stand on the shoulders of giants and learn by looking at others’ solutions. | | |
▲ | jacquesm a day ago | parent | next [-] | | That's true. But if we take your implied rebuttal, then current-level AI would be able to learn from current AI as well as it learns from humans, just like humans learn from other humans. So far that does not seem to be the case; in fact, AI companies do everything they can to avoid eating their own tail. They'd love to eat their own tail if it were worth it. To me that's proof positive that they know their output is mangled inputs: they need that originality, otherwise they will sooner or later drown in nonsense and noise. It's essentially a very complex game of Chinese whispers. | | |
▲ | handoflixue 21 hours ago | parent | next [-] | | Equally, of course, all six-year-olds need to be trained by other six-year-olds; we must stop this crutch of using adult teachers. | | | |
| ▲ | andsoitis a day ago | parent | prev [-] | | I share that perspective. |
| |
| ▲ | a day ago | parent | prev [-] | | [deleted] |
| |
| ▲ | andrepd a day ago | parent | prev | next [-] | | Excellent observation. | |
| ▲ | ninetyninenine 20 hours ago | parent | prev | next [-] | | [dead] | |
| ▲ | bfffbgfdcb a day ago | parent | prev [-] | | [flagged] | | |
▲ | jacquesm a day ago | parent [-] | | I think my track record belies your very low-value and frankly cowardly comment. If you have something to say, at least do it under your real username instead of a throwaway. |
|
|
|
| ▲ | fpoling a day ago | parent | prev | next [-] |
Pick up a programming book from the seventies or eighties that is unlikely to have been scanned and fed into an LLM. Take a task from it that even a student could solve within 10 minutes and ask an LLM to write a program for it. If the problem was not published elsewhere, the LLM fails spectacularly. |
| |
▲ | crawshaw a day ago | parent | next [-] | | This does not appear to be true. Six months ago I created a small programming language. I had LLMs write hundreds of small programs in the language, using the parser, interpreter, and my spec as a guide for the language. The vast majority of these programs were either very close to or exactly what I wanted. No prior source existed for the programming language because I had created it from whole cloth days earlier. | | |
▲ | jazzyjackson a day ago | parent | next [-] | | Obviously you accidentally recreated a language from the 70s :P (I created a template language for JSON, added branching and conditionals, and realized I had a whole programming language. I was really proud of my originality until I was reading Ted Nelson's Computer Lib/Dream Machines and found out I had reinvented TRAC and, to some extent, XSLT. Anyway, LLMs are very good at reasoning about it because it can be constrained by a JSON Schema. People who think LLMs only regurgitate haven't given them a fair shot.) | | |
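Purely as a hypothetical illustration of the shape of such a thing (not jazzyjackson's actual language), a template that is itself plain JSON can carry a conditional and still be validated by a JSON Schema:

  def evaluate(node, env):
      """Recursively expand a JSON-shaped template against a dict of variables."""
      if isinstance(node, dict):
          if "$if" in node:  # hypothetical branching construct
              branch = "then" if evaluate(node["$if"], env) else "else"
              return evaluate(node.get(branch), env)
          if "$var" in node:  # hypothetical variable lookup
              return env[node["$var"]]
          return {k: evaluate(v, env) for k, v in node.items()}
      if isinstance(node, list):
          return [evaluate(item, env) for item in node]
      return node  # plain JSON scalars pass through unchanged

  template = {"greeting": {"$if": {"$var": "formal"},
                           "then": "Good day", "else": "hey"}}
  print(evaluate(template, {"formal": True}))  # {'greeting': 'Good day'}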
| ▲ | zahlman a day ago | parent [-] | | FWIW, I think a JSON-based XSLT-like thing sounds far more enjoyable to use than actual XSLT, so I'd encourage you to show it off. |
| |
▲ | fpoling a day ago | parent | prev [-] | | Languages with reasonable semantics are rather similar, and LLMs are good at detecting that and adapting from other languages. | | |
| ▲ | pertymcpert 20 hours ago | parent [-] | | Sounds like creativity and intelligence to me. | | |
▲ | tatjam 11 hours ago | parent [-] | | I think the key is that the LLM has no trouble mapping from one "embedding" of the language to another (the task they perform best at!), and that appears extremely intelligent to us humans, but it certainly is not all there is to intelligence. But just look at how LLMs struggle to handle dynamic, complex systems such as the one in the "vending machine" paper published some time ago. Those kinds of tasks, which we humans tend to think of as "less intelligent" than, say, converting human language to a C++ implementation, seem to have some kind of higher (or at least different) complexity than the embedding mapping done by LLMs. Maybe that's what we typically refer to as creativity? And if so, modern LLMs certainly struggle with it! Quite sci-fi that we have created a "mind" so alien that we struggle to even agree on the word to define what it's doing :) |
|
|
| |
| ▲ | handoflixue 21 hours ago | parent | prev | next [-] | | It's telling that you can't actually provide a single concrete example - because, of course, anyone skilled with LLMs would be able to trivially solve any such example within 10 minutes. Perhaps the occasional program that relies heavily on precise visual alignment will fail - but I dare say if we give the LLM the same grace we'd give a visually impaired designer, it can do exactly as well. | | |
▲ | tovej 19 hours ago | parent [-] | | I recently asked an LLM to give me one of the most basic and well-documented algorithms in the world: a blocked matrix multiply. It's essentially a few nested loops and some constants for the block size. It failed massively, spitting out garbage code where the comments claimed to use blocked access patterns but the code did not actually use them at all. LLMs are, frankly, nearly useless for programming. They may solve a problem every once in a while, but once you look at the code you notice it's either directly plagiarized or bad quality (or both, I suppose, in the latter case). |
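For reference, a minimal sketch of the algorithm being described, in Python for readability (the block size is arbitrary); the point is only that the comments here match what the loops actually do, which is the property the generated code lacked:

  BLOCK = 32  # arbitrary block size for this sketch

  def blocked_matmul(a, b, c, n):
      """c += a @ b for n-by-n matrices stored as lists of lists."""
      for ii in range(0, n, BLOCK):
          for kk in range(0, n, BLOCK):
              for jj in range(0, n, BLOCK):
                  # Multiply one BLOCK x BLOCK tile; min() handles ragged edges.
                  for i in range(ii, min(ii + BLOCK, n)):
                      for k in range(kk, min(kk + BLOCK, n)):
                          aik = a[i][k]
                          for j in range(jj, min(jj + BLOCK, n)):
                              c[i][j] += aik * b[k][j]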
| |
▲ | anjel a day ago | parent | prev | next [-] | | Sometimes it's generated, and many times it's not. Trivial to denote, but it's been deemed none of your business. |
▲ | ahepp a day ago | parent | prev [-] | | You've done this? I would love to read more about it. |
|
|
| ▲ | _heimdall a day ago | parent | prev | next [-] |
I have a very anecdotal, but interesting, counterexample. I recently asked Gemini 3 Pro to create an RSS feed reader type of experience by using XSLT to style and lay out an OPML file. I specifically wanted it to use a server-side proxy for CORS, pass through caching headers in the proxy to leverage standard HTTP caching, and I needed all feed entries for any feed in the OPML to be combined into a single chronological feed. It initially told me multiple times that it wasn't possible (it also reminded me that Google is getting rid of XSLT). Regardless, after I reiterated multiple times that it is possible, it finally decided to make a temporary POC. That POC worked on the first try, with only one follow-up to standardize date formatting with support for both Atom and RSS. I obviously can't say the code was novel, though I would be a bit surprised if it had trained on that task enough to remember roughly the full implementation and still claim it was impossible. |
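For context, the proxy half of that setup is roughly this shape; this is a hypothetical minimal sketch rather than the code Gemini produced, and the list of pass-through headers is my own assumption:

  from http.server import BaseHTTPRequestHandler, HTTPServer
  from urllib.parse import parse_qs, urlparse
  from urllib.request import urlopen

  PASS_THROUGH = ("Cache-Control", "ETag", "Last-Modified", "Content-Type")

  class FeedProxy(BaseHTTPRequestHandler):
      def do_GET(self):
          # A real proxy would validate the target URL before fetching it.
          target = parse_qs(urlparse(self.path).query).get("url", [None])[0]
          if not target:
              self.send_error(400, "missing ?url= parameter")
              return
          with urlopen(target) as upstream:  # fetch the OPML/RSS/Atom file
              body = upstream.read()
              self.send_response(200)
              self.send_header("Access-Control-Allow-Origin", "*")  # the CORS part
              for name in PASS_THROUGH:  # the caching part: forward upstream headers
                  if upstream.headers.get(name):
                      self.send_header(name, upstream.headers[name])
              self.end_headers()
              self.wfile.write(body)

  if __name__ == "__main__":
      HTTPServer(("localhost", 8080), FeedProxy).serve_forever()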
| |
▲ | jacquesm a day ago | parent [-] | | Why do you believe that to be a counterexample? In fragmentary form all of these elements must have been present in the input; the question is really how large the largest re-usable fragment was and whether or not, barring some transformations, you could trace it back to the original. I've done some experiments along the same lines to see what it spits out, and what I noticed is that from example to example the programming style changed drastically, to the point that I suspect it was mimicking even the style and not just the substance of the input data, and this over chunks of code long enough that they would definitely clear the bar for plagiarism. | | |
| ▲ | handoflixue 21 hours ago | parent [-] | | > In fragmentary form all of these elements must have been present in the input Yes, and Shakespeare merely copied the existing 26 letters of the English alphabet. What magical process do you think students are using when they read and re-combine learned examples to solve assignments? | | |
▲ | jacquesm 10 hours ago | parent [-] | | This same argument has now been made a couple of times in this thread (in different guises) and does absolutely nothing to move the conversation forward. Words and letters are not copyrightable patterns in and of themselves. It is the composition of words and letters that we consider to be original creations, and 'the bard' put them in a meaningful and original order not seen before, which is what established his reputation as a playwright. |
|
|
|
|
| ▲ | checkmatez 14 hours ago | parent | prev | next [-] |
> that LLMs are fundamentally unable to produce original code. What about humans? Are humans capable of producing completely original code or ideas or thoughts? As the saying goes, if you want to create something from scratch, you must first invent the universe. The human mind works by noticing patterns and applying them in different contexts. |
|
| ▲ | martin-t a day ago | parent | prev | next [-] |
The whole "reproduces training data verbatim" debate is a red herring. It reproduces _patterns from the training data_, sometimes including verbatim phrases. The work (discovering those patterns, figuring out what works and what does not, debugging some obscure heisenbug and writing a blog post about it, ...) was done by humans. Those humans should be compensated for their work, not the owners of mega-corporations who found a loophole in copyright. |
|
| ▲ | checker659 19 hours ago | parent | prev | next [-] |
| I think the burden of proof is on the people making the original claim (that LLMs are indeed spitting out original code). |
|
| ▲ | moron4hire a day ago | parent | prev [-] |
| No, the thing needing proof is the novel idea: that LLMs can produce original code. |
| |
▲ | marcus_holmes a day ago | parent | next [-] | | LLMs can definitely produce other original stuff: ask one to create an original poem on an extremely specific niche subject and it will do so. You can specify the niche subject to the point where it is incredibly unlikely that there is a poem on that subject in its training data, and it will still produce an original poem on that subject [0]. The well-known "otter using wifi on a plane" series of images [1] is another example: this was not in the training data (well, it is now, because it's well known, but you get the idea). Is there something unique about code, different from language (or images), that would make it impossible for an LLM to produce original code? I don't believe so, but I'm willing to be convinced. I think this switches the burden of proof: we know LLMs can produce original content in other contexts. Why would they not be able to create original code? [0] Ever curious, I tested this assumption. I got Claude to write an original limerick about goats oiling their beards with olive oil, which was the first reasonable thing I could think of as a suitably niche subject. I googled the result and could not find anything close to it. I then asked it to produce another limerick on the same subject, and it produced a different limerick, so it was obviously not just repeating training data. [1] https://www.oneusefulthing.org/p/the-recent-history-of-ai-in... | | |
| ▲ | jacquesm 18 hours ago | parent [-] | | No, it transformed your prompt. Another person giving it the same prompt will get the same result when starting from the same state. f('your prompt here') is a transformation of your prompt based on hidden state. | | |
| ▲ | marcus_holmes 14 hours ago | parent [-] | | This is also true of humans, see every debate on free will ever. The trick, of course, is getting to the exact same starting state. |
|
| |
▲ | ChromaticPanic 9 hours ago | parent | prev | next [-] | | This just reeks of a lack of understanding of how transformers work. Unlike Markov chains, which can only regurgitate known sequences, transformers can actually make new combinations. | |
| ▲ | handoflixue 21 hours ago | parent | prev [-] | | What's your proof that the average college student can produce original code? I'm reasonably certain I can get an LLM to write something that will pass any test that the average college student can, as far as that goes. | | |
▲ | moron4hire 8 hours ago | parent [-] | | I'm not asking about averages; I'm asking about any. There is no need to perform an academic research study to prove that humans are capable of writing original code, because the existence of our conversation right now is the counterexample that disproves the negation. Yes, it is true that a lot of humans remix existing code. But not all. It has yet to be proven that any LLM is doing something more than remixing code. I would submit as evidence for this idea (that LLMs are not capable of writing original code) the fact that not a single company using LLM-based AI coding has developed a novel product that has outpaced its competition. In any category. If AI really makes people "10x" more productive, then companies that adopted AI a year ago should be 10 years ahead of their competition. Substitute any value N > 1 you want and you won't see it. Indeed, given the stories we're seeing of the massive amounts of waste occurring within AI startups and companies adopting AI, it would suggest that N < 1. |
|
|