LLMs Are Not a Higher Level of Abstraction

▲ LLMs Are Not a Higher Level of Abstraction (lelanthran.com)

70 points by lelanthran 9 hours ago | 56 comments

▲ samstokes 10 minutes ago | parent | next [-]

I wouldn't agree that LLMs are a higher level of abstraction, but I've found they do help me think at a higher level of abstraction, by temporarily outsourcing cognitive load.

With changes like substantial refactors or ambitious feature additions, it's easy to exceed the infamous "seven things I can remember at once":

  * the idea for the big change itself
  * my reason for making the change
  * the relevant components and how they currently work
  * the new way they'll fit together after the change
  * the messy intermediate state when I'm half finished but still need a working system to get feedback
  * edge cases I'm ignoring for now but will have to tackle eventually
  * actual code changes
  * how I'm going to test this

Good lab notes, specs etc can help, but it's a lot to keep in mind. In practice these often turn into multi person projects, and communication is hard so that often means delay or drift. Having an agent temporarily worry about

  * wiring a new parameter through several layers
  * writing a test harness for an untested component
  * experimentally adding multibyte character support on a branch

frees up my mental bandwidth for the harder parts of the problem.

The main benefit is to defer the concern until I have a mostly working system. Then I come back and review its output, since I'm still responsible for what it delivers, and I want better than "mostly working".

▲ dimtion 3 hours ago | parent | prev | next [-]

I'm not sure why people struggle with the fact that an abstraction can be built on top of a non-deterministic and stochastic system. Many such abstractions already exist in the world we live in.

Take sending a packet over a noisy, low SNR cell network. A high number of packets may be lost. This doesn't prevent me, as a software developer, from building an abstraction on top of a "mostly-reliable" TCP connection to deliver my website.

There are times when the service doesn't work, particularly when the packet loss rate is too high. I can still incorporate these failures into my mental model of the abstraction (e.g. through TIMEOUTs, CONN_ERRs…).

Much of engineering and reliability history revolves around building mathematical models on top of an unpredictable world. We are far from solving this problem with LLMs, but this doesn't prevent me from thinking of LLMs as a new level of abstraction that can edit and transform code.
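
A minimal sketch of the idea above, assuming a simulated channel rather than real sockets: a retry loop turns a lossy send into a mostly-reliable one, and whatever failure remains surfaces as a named error the caller can plan for. `lossy_send`, `reliable_send`, `DeliveryError`, and the loss rates are hypothetical names and values for illustration, not part of TCP or any library.

```python
import random


class DeliveryError(Exception):
    """Raised when every retry was lost; the failure mode is explicit."""


def lossy_send(packet: str, loss_rate: float, rng: random.Random) -> bool:
    """Hypothetical unreliable channel: True if the packet arrived."""
    return rng.random() >= loss_rate


def reliable_send(packet: str, loss_rate: float, rng: random.Random,
                  max_retries: int = 8) -> int:
    """Retry until delivered; return the attempt count or raise DeliveryError."""
    for attempt in range(1, max_retries + 1):
        if lossy_send(packet, loss_rate, rng):
            return attempt
    raise DeliveryError(f"gave up after {max_retries} attempts")
```

With `loss_rate=0.0` the first attempt succeeds; with `loss_rate=1.0` the caller gets a `DeliveryError` rather than a silent hang, which is the kind of failure that fits into a mental model.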

▲

distalx 3 hours ago | parent | next [-]

A transmission error has a strictly contained, predictable blast radius. If a packet drops, the system knows exactly how to handle it: it throws a timeout, drops a connection, or asks for a retry. The worst-case scenario is known.

A reasoning error has an infinite, unpredictable blast radius. When an LLM hallucinates, it doesn't fail safely; it writes perfectly compiling code that does the wrong thing. That "wrong thing" might just render a button incorrectly, or it might silently delete your production database, or open a security backdoor.

You can build reliable abstractions over failures that are predictable and contained. You cannot abstract away unpredictable destruction.

▲

yunwal 2 hours ago | parent | next [-]

> A reasoning error has an infinite, unpredictable blast radius.

Says who? It’s quite easy to limit the blast radius of a reasoning error.

	▲	distalx an hour ago | parent | next [-]
		In 2024, a Chevy dealership deployed an AI chatbot that confidently agreed to sell a customer a 2024 Chevy Tahoe for $1. It executed a catastrophic business failure simply because it didn't know the logic was wrong. Sure, you can patch that specific case with guardrails, but how many unpredictable edge cases are you going to cover? It only takes a user with a bit of ingenuity to circumvent them. There are already several examples of AI agents getting stuck in infinite loops, burning through massive API bills while achieving absolutely nothing. You can contain a system failure, but you cannot contain a logic failure if the system doesn't know the logic is wrong.
	▲	amazingamazing 2 hours ago | parent | prev [-]
		How so? Suppose you had: Math() Add() Subtract() Program() Math(“calculate rate”) This is intentionally written vaguely. How do you ensure that Program() runs and does the right thing when there is no guarantee that Math() or its components are correct? Normally you could use a typed programming language, unit tests, etc., but if the LLM is the ultimate abstraction, programs will be written like the above. At some point traditional software engineering principles will need to apply.

▲

td2 2 hours ago | parent | prev [-]

I mean, if you're talking about packets, you're already one abstraction above the real data transmission, which is noisy. So bits can randomly flip, noise could be interpreted as bits, and bits could get lost. A much larger blast radius.

▲

zadikian 2 hours ago | parent | prev | next [-]

I'm fine with that. The part that makes it not really an abstraction is, you still deliver code in the end. It'd be different if your deliverable were prompt+conversation, and the code were merely an intermediate build artifact. Usually people throw away the convo. Some have tried making markdown files the deliverable instead, so far that doesn't really work.

It makes even less sense when people compare an LLM to a compiler. Imagine making a pull request that's just adding a binary because you threw the source code away.

▲

mpyne 2 hours ago | parent [-]

The whole field of reproducible builds is only a field because compilers also have had trouble historically of producing binary artifacts with guaranteed provenance and binary compatibility even when built from the same source codes.

If I assign a bug fix ticket to a human developer on my team, I won't be able to precisely replicate how they go about solving the bug but for many bugs I can at least be assured that the bug will get solved, and that I understand the basic approach the assigned dev would use to troubleshoot and resolve the ticket.

This is an organizational abstraction but it's an abstraction just the same, leaky as it is.

	▲	kibwen 44 minutes ago | parent | next [-]
		> The whole field of reproducible builds is only a field because compilers also have had trouble
		No, this is not comparable. The reason reproducible builds are tricky is not because compilers are inherently prone to randomness, it's because binaries often bake-in things like timestamps and the exact pathnames of the system used to produce the build. People need to stop comparing LLMs to compilers, it's an embarrassingly poor analogy.
	▲	z3c0 an hour ago | parent | prev [-]
		It's an abstraction for you, not the rest of that developer's team, who have to reproduce the same solution even after said developer has "won the lottery", so-to-speak. inb4: "Don't worry, just use GPT to make the docs"
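
To make kibwen's point about reproducible builds concrete, here is a toy sketch (not a real compiler): the `build` function is fully deterministic, and two artifacts differ only because a timestamp is baked into the output. `build`, `digest`, and the header format are hypothetical names chosen for illustration.

```python
import hashlib
from typing import Optional


def build(source: str, timestamp: Optional[str] = None) -> bytes:
    """Toy 'compiler': a pure function of its inputs."""
    # Baked-in build metadata is what makes otherwise identical builds differ.
    header = f"built-at:{timestamp}\n" if timestamp is not None else ""
    return (header + source.upper()).encode("utf-8")


def digest(artifact: bytes) -> str:
    """Fingerprint an artifact so two builds can be compared."""
    return hashlib.sha256(artifact).hexdigest()


source = "int main() { return 0; }"
stamped_a = digest(build(source, "2024-01-01T00:00:00"))
stamped_b = digest(build(source, "2024-01-02T00:00:00"))
clean_a = digest(build(source))
clean_b = digest(build(source))
```

The stamped digests differ while the clean ones match: the variation comes from an extra input, not from randomness inside the translation step.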

▲

evrydayhustling 3 hours ago | parent | prev | next [-]

Besides deeply unpredictable factors (like signal transmission), most users of higher-level abstractions do so without certainty about how the translation will be executed. For example, one of the main selling points of C when I was growing up was that you could write code independent of architecture, and leave the architecture-specific translation to assembly to the compiler!

Abstractions often embrace nondeterministic translation because lower level details are unknown at time of expression -- which is the motivation for many LLM queries.

▲

qazxcvbnmlp 27 minutes ago | parent | prev | next [-]

Grocery stores are a level of abstraction. Exchange money, get food. If your whole life you had grown food, it might feel a bit strange.

Occasionally the low-level details leak through, e.g. this egg came from this farm, or there's a shipping issue so onions are more expensive, or whatever.

I think llm assisted coding is going to work something like this.

▲

yomismoaqui an hour ago | parent | prev | next [-]

The kicker is when you delegate some work to another team member and discover that humans are also non-deterministic.

▲

dominotw 2 hours ago | parent | prev [-]

That would make sense if AI said "fail. I don't know". Its active deception is what makes it difficult.

▲ taraharris 13 minutes ago | parent | prev | next [-]

The claim is that compilers were f(x) -> y, and LLMs are f(x) -> P(y | z1 | z2 | ... z3).

But how were various combinations of popular programming languages, operating systems and hardware platforms not effectively f(x) -> P(y | z1 | z2 | ... z3)? Suppose you were quick on the take and were writing in Unix and C in the early 80s and found yourself porting your program from a PDP-something to an 8088 PC, or to a 68k Mac, dealing with DOS extenders, printer drivers, different versions of C (remember K&R style?) or C++? Remember MFC? The evolution of the STL?

LLMs are similar to that maelstrom, just on a faster timescale.

▲ yongjik 3 hours ago | parent | prev | next [-]

It's orthogonal to whether LLMs can be a useful abstraction layer, but ...

I have a feeling that if LLMs were built on a deterministic technology, a lot of the current AI-is-not-intelligent crowd would be saying "These LLMs can only generate one answer given a question, which means they lack human creativity and they'll never be intelligent!"

▲

byzantinegene 43 minutes ago | parent [-]

it is a fruitless endeavour to try to appeal to a crowd that does not and will never understand the fundamentals of how llms work.

	▲	ofjcihen 28 minutes ago | parent [-]
		I think that crowd would agree with you.

▲ Legend2440 2 hours ago | parent | prev | next [-]

I don't agree with this take. Determinism is a nice property for abstractions to have, but it isn't necessary to be an abstraction.

And LLMs can handle very abstract concepts that could not possibly be encoded in C++, like the user's goal in using software.

▲

farmdawgnation 2 hours ago | parent [-]

I think you could also make the case that the existing abstractions aren't actually fully deterministic themselves. The compiler or interpreter may not behave as it should. Therefore, for any correct C code, there's a probability that the GCC compiler will turn it into correctly formed machine code. But it may not!

Is the probability much higher with GCC? Sure. But it's still a probability.

	▲	anon-3988 3 minutes ago | parent [-]
		I am sorry but this is an insane take. The probability of GCC going haywire with your special snowflake correct C code? Please. Has this EVER happened to you? I am not talking about the performance of the generated assembly because that IS flaky, but functionality-wise I do not think so.
		If people are so confident about the determinism of LLMs, or at least consider it on par with compilers, please ask it to compile your source code instead. Better yet, replace all your GNU utils with an LLM instead. Replace your `ls` with `codex "prompt"`.

▲ jefurii an hour ago | parent | prev | next [-]

I don't feel that this piece explains its title very well (to me), though the idea expressed by the title is spot-on.

I've gone through hand-coding HTML, CGI, CMSes, web frameworks, and CMSes built with web frameworks. Each is (roughly) a layer of abstraction on top of lower layers.

People talk about LLMs as an extension of this layering but they're not. With the layers of abstraction I've listed you can go down to the layers underneath and understand them if you take the time.

LLMs are something different. They're a replacement for or a simulation of the thinking process involved in programming at various layers.

▲ madisonmay 3 hours ago | parent | prev | next [-]

LLMs are not inherently non-deterministic during inference. I don't believe non-determinism implies lack of abstraction. Abstraction is simply hiding detail to manage complexity.

▲

danpalmer 2 hours ago | parent [-]

Non-determinism is configurable at the level of the mathematical model, but current production systems do not support deterministic evaluation of LLMs.

	▲	orbital-decay 8 minutes ago | parent [-]
		They do, though. Providers don't because batching makes it cheaper. Among the providers, DeepSeek seems to support it for v4 (and have actually optimized their kernels for batching), and Gemini Flash is "almost deterministic".

▲ royal__ an hour ago | parent | prev | next [-]

I agree, but I think it's for a different reason than what the author says: LLMs are a very leaky abstraction compared to other levels, meaning it's much harder to convey the true intent of logic you are trying to encode through natural language, and often by doing so you are just relying on the LLM to "get it right", which is inherently messy business. Oftentimes, that leakiness just doesn't matter that much. Other times, it does.

▲ ofjcihen 27 minutes ago | parent | prev | next [-]

Tangential to the subject matter but has anyone else noticed that night time tends to have more people arguing that LLMs are intelligent and the daytime tends to have more arguing that they aren’t?

▲ bigstrat2003 4 hours ago | parent | prev | next [-]

You're right, but the reality is that the people who are excited about LLMs don't care about determinism. They are happy to hand off the thinking to a third party, even if it will give wrong answers they don't notice.

▲ calf 3 hours ago | parent | prev | next [-]

There are a few things being confused because people are having to learn/re-learn/re-discover basic computer science classes, but both formal specifications and informal specifications - such as pseudocode (I balk imagining how many AI users might not know this term), or natural language documentation - are all forms of abstraction. Programming languages and underlying models of computation all enable varying degrees of hiding details or emphasizing important ideas/information. Human thought and language, and mathematics, are already examples of abstraction in general. LLMs thus also purport to provide a higher kind of abstraction (via a computational model alternative to Turing machines); the debate is whether it is a good one, whether its hallucinations make it unreliable, etc.

▲ legerdemain 4 hours ago | parent | prev | next [-]

This is absurd. The author misrepresents the type of "abstraction" that people mean. This abstraction ladder goes as follows:

  - contributing individually
  - contributing as a tech lead
  - contributing as a technical manager
  - leaving the occupation to open a vanity business, such as a gastropub or horse shoeing service

▲

maplethorpe 3 hours ago | parent [-]

Abstraction has a specific meaning in computer programming. I don't think he's misrepresenting it.

https://en.wikipedia.org/wiki/Abstraction_(computer_science)

▲

LeCompteSftware 3 hours ago | parent [-]

OP is being a bit tongue-in-cheek, I believe they mean that some vibe coders really want to be abstracted away from their own jobs, and are very much not interested in computer-scientific abstraction.

	▲	maplethorpe 3 hours ago | parent [-]
		Oh.

▲ jqpabc123 9 hours ago | parent | prev | next [-]

In other words, LLMs are probabilistic, not deterministic.

▲

kibwen 38 minutes ago | parent | next [-]

Determinism is a red herring here. The problem is that LLMs are inductive systems, not deductive systems. This makes them powerfully general, and yet inherently unreliable.

▲

sscaryterry 8 hours ago | parent | prev [-]

Dare I say, so are humans?

▲

jqpabc123 7 hours ago | parent [-]

This used to be a big reason why we used computers --- to help eliminate the probability of error.

But apparently, not so much any more.

▲

mpyne 2 hours ago | parent | next [-]

Digital computers were named after the humans whose jobs they automated out of existence.

They were invented to reduce cost of computation, not to eliminate the probability of error per se. Ask a Windows 11 user, they'll tell you computers still make errors.

	▲	card_zero 13 minutes ago | parent [-]
		No, I'm pretty sure it does it on purpose.

▲

somewhereoutth 5 hours ago | parent | prev [-]

Right, it was the perfect match: Humans for fuzzy touchy feely stuff, computers for hard edged correct calculations. How have we managed to screw this up so badly?

	▲	irishcoffee 3 hours ago | parent [-]
		I think the big unmentioned elephant in the room is the gambling/dopamine aspect of using an LLM. It’s to the point where people at $dayjob joke about it… but they’re not joking. That’s how it got screwed up so badly.
		We have a bunch of engineers paying money to open loot boxes and they get visibly upset when they run out of tokens. LLM companies have done an absolutely brilliant job of figuring out how to burn more tokens quickly, couch it as “more advanced” and people throw money at them.
		I realize this wasn’t the thrust of your point, but tangentially, we fucked it up so badly because people desperately want to ignore this bit, and instead of looking at these tools analytically, there are the ardent defenders and the staunchly opposed… much like every other topic under the sun these days.
		I use the free stuff work pays for, and I’ve never hit any token limit or anything like that. But I’m also trying extremely hard to ensure my skillsets don’t atrophy. I just use the web interface and ask questions. I have no interest in tying my development experience directly into an LLM, not after what I’ve seen at work over the last few weeks.

▲ cyanydeez 5 hours ago | parent | prev | next [-]

This makes sense, but you need to understand that you're ignoring the compiler once you're past the machine code level, which isn't an abstraction, right, it's the root. So ignoring that part of the missive, going from C to Python, different compilers do add different machine code.

C and Python have a bunch of different compilers, so even if you take the same code, the f' output can be different. There's determinism within the same compiler. Add in different architectures, and the machine code output is definitely more varied than presented.

But that's still manageable; then if you add in all the dependencies, well, you get a more florid complexity.

So really, it's a shitty abstraction rather than an inaccurate analogy. If you lined them up in levels, there could be some universe where they are a valid abstraction. But it's not the current universe, because we know the models function on non-determinism.

I'd posit if there was a 'turtles all the way down' abstraction for the LLM, it's simply coming from the other end, the one where human mind might start entering the picture.

▲ conorbergin 4 hours ago | parent | prev [-]

LLMs are deterministic, the same model under the same conditions will produce the same output, unless some randomness is purposefully injected. Neural networks in general can be thought of as universal function approximators.

▲

mrob 3 hours ago | parent | next [-]

Whenever somebody calls LLMs "non-deterministic", assume they meant "chaotic", in the informal sense of being a system where small changes of input can cause large changes to output, and the only way to find out if it will happen is by running the full calculation.

For many applications, this is equally troublesome as true non-determinism.
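
A sketch of "deterministic but chaotic" in the informal sense above, using the logistic map as a stand-in (it is a textbook chaotic system, not a model of an LLM). The starting values, the size of the nudge, and the step count are arbitrary choices for illustration.

```python
from typing import List


def logistic_orbit(x0: float, steps: int, r: float = 4.0) -> List[float]:
    """Iterate x -> r * x * (1 - x); fully deterministic, no randomness."""
    orbit = [x0]
    for _ in range(steps):
        x0 = r * x0 * (1.0 - x0)
        orbit.append(x0)
    return orbit


base = logistic_orbit(0.2, 100)
nudged = logistic_orbit(0.2 + 1e-10, 100)

# Same input, same output: rerunning reproduces the orbit exactly.
repeatable = base == logistic_orbit(0.2, 100)

# Largest gap between the two orbits over the run.
max_gap = max(abs(a - b) for a, b in zip(base, nudged))
```

Rerunning with the same input reproduces the orbit exactly, yet a nudge in the tenth decimal place eventually produces a completely different trajectory, and the only way to see where it ends up is to run the iteration.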

▲

conorbergin 2 hours ago | parent [-]

I don't think LLMs are that chaotic, you can replace words in an input and get a similar answer, and they are very good at dealing with typos.

They are definitely not interpretable, I was reading some stuff from mechanistic interpretability researchers saying they've given up trying to build a bottom up model of how they work.

	▲	mylifeandtimes an hour ago | parent [-]
		> I don't think LLMs are that chaotic, you can replace words in an input and get a similar answer, and they are very good at dealing with typos.
		Compare
		"You are a helpful assistant. Your task is to <100 lines of task description> <example problem>"
		with
		"you are a helpless assistant. Your task is to <100 lines of task description> <example problem>"
		I've changed 3 or 4 CHARACTERS ("ful" to "less") out of a (by construction) 1000+ character prompt, and the outputs are not at all similar.
		Just realized I've never tried the "you are a helpless ass" prompt. Again a very minor change in wording, just dropping a few letters. The helpless assistant at least output text apologizing for being so bad at the task.

▲

2ndorderthought 3 hours ago | parent | prev | next [-]

That's not really true. If you turn a few knobs you can make them deterministic. Namely setting temperature to zero, and turning off all history. But none of the cloud providers do this. Because it's not a product as far as they are concerned. So in practice - not so much.

▲

maplethorpe 3 hours ago | parent | next [-]

Can someone explain why this is? Do LLMs somehow contain a true random number generator? Why wouldn't they produce the same outputs given the same inputs (even temperature)?

edit: I'm not talking about an LLM as accessed through a provider. I'm just talking about using a model directly. Why wouldn't that be deterministic?

▲

anon373839 3 hours ago | parent | next [-]

The model outputs a probability distribution for the next token, given the sequence of all previous tokens in the context window. It’s just a list of floats in the same order as the list of tokens that the tokenizer uses.

After that, a piece of software that is NOT the LLM chooses the next token. This is called the sampler. There are different sampling parameters and strategies available, but if you want repeatable* outputs, just take the token with the highest probability number.

* Perfect determinism in this sense is difficult to achieve because GPU calculations naturally have a minor bit of nondeterminism. But you can get very close.
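
A toy sketch of the model/sampler split described above. `TOY_MODEL` is a made-up lookup table that conditions only on the last token (a real LLM conditions on the whole context window), and `next_token_distribution`, `greedy_sampler`, and `generate` are hypothetical names.

```python
from typing import Dict, List

# Hypothetical stand-in for the model: last token -> next-token probabilities.
TOY_MODEL: Dict[str, Dict[str, float]] = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.5, "dog": 0.3, "end": 0.2},
    "a": {"cat": 0.1, "dog": 0.9},
    "cat": {"sat": 0.7, "end": 0.3},
    "dog": {"sat": 0.2, "end": 0.8},
    "sat": {"end": 1.0},
}


def next_token_distribution(context: List[str]) -> Dict[str, float]:
    """The 'model' step: return a distribution, never a decision."""
    return TOY_MODEL[context[-1]]


def greedy_sampler(dist: Dict[str, float]) -> str:
    """The 'sampler' step: take the highest-probability token."""
    return max(dist, key=dist.get)


def generate(max_tokens: int = 10) -> List[str]:
    """Alternate model and sampler until 'end' or the token budget."""
    context = ["<s>"]
    while len(context) <= max_tokens:
        token = greedy_sampler(next_token_distribution(context))
        if token == "end":
            break
        context.append(token)
    return context[1:]
```

Swapping `greedy_sampler` for a randomized one changes the output without touching the table at all, which is the sense in which the sampler is separate software.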

	▲	2ndorderthought 3 hours ago | parent [-]
		I'm not so sold the LLM is an LLM without a sampler but it's not worth quibbling over. It's part of the statistical model anyways.

▲

evrydayhustling 3 hours ago | parent | prev | next [-]

An LLM model itself -- that is, the weights and the mathematical functions linking them -- does not tell you exactly how to train from data, nor how to generate an output. Instead, it describes a function providing relative likelihood(output | input).

Deciding how to pick a particular output given that likelihood function is left as an exercise for the user, which we call inference.

One obvious choice is to keep picking the highest likelihood token, feed it into the model, and get another -- on repeat. This is what most algorithms call "temperature=0". But doing this for token after token can lead to boring output, or steer you into pathological low-probability sequences like a set of endless repeats.

So, the current SOTA is to intentionally introduce a random factor (temperature>0) to the sampling process -- along with other hacks, like explicit suppression of repeats.
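
A rough sketch of how a temperature knob is commonly implemented: divide the logits by the temperature, softmax them into weights, and draw from the result, with temperature 0 treated as plain argmax. `sample_token` and the example logits are hypothetical; real inference stacks layer more on top (top-k, top-p, repetition penalties), none of which is shown here.

```python
import math
import random
from typing import List


def sample_token(logits: List[float], temperature: float,
                 rng: random.Random) -> int:
    """Return the index of the chosen token."""
    if temperature == 0:
        # Greedy decoding: always the highest-scoring token.
        return max(range(len(logits)), key=logits.__getitem__)
    # Scale by temperature, then softmax into sampling weights.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    weights = [math.exp(s - peak) for s in scaled]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


logits = [1.0, 3.0, 2.0]
rng = random.Random(0)
greedy = [sample_token(logits, 0.0, rng) for _ in range(5)]
warm = [sample_token(logits, 5.0, rng) for _ in range(200)]
```

At temperature 0 every draw is the same index; at a high temperature the weights flatten and other tokens start appearing.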

▲

2ndorderthought 3 hours ago | parent | prev [-]

Yea sure. So temperature is baked into these LLM models and when it isn't zero it increases the probability of taking a different path to decode the tokens. Whether it's at a provider or downloaded on your own machine.

Technically even when the temperature is 0 it's not deterministic but it's more likely to be... You can have ties in probabilities for generating the next words. And floating point noise is real.

All these models are doing is guesstimating the next token to say.
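
A small illustration of the ties-plus-floating-point point above. Floating-point addition is not associative, so summing the same numbers in a different order can change the last bit, and a near-tie in the argmax can then flip. The scores are contrived for the demo and `argmax` is a hypothetical helper.

```python
from typing import List


def argmax(scores: List[float]) -> int:
    """Index of the largest score; ties resolve to the lowest index."""
    return max(range(len(scores)), key=scores.__getitem__)


parts = [0.1, 0.2, 0.3]
left_to_right = (parts[0] + parts[1]) + parts[2]
right_to_left = parts[0] + (parts[1] + parts[2])

# Same mathematical sum, different rounding.
orders_disagree = left_to_right != right_to_left

# Token 0 has a fixed score; token 1's score depends on summation order.
fixed_score = 0.6
pick_run_1 = argmax([fixed_score, left_to_right])
pick_run_2 = argmax([fixed_score, right_to_left])
```

Here the two runs pick different tokens even though nothing random happened. On a GPU the summation order can depend on how work is batched and scheduled, which is one way this kind of last-bit noise can show up in practice.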

▲

slashdave 2 hours ago | parent | prev [-]

Eh, conceptually true, but in practice, it is rather hard to get any decent performance out of a GPU and still produce a deterministic answer.

And in any case, setting the temperature to zero will not produce a useful result, unless you don't mind your LLM constantly running into infinite loops.

▲

alansaber 3 hours ago | parent | prev | next [-]

Yes, there's a good Thinking Machines Lab blog post about this.

▲

0-_-0 3 hours ago | parent | prev [-]

You're being downvoted, but you're right. Determinism is a different concept and doesn't characterise LLMs well. You can have deterministic random number generators for example.
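
As a quick illustration of a deterministic random number generator, using Python's standard `random` module: the same seed reproduces the same "random" sequence. `draw` and the seed value are arbitrary choices for the example.

```python
import random
from typing import List


def draw(seed: int, n: int = 5) -> List[int]:
    """Draw n pseudo-random integers from a generator seeded with `seed`."""
    rng = random.Random(seed)
    return [rng.randint(0, 99) for _ in range(n)]


first = draw(42)
second = draw(42)
same_sequence = first == second
```

The output looks random, yet it is a pure function of the seed, so "random-looking" and "non-deterministic" are separate properties.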