noosphr 5 hours ago

A library is deterministic.

LLMs are not.

That we let a generation of software developers rot their brains on js frameworks is finally coming back to bite us.

We can build infinite towers of abstraction on top of computers because they always give the same results.

LLMs by comparison will always give different results. I've seen it first-hand, when a $50,000 LLM-generated (but human-guided) code base just stopped working and no one had any idea why or how to fix it.

Hope your business didn't depend on that.

doug_durham 2 hours ago | parent | next [-]

Why would that necessarily happen? With an LLM you have perfect knowledge of the code: at any time you can ask it to explain any part. That is one of the superpowers of these tools. They also accelerate debugging by letting you add comprehensive logging, which the LLM can then use to track down the source of problems. You should try it.

mikestorrent 5 hours ago | parent | prev | next [-]

The thing is, it's possible to ask the LLM to add dynamic tracing, logging, metrics, a debug REPL, whatever you want to instrument your codebase with. You have to know to want that, and where it's appropriate to use. You still have to (with AI assistance) wire that all up so that it's visible, and you have to be able to interpret it.
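The kind of instrumentation described here can start as small as a logging decorator the agent wires through the codebase. A minimal sketch, assuming Python and the standard `logging` module; the `trace` and `combobulate` names are illustrative, not from the thread:

```python
import functools
import logging
import time

logging.basicConfig(level=logging.DEBUG,
                    format="%(asctime)s %(levelname)s %(message)s")
log = logging.getLogger("trace")

def trace(fn):
    """Log entry, exit, duration, and exceptions for a function."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        log.debug("enter %s args=%r kwargs=%r", fn.__name__, args, kwargs)
        start = time.perf_counter()
        try:
            result = fn(*args, **kwargs)
        except Exception:
            log.exception("error in %s", fn.__name__)
            raise
        log.debug("exit %s in %.3f ms", fn.__name__,
                  (time.perf_counter() - start) * 1e3)
        return result
    return wrapper

@trace
def combobulate(x):
    return x * 2
```

The point is exactly the one above: you have to know to ask for this, and to decide which functions are worth the logging noise.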

If you didn't ask for traceability, if you didn't guide the actual creation and just glommed spaghetti on top of sauce until you got semi-functional results, that was $50k badly spent.

noosphr 5 hours ago | parent [-]

And if that had been done, the $50k code base would have been a $5,000,000 code base, because the context would be ten times as large and LLM attention cost is quadratic in context length.

If only we taught developers under 40 what x^2 means instead of React.
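A back-of-envelope sketch of the x^2 point: self-attention compute grows with the square of the context length, so ten times the context is roughly a hundred times the attention cost. (The unit cost here is an arbitrary placeholder; real API pricing also has a large linear per-token component, so the 100x is an upper-bound intuition, not a bill.)

```python
def attention_cost(tokens: int, unit_cost: float = 1.0) -> float:
    # Self-attention compares every token with every other token: O(n^2).
    return unit_cost * tokens ** 2

# 10x the context length -> 100x the attention compute.
ratio = attention_cost(100_000) / attention_cost(10_000)
print(ratio)  # 100.0
```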

Jorge1o1 4 hours ago | parent [-]

While I agree with your sentiment, I just want to say that if your approach is to have the LLM read every file into context, or you're working in some gigantic thread (using the million-token capacity most frontier models have), that's really not the best way to do it.

Not even a human would work that way... you wouldn't open 300 different python files and then try to memorize the contents of every single file before writing your first code-change.

Additionally, you get worse performance at longer context lengths anyway, so there are reasons beyond cost to keep context small [1].

Things that have helped me manage context sizes (working in both Python and kdb+/q):

- Keep your AGENTS.md small but useful. In it you can give rules like "every time you work on a file in the `combobulator` module, you MUST read `combobulator/README.md`." And in those READMEs you point to the other files that are relevant, etc. And of course you have Claude write the READMEs for you...

- Don't let logs and other output fill up your context. Tell the agent to redirect logs and then grep over them, or run your scripts with a different loglevel.

- Use tools rather than letting it go wild with `python3 -c`. These little scripts eat context like there's no tomorrow. I've seen the bots write little python scripts that send hundreds of lines of JSON into the context.

- This last tip is more subjective, but I think there's value in reviewing and cleaning up the LLM-generated code once it starts looking sloppy (for example, lots of repetitive if-then-elses). Letting it build patches and duct tape on top of sloppy original code leads to a combinatorial explosion of tokens. I guess this isn't really "vibe" coding per se.
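The log-redirection tip can be sketched in a couple of shell lines. This is a toy stand-in (the `run.log` contents here fake a noisy script run), assuming a POSIX shell with `printf` and `grep`:

```shell
# Stand-in for a noisy script: send its output to disk, not into the context.
printf 'INFO start\nERROR bad thing\nINFO done\n' > run.log

# The agent then greps for what matters instead of reading the whole log.
grep -n ERROR run.log
```

A real run would look more like `python3 script.py > run.log 2>&1`, with the agent grepping `run.log` afterwards; only the matching lines ever enter the context.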

[1] https://arxiv.org/html/2602.06319v1

noosphr 4 hours ago | parent [-]

Yes, I agree with all of that.

The way I let my agents interact with my code bases is through a '70s-BSD-Unix-style interface (ed, grep, ctags, etc.), using Emacs as the control plane.

It is surprisingly sparing on tokens, which makes sense, since those tools were designed to work over a teletype.
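A sketch of that grep-first, token-sparing workflow; `hello.py` is a made-up stand-in file, and the `grep`/`sed` calls play the role of the ctags index and ed's line addressing:

```shell
# Create a stand-in source file.
printf 'def greet(name):\n    return "hi " + name\n\ndef other():\n    pass\n' > hello.py

# An index of definitions costs a few tokens per symbol, like a ctags table.
grep -n '^def ' hello.py

# Then read only the lines around the match, ed-style, not the whole file.
sed -n '1,2p' hello.py
```

Each step returns a handful of lines, so the agent's context grows by tens of tokens per query instead of the whole file.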

Worth noting: by the time you start refactoring, the agents are basically a smarter Google with long-form autocomplete.

All my code bases use that pattern and I'm the ultimate authority on what gets added or removed. My token spend is 10% to 1% of what the average in the team is and I'm the only one who knows what's happening under the hood.

Krssst 5 hours ago | parent | prev | next [-]

Determinism is a smaller point than existence of a spec IMHO. A library has a specification one can rely on to understand what it does and how it will behave.

An LLM does not.

blackqueeriroh 3 hours ago | parent | prev [-]

Libraries are not deterministic. CPUs aren’t deterministic. There are margins of error among all things.

The fact that people who claim to be software developers (let alone “engineers”) say this thing as if it is a fundamental truism is one of the most maladaptive examples of motivated reasoning I have ever had the misfortune of coming across.