programd 6 days ago

> [LLMs] spit out the most likely text to follow some other text based on probability.

Modern coding AI models are not just probability-crunching transformers. They haven't been just that for some time. In current coding models the transformer is just one part of what is really an expert system. The complete package includes things like highly curated training data, specialized tokenizers, pre- and post-training regimens, guardrails, optimized system prompts, etc., all tuned to coding. Put it all together and you get one-shot performance on generating the type of code that was unthinkable even a year ago.

The point is that the entire expert system is getting better at a rapid pace and the probability bit is just one part of it. The complexity frontier for code generation keeps moving and there's still a lot of low hanging fruit to be had in pushing it forward.

> They're great for writing boilerplate that has been written a million times with different variations

That's >90% of all code in the wild. Probably more. We have three quarters of a century of code in our history, so there is very little that's original anymore. Maybe original to the human coder fresh out of school, but the models have all this history to draw upon. So if the models produce the boilerplate reliably, then human toil in writing if/then statements is at an end. Kind of like how - barring the occasional mad genius [0] - the vast majority of coders don't write assembly to create a website anymore.

[0] https://asm32.info/index.cgi?page=content/0_MiniMagAsm/index...

motorest 6 days ago | parent | next [-]

> Modern coding AI models are not just probability crunching transformers. (...) The complete package includes things like highly curated training data, specialized tokenizers, pre and post training regimens, guardrails, optimized system prompts etc, all tuned to coding.

It seems you were not aware that you ended up describing probabilistic coding transformers. Each and every one of those details is nothing more than a strategy to apply constraints to the probability distributions used by the probability-crunching transformers. I mean, read what you wrote: what do you think "curated training data" means?

> Put it all together and you get one shot performance on generating the type of code that was unthinkable even a year ago.

This bit here says absolutely nothing.

leptons 6 days ago | parent | prev | next [-]

>The complete package includes things like highly curated training data, specialized tokenizers, pre and post training regimens, guardrails, optimized system prompts etc, all tuned to coding.

And even with all that, they still produce garbage way too often. If we continue the "car" analogy, the car would crash randomly sometimes when you leave the driveway, and sometimes it would just drive into the house. So you add all kinds of fancy bumpers to the car and guard rails to the roads, and the car still runs off the road way too often.

mgaunard 6 days ago | parent | prev | next [-]

Except we should aim to reduce the boilerplate through good design, instead of creating more of it on an industrial scale.

patrickmay 6 days ago | parent | next [-]

I regret that I have but one upvote to give to this comment.

Every time someone says "LLMs are good at boilerplate" my immediate response is "Why haven't you abstracted away the boilerplate?"
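A toy sketch of what "abstracting away the boilerplate" can look like in practice (the function names and the validation rule here are invented for illustration): the same argument-checking preamble, otherwise copy-pasted into every handler, collapses into one decorator.

```python
import functools

def validated(fn):
    """Apply the shared None-check once, instead of repeating it per function."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        if any(a is None for a in args) or any(v is None for v in kwargs.values()):
            raise ValueError(f"{fn.__name__}: refusing None argument")
        return fn(*args, **kwargs)
    return wrapper

@validated
def create_user(name, email):
    return {"name": name, "email": email}

@validated
def create_order(user_id, items):
    return {"user_id": user_id, "items": items}
```

Once the check lives in one place, there is no boilerplate left for an LLM to churn out, and changing the rule is a one-line edit instead of a codebase-wide hunt.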

exe34 6 days ago | parent | prev [-]

what we should and what we are forced to do are very different things. if I can get a machine to do the stuff I hate dealing with, I'll take it every time.

mgaunard 6 days ago | parent | next [-]

who's going to be held accountable when the boilerplate fails? the AI?

danielbln 6 days ago | parent | next [-]

The buck stops with the engineer, always. AI or no AI.

mgaunard 5 days ago | parent [-]

I've seen juniors send AI code for review; when I comment on weird things in it, the answer is just "I don't know, the AI did that".

danielbln 5 days ago | parent [-]

Oh, me too. And I reject them the same as if they had copied code from Stack Overflow that they can't explain.

exe34 6 days ago | parent | prev [-]

no, I'm testing it the same way I test my own code!

oneneptune 6 days ago | parent [-]

yolo merging into prod on a friday afternoon?

skydhash 6 days ago | parent | prev [-]

It's like the xkcd on automation

https://xkcd.com/1205/

After a while, it just makes sense to redesign the boilerplate and build some abstraction instead. Duplicated logic and data are hard to change and fix. The frustration is a clear signal to take a step back and take a holistic view of the system.
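A minimal, hypothetical illustration of the "duplicated data is hard to change" point (the retry settings and function names are made up): literals copy-pasted into several call sites become one shared definition.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RetryPolicy:
    """Single source of truth: change it here and every caller follows."""
    attempts: int = 3
    backoff_seconds: float = 0.5

DEFAULT_RETRY = RetryPolicy()

def fetch_profile(policy: RetryPolicy = DEFAULT_RETRY) -> str:
    # Before the refactor, each function hard-coded its own attempts/backoff.
    return f"{policy.attempts} attempts, {policy.backoff_seconds}s backoff"

def fetch_orders(policy: RetryPolicy = DEFAULT_RETRY) -> str:
    return f"{policy.attempts} attempts, {policy.backoff_seconds}s backoff"
```

With the duplicated numbers gone, a policy change is one edit rather than a grep-and-hope across the codebase.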

gibbitz a day ago | parent [-]

And this is a great example of something I rarely see LLMs doing. I think we're approaching a point where we will use LLMs to manage code the way we use React to manage the DOM. You need an update to a feature? The LLM will just recode it wholesale. All of the problems we have in software development will dissolve in mountains of disposable code. I could see enterprise systems being replaced hourly for security reasons. Less chance of abusing a vulnerability if it only exists for an hour to find and exploit. Since the popularity of LLMs proves that as a society we've stopped caring about quality, I have a hard time seeing any other future.

Night_Thastus 6 days ago | parent | prev [-]

>In current coding models the transformer bit is just one part of what is really an expert system. The complete package includes things like highly curated training data, specialized tokenizers, pre and post training regimens, guardrails, optimized system prompts etc, all tuned to coding. Put it all together and you get one shot performance on generating the type of code that was unthinkable even a year ago.

This is lipstick on a pig. All those methods are impressive, but ultimately workarounds for an idea that is fundamentally unsuitable for programming.

>That's >90% of all code in the wild. Probably more.

Maybe, but not 90% of time spent on programming. Boilerplate is easy. It's the 20%/80% rule in action.

I don't deny these tools can be useful and save time - but they can't be left to their own devices. They need to be tightly controlled and given narrow scopes, with heavy oversight by an SME who knows what the code is supposed to be doing. "Design W module with X interface designed to do Y in Z way", keeping it as small as possible and reviewing it to hell and back. And keeping it accountable by making tests yourself. Never let it test itself, it simply cannot be trusted to do so.
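One sketch of "making the tests yourself": `slugify` below stands in for model-generated code (the function and its behavior are invented for this example); the assertions are written by the human from the spec, independently of any tests the model might propose for itself.

```python
def slugify(title: str) -> str:
    """Pretend this body came back from the model."""
    return "-".join(title.lower().split())

# Human-authored expectations, derived from the spec, not from the code:
assert slugify("Hello World") == "hello-world"
assert slugify("  Spaced   Out ") == "spaced-out"
assert slugify("already-slugged") == "already-slugged"
```

The point is the direction of trust: the tests encode what the SME asked for, so a plausible-looking but wrong implementation fails loudly instead of grading its own homework.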

LLMs are incredibly good at writing something that looks reasonable, but is complete nonsense. That's horrible from a code maintenance perspective.