sid_talks 6 hours ago

I’m still surprised so many developers trust LLMs for their daily work, considering their obvious unreliability.

vidarh 5 hours ago | parent | prev | next [-]

I've spent 30 years seeing the junk many human developers deliver, so I've had 30 years to figure out how we build systems around teams to make broken output coalesce into something reliable.

A lot of people just don't realise how bad the output of the average developer is, nor how many teams successfully ship with developers below average.

To me, that's a large part of why I'm happy to use LLMs extensively. Some things need smart developers. A whole lot of things can be solved with ceremony and guardrails around developers who'd struggle to reliably solve fizzbuzz without help.

reconnecting 5 hours ago | parent | next [-]

Did you also notice the evolution of average developers over time? I mean, if you take code from a developer ten years ago and compare it with their output now, you can see improvement.

I assume that over time, the output improves because of the effort and time the developer invests in themselves. However, LLMs might reduce that effort to zero — we just don't know how developers will look after ten years of using LLMs now.

Still, if you have 30 years of experience in the industry, you should be able to imagine what the real output might be.

vidarh 4 hours ago | parent | next [-]

> Did you also notice the evolution of average developers over time? I mean, if you take code from a developer ten years ago and compare it with their output now, you can see improvement.

This makes little sense to me. Yes, individual developers get better. I've seen little to no evidence that the average developer has gotten better.

> However, LLMs might reduce that effort to zero — we just don't know how developers will look after ten years of using LLMs now.

It might reduce that effort to zero from the same people who have always invested the bare minimum of effort to hold down a job. Most of them don't advance today either, and most of them will deliver vastly better results if they lean heavily on LLMs. On the high end, what I see experienced developers do with LLMs involves a whole lot of learning, and will continue to involve a whole lot of learning for many years, just like with any other tool.

reconnecting 4 hours ago | parent [-]

After 30 years in front of the desktop, we are processing dopamine differently.

When I speak about 10 years from now, I’m referring to who will become an average developer if we replace the real coding experience learning curve with LLMs from day one.

I also hear a lot of tool analogies — tractors for developers, etc. But every tool, without exception, produces replicable results. With LLMs, however, repeatable results are highly questionable, so it seems premature to me to treat LLMs the same way as any other tool.

Terr_ 3 hours ago | parent [-]

Right, I've seen a lot of facile comparisons to calculators.

It may be true that a cohort of teachers were wrong (on more than one level) when they chastised students with "you need to learn this because you won't always have a calculator"... However, calculators have some essential qualities which LLMs don't, and if calculators lacked those qualities we wouldn't be using them the way we do.

In particular, being able to trust (and verify) that it'll do a well-defined, predictable, and repeatable task that can be wrapped into a strong abstraction.
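That distinction can be sketched in a few lines (the functions below are hypothetical stand-ins, not any real calculator or LLM API): a calculator-like function passes a trivial repeatability check, while a sampling-based generator generally does not.

```python
import random

def calculator_add(a, b):
    # Deterministic: the same inputs always yield the same output,
    # so the function can be wrapped in a strong abstraction.
    return a + b

def llm_like_answer(prompt):
    # Stand-in for sampled LLM output: the same prompt can produce
    # a different completion on every call.
    return random.choice(["42", "forty-two", "about 42"])

def is_repeatable(fn, *args, trials=10):
    # A task is calculator-like if repeated calls agree.
    first = fn(*args)
    return all(fn(*args) == first for _ in range(trials))

print(is_repeatable(calculator_add, 2, 2))    # True
print(is_repeatable(llm_like_answer, "2+2"))  # usually False
```

The repeatability check is exactly the property a strong abstraction relies on; you can build on `calculator_add` without re-verifying each call.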

znort_ an hour ago | parent | prev [-]

> if you take code from a developer ten years ago and compare it with their output now, you can see improvement.

really? it depends on the type of development, but ten years ago the coder profession had already long gone mainstream and massified, with a lot of people attracted by a convenient career rather than a vocation. mediocrity was already the baseline (the "agile" mentality, there at the very least to cope with that mediocrity and turnover churn, was already at its peak) and at the other extreme coder narcissism was already en vogue.

the tools, resources and environments have undoubtedly improved a lot, though at the cost of overhead and overcomplexity. higher abstraction levels help but promote detachment from the fundamentals.

so specific areas and high end teams have probably improved, but i'd say average code quality has actually diminished, and keeps doing so. if it weren't for qa, monitoring, auditing and mitigation processes it would by now be catastrophic. cue in agents and vibe coding ...

as an old school coder who nowadays only codes for fun, i see llm tools as an incredibly interesting and game-changing tool for the uninitiated, but that a professional coder might cede control to an agent (as opposed to using it for exploration or menial work) already makes me cringe, and i'm unable to wrap my head around vibe coding.

dullcrisp 2 hours ago | parent | prev [-]

I’m sorry.

kelnos 5 hours ago | parent | prev | next [-]

You don't have to trust it. You can review its output. Sure, that takes more effort than vibe coding, but it can very often be significantly less effort than writing the code yourself.

Also consider that "writing code" is only one thing you can do with it. I use it to help me track down bugs, plan features, verify algorithms that I've written, etc.

hungryhobbit 5 hours ago | parent | prev | next [-]

Spoken like a true technophobe.

"There's this incredible new technology that's enabling programmers around the world to be far more productive ... but it screws up 1% of the time, so instead of understanding how to deal with that, I'm going to be violently against the new tech!"

(I really don't get the whole programmer hatred of AI thing. It's not a person stealing your job, it's just another tool! Avoiding it is like avoiding compilers, or linters, or any other tool that makes you more productive.)

shitloadofbooks 4 hours ago | parent | next [-]

I certainly wouldn't use a compiler that "screws up" 1% of the time; that's the worst possible rate: frequent enough that everything I use it for will have major issues, yet rare enough that the errors are laborious to find amongst the 99% of correct output, so I might as well not use it in the first place.

Which is, ironically, exactly the case those of us who don't find LLM-assisted coding "worth it" make.

redman25 an hour ago | parent [-]

How about a human coworker who screws up 1% of the time? Doesn’t sound so bad in that light. It’s the nature of being human.

Good code review is the solution but if it’s faster to do it yourself, that’s fine too.

bigfishrunning 4 hours ago | parent | prev | next [-]

If they only screwed up 1% of the time, they'd be as good as the LinkedIn hype men want you to believe. They're far, far worse than that in reality.

komali2 an hour ago | parent | prev | next [-]

> enabling programmers around the world to be far more productive

I know a lot of us feel this way, but why isn't there more evidence of it than our feelings? Where's the explosion of FOSS projects and businesses? And why do studies keep coming out showing decreased productivity? Why aren't there oodles of studies showing increases of productivity?

I like kicking back and letting claude do my job but I've yet to see evidence of this increased productivity. Objectively speaking, "I" seem to be "writing" the same amount of code as I was before, just with less cognitive effort.

b00ty4breakfast 4 hours ago | parent | prev | next [-]

Not questioning the cost of adopting new tech is foolish; it boggles my mind that so many nominally intelligent people just close their eyes and take a bite without wondering whether that's really fudge on their sundae or something fecal.

Pure ideology, as a certain sniffing Slav would say.

krapp 5 hours ago | parent | prev [-]

LLMs screw up far more than 1% of the time. They screw up routinely, far more than a professionally trained human would, and in ways that would have said human declared mentally ill.

wvenable 5 hours ago | parent | prev | next [-]

I don't trust it completely but I still use it. Trust but verify.

I've had some funny conversations -- Me:"Why did you choose to do X to solve the problem?" ... It:"Oh I should totally not have done that, I'll do Y instead".

But it's far from being so unreliable that it's not useful.

meatmanek 5 hours ago | parent | next [-]

I find that if I ask an LLM to explain what its reasoning was, it comes up with some post-hoc justification that has nothing to do with what it was actually thinking. Most likely token predictor, etc etc.

As far as I understand, any reasoning tokens for previous answers are generally not kept in the context for follow-up questions, so the model can't even really introspect on its previous chain of thought.
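A hedged sketch of what that context handling looks like in a typical chat harness (hypothetical message schema, not any specific provider's API): reasoning text from an earlier turn is stripped before the next request is built, so a follow-up question about "why" is answered without access to the original chain of thought.

```python
def build_next_request(history, new_user_message):
    # Hypothetical harness step: forward prior turns, but drop any
    # "reasoning" field so earlier chain-of-thought tokens never
    # re-enter the context window.
    context = [
        {"role": m["role"], "content": m["content"]}
        for m in history
    ]
    context.append({"role": "user", "content": new_user_message})
    return context

history = [
    {"role": "user", "content": "Why did you choose X here?"},
    {"role": "assistant", "content": "I chose X because ...",
     "reasoning": "(reasoning tokens, discarded after the turn)"},
]

request = build_next_request(history, "Explain your reasoning.")
# No message in `request` carries a "reasoning" key, so whatever
# explanation comes back is reconstructed post hoc.
assert all("reasoning" not in m for m in request)
```

Whether real harnesses keep or drop prior reasoning varies by implementation, as the reply below this comment notes; this sketch shows only the drop-it case.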

redman25 an hour ago | parent | next [-]

It depends on the harness and/or inference engine whether they keep the reasoning of past messages.

Not to get all philosophical but maybe justification is post-hoc even for humans.

wvenable 4 hours ago | parent | prev [-]

I mostly find it useful for learning myself or for questioning a strange result. It usually works well for either of those. As you said, I'm probably not getting its actual reasoning from any reasoning tokens, but I never thought that was happening anyway. It's just a way of interrogating the current situation in the current context.

Its providing a different result is precisely because it's now looking at the existing solution and generating from there.

sid_talks 5 hours ago | parent | prev [-]

> Trust but verify.

I guess I should have used ‘completely trust’ instead of ‘trust’ in my original comment. I was referring to the subset of developers who call themselves vibe coders.

wvenable 5 hours ago | parent [-]

I think I like "blindly trust" better because vibe coders literally aren't looking.

diehunde 3 hours ago | parent | prev | next [-]

Many of us are literally being forced to use it at work by people who haven't written a line of code in years (VPs, directors, etc.), who decided to play around with it over a weekend and had their minds blown.

bdangubic 5 hours ago | parent | prev | next [-]

we worked with humans for decades and are used to 25x less reliability

behehebd 5 hours ago | parent | prev [-]

OP isn't holding it right.

How would you trust autocomplete if it can get things wrong? A: you don't. Verify!