wewewedxfgdf 10 hours ago

You fundamentally misunderstand AI-assisted coding if you think it does the work for you, that it always gets it right, or that it can be trusted to complete a job on its own.

It is an assistant, not a teammate.

If you count the wrong turns, the bugs, the misunderstandings, the lost code, and the misdirections as the AI "failing", then yes, you will fail to understand or see the value.

The point is that a good AI-assisted developer steers through these things and has the skill to make great software from the chaotic ingredients that AI brings to the table.

And this is why articles like this one "just don't get it": their authors expect the AI to do the job for them and hold it to the standard of a teammate. It does not work that way.

ummonk 3 hours ago | parent | next [-]

What is the actual value of using agentic LLMs (rather than just LLM-powered autocomplete in your IDE) if they require this much supervision and handholding? When are they actually faster or more effective?

dagss 41 minutes ago | parent | next [-]

Why use a nailgun instead of a hammer, if the nailgun still requires supervision and handholding?

Example: Say I discover a problem in the SPA design that can be fixed by tuning some CSS.

Without LLM: Dig around the code until I find the right spot. If it's been some months since I was last there, this can easily cost five minutes.

With LLM: Explain what is wrong. The description can be vague ("the cancel button is too invisible, I need another solution") or specific ("1px more margin here please"). The LLM makes a best-effort fix within 30 seconds, and the diff points to exactly the right location so you can fine-tune it.

FiberBundle 3 hours ago | parent | prev [-]

The primary value accrues to the AI labs. You pay hundreds or thousands of dollars a month to help train their models. You probably do increase your productivity by saving the time it takes to type out all the code, but the feedback you give the agent after it produces mediocre or poor code is extremely valuable to these companies, because they feed it back into their reinforcement learning.

Meanwhile, while you're happy to have such a great "assistant" typing out code for you, you will at some point realize that your architectural and design skills weren't all that special in the first place. All the models lacked to be good at them was sufficient data with the right reward signals. Thankfully for the labs, software engineers are some of the most naive people in the world, and they supplied that data while actually paying for the privilege.

terabytest 5 hours ago | parent | prev | next [-]

That’s not what I meant. What I’m asking is whether there’s any evidence that the latest “techniques” (such as Ralph) can actually lead to high-quality results, both in the code and in the end product, and if so, how.

cheema33 3 hours ago | parent | next [-]

I used Ralph recently, in Claude Code. We had a complex SQL script that crunched large amounts of data and was slow to run even on tables that were normalized, indexed on the right columns, and so on. We humans had spent a significant amount of time tweaking it; we got some performance gains but eventually hit a wall. That is when I let Ralph take a stab at it. I told it to create a baseline benchmark, gave it the expected output, and told it to keep iterating on the script until there was at least a 3x performance improvement while the output remained identical. I set the iteration limit to 50, let it loose, and went to dinner. When I came back, it had found a 3x improvement and had stopped on the 20th iteration.
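
For those unfamiliar, a Ralph run is essentially just a loop that re-invokes the agent against the same prompt until a success criterion or an iteration cap is hit. A minimal Python sketch of the setup above (the script names, prompt wording, and claude invocation are illustrative, not my exact commands):

    import subprocess

    MAX_ITERATIONS = 50   # the iteration limit mentioned above
    TARGET_SPEEDUP = 3.0  # stop once the script runs 3x faster

    # Hypothetical helper scripts: benchmark.sh prints the runtime in
    # seconds; check_output.sh exits non-zero if the results differ
    # from the expected output.
    PROMPT = (
        "Optimize query.sql. Run ./benchmark.sh to measure runtime and "
        "./check_output.sh to verify results match expected_output.txt. "
        "Aim for at least a 3x speedup."
    )

    def run(cmd):
        return subprocess.run(cmd, capture_output=True, text=True)

    baseline = float(run(["./benchmark.sh"]).stdout.strip())

    for i in range(1, MAX_ITERATIONS + 1):
        run(["claude", "-p", PROMPT])  # one non-interactive agent pass
        if run(["./check_output.sh"]).returncode != 0:
            continue  # wrong results are never accepted
        runtime = float(run(["./benchmark.sh"]).stdout.strip())
        if baseline / runtime >= TARGET_SPEEDUP:
            print(f"target speedup reached on iteration {i}")
            break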

Is there another human who could get even better performance given the same parameters? Probably, yes. In the same amount of time? Maybe, but unlikely. In any case, we don't have anybody on our team who can think of 20 different ways to improve a large, complex SQL script and try them all in a short amount of time.

These tools do require two things before you can expect good results:

1. An open mind.

2. Experience. Lots of it.

BTW, I never trust the code an AI agent spits out. I have other AI agents, running different LLMs, review all the work and create deterministic tests that must run and pass before the PR is ever generated. I used to do a lot of this manually, but now I write Claude skills that automate much of it away.
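
In outline, such a pre-PR gate can look something like this (the reviewer CLIs, test command, and APPROVE/REJECT protocol here are illustrative assumptions, not the actual skill):

    import subprocess
    import sys

    # Hypothetical reviewer commands; each should be a different model.
    REVIEWERS = [
        ["claude", "-p"],   # a second Claude pass acting as reviewer
        ["gemini", "-p"],   # a different LLM's CLI (assumed installed)
    ]
    REVIEW_PROMPT = (
        "Review the current git diff for bugs, missed edge cases, and "
        "security issues. Reply with APPROVE or REJECT plus reasons."
    )

    # 1. Deterministic tests must run and pass first.
    if subprocess.run(["pytest", "-q"]).returncode != 0:
        sys.exit("tests failed: no PR")

    # 2. Every independent reviewer agent must approve the diff.
    for reviewer in REVIEWERS:
        result = subprocess.run(reviewer + [REVIEW_PROMPT],
                                capture_output=True, text=True)
        if "APPROVE" not in result.stdout:
            sys.exit(f"{reviewer[0]} rejected the diff: no PR")

    # 3. Only now is the PR generated (gh is the GitHub CLI).
    subprocess.run(["gh", "pr", "create", "--fill"])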

gbalduzzi 4 hours ago | parent | prev [-]

I don't understand what kind of evidence you expect to receive.

There are plenty of examples from talented individuals, like Antirez or Simonw, and an ocean of examples from random individuals online.

I can tell you that some tasks that would take me a day to complete are done in 2h of agentic coding plus 1h of code review, with the added benefit that during the 2h of agentic coding I can do something else. Is this the kind of evidence you are looking for?

xg15 5 hours ago | parent | prev | next [-]

"You're holding it wrong"

wewewedxfgdf 7 hours ago | parent | prev [-]

"I bought a subscription to Claude and it didn't write a perfectly coded application for me while I watched a game of baseball. AI sucks!"

tjr 7 hours ago | parent [-]

Given the claims that AI is replacing jobs left and right and that there’s no more need for software developers or computer science education, it had jolly well better be able to code a perfect application while I watch baseball.

dagss 5 hours ago | parent [-]

As long as it lets a senior engineer working alone produce as much as they would in a team with three juniors, it can lead to replacing jobs without ever producing code that doesn't need review.