I’m honestly baffled by this. I don’t want to tell you “you’re holding it wrong” but if this is your normal experience there’s something weird happening.

Friday afternoon I made a new directory and told Claude Code I wanted to make a Go proxy so I could have a request/callback HTTP API for a 3rd party service whose official API is only persistent websocket connections. I had it read the service’s API docs, engage in some back and forth to establish the architecture and library choices, and save out a phased implementation plan in plan mode. It implemented it in four phases with passing tests for each, then did live tests against the service in which it debugged its protocol mistakes using curl. Finally I had it do two rounds of code review with fresh context, and it fixed a race condition and made a few things cleaner. Total time, two hours.

I have noticed some people I work with have more trouble, and my vague intuition is it happens when they give Claude too much autonomy. It works better when you tell it what to do, rather than letting it decide. That can be at a pretty high level, though. Basically reduce the problem to a set of well-established subproblems that it’s familiar with. Same as you’d do with a junior developer, really.


thwarted 5 hours ago | parent | next [-]

> it happens when they give Claude too much autonomy. It works better when you tell it what to do, rather than letting it decide. That can be at a pretty high level, though. Basically reduce the problem to a set of well-established subproblems that it’s familiar with. Same as you’d do with a junior developer, really.

Equating "junior developers" and "coding LLMs" is pretty lame. You handhold a junior developer so that, eventually, you don't have to handhold anymore. The junior developer is expected to learn enough, and be trusted enough, to operate more autonomously. "Junior developers" don't exist solely to do your bidding. It may be valuable to recognize similarities between a first junior developer interaction and a first LLM interaction, but when every LLM interaction requires handholding, the value of the iterative nature of having a junior developer work alongside you is not at all equivalent.

wrs 4 hours ago | parent [-]

I didn’t say they are equivalent, nor do I in any way consider them equivalent. One is a tool, the other is a person. I simply said the description of the problem should be broken down similar to the way you’d do it for a junior developer. As opposed to the way you’d express the problem to a more senior developer who can be trusted to figure out the right way to do it at a higher level.


maccard 5 hours ago | parent | prev | next [-]

> I have noticed some people I work with have more trouble, and my vague intuition is it happens when they give Claude too much autonomy

What’s giving too much autonomy about

“Please load settings.toml using a library and print out the name key from the application table”? Even if it’s underspecified, surely it should at least leave it _compiling_?

I’ve been posting comments like this here monthly; my experience has been consistently like this with Claude, opencode, antigravity, cursor, and the gpt/opus/sonnet/gemini models (latest at time of testing). This morning it was opus 4.6.


linsomniac 4 hours ago | parent | next [-]

> Even if it’s underspecified, surely it should at least leave it _compiling_?

Are you using Claude Code? Do you have it configured so that you are not allowing it to run the build? Because I've observed that Claude Code is extremely good at making sure the code compiles, because it'll run a compile and address any compile errors as part of the work.

I just asked it to build a TOML example program in DotNet using Tomlyn, and when it was done I was able to run "./bin/Debug/net8.0/dotnettoml example.toml", it had already built it for me (I watched it run the build step as part of its work, as I mentioned it would do above).


maccard 3 hours ago | parent [-]

I am using Claude Code. I didn’t explicitly tell it what the build command was (it’s dotnet build), and it didn’t ask. That’s not my fault.

> I’ve observed Claude code is extremely good at making sure the code compiles

My observation is that it’s fine until it’s absolutely not, and the agentic loop fails.

linsomniac 2 hours ago | parent [-]

> That's not my fault.

I don't know that it's useful to assign blame here. It probably is to your benefit, if you are a coding professional, to understand why your results are so drastically different from what others are seeing.

You started this thread saying "I keep getting told I'll be amazed at what it can do, but the tools keep failing at the first hurdle." I'm telling you that something is wrong, and that is why you are getting poor results. I don't know what is wrong, but I've given you an example prompt and an example output showing that Claude Code is able to produce the exact output you were looking for. This is why a lot of people are saying "you'll be amazed at what it can do", and it points to you having some issue.

I don't know if you are running an ancient version of Claude Code, if you are not using Opus 4.6, or if you are not using "high" effort (those are what I'm using to get the results I posted elsewhere in reply to your comment), but something is definitely wrong. Some of what may be wrong is that you don't have enough experience with the tooling, which I'd understand if you are getting poor results; you have little (immediate) incentive to get more proficient.

As I said, I was able to tell Claude Code to do something like the example you gave, and it did it and it built, without me asking, and produced a working program on the first try.


Kiro 3 hours ago | parent | prev | next [-]

Not even the worst possible prompt would explain your unusual experience, so I don't think that's it either.


wrs 4 hours ago | parent | prev [-]

There’s nothing wrong with it that I can see. Like I said, I’m a bit baffled at your experience. I will say, it’s not unusual for the initial output not to compile, but usually one short iteration later that’s fixed. Claude Code will usually even do that iteration by itself.


maccard 3 hours ago | parent [-]

> I will say, it’s not unusual for the initial output not to compile,

We’ve gone from “I’m baffled at your experience” to “well yeah, it often fails” in two sentences here…

saulpw 15 minutes ago | parent [-]

It's not unusual for my initial output (as a programmer) not to compile either. I wouldn't say I "failed" if I can then get it to compile. Which, as people are saying, is what happens with Claude Code and Opus, either automatically or at most when I say "get it to compile".


shireboy 5 hours ago | parent | prev [-]

Similar. I regularly use GitHub Copilot (with Claude models sometimes) and it works amazingly. But I see some who struggle with them. I have sort of learned to talk to it, understand what it is generating, and routinely use it to generate fixes, whole features, etc. much, much faster than I could before.