sgillen 5 hours ago

To be fair to the agent...

I think there is some behind-the-scenes prompting from Claude Code (or OpenCode, whichever is being used here) for plan vs. build mode; you can even see the agent reference it in its thought trace. Basically, I think the system is saying "if in plan mode, continue planning and asking questions; when in build mode, start implementing the plan," and it looks to me(?) like the user switched from plan to build mode and then sent "no".

From our perspective it's very funny; from the agent's perspective, maybe it's confusing. To me this seems more like a harness problem than a model problem.
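A minimal sketch of what that mode-dependent harness prompting might look like (all names here are hypothetical; this is not actual Claude Code or OpenCode source):

```python
# Illustrative sketch of a harness that injects mode-dependent system
# instructions into the request it builds for the model. The prompts and
# function names are made up for illustration.

PLAN_INSTRUCTIONS = (
    "You are in plan mode. Continue refining the plan and asking "
    "clarifying questions. Do not modify any files."
)
BUILD_INSTRUCTIONS = (
    "You are in build mode. Begin implementing the approved plan."
)

def build_request(mode: str, history: list[dict], user_message: str) -> list[dict]:
    """Prepend the current mode's system prompt to the conversation."""
    system = PLAN_INSTRUCTIONS if mode == "plan" else BUILD_INSTRUCTIONS
    return [
        {"role": "system", "content": system},
        *history,
        {"role": "user", "content": user_message},
    ]

# If the user flips the mode to "build" and then sends "no", the model sees
# a system message telling it to implement alongside a user message telling
# it not to: exactly the mixed signal described above.
request = build_request("build", [], "no")
```

Under that kind of setup, the "start implementing" instruction and the user's "no" arrive in the same context, which is why the blame arguably lands on the harness.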

christoff12 5 hours ago | parent | next [-]

Asking a yes/no question implies the ability to handle either choice.

not_kurt_godel 5 hours ago | parent | next [-]

This is a perfect example of why I'm not in any rush to do things agentically. Double-checking LLM-generated code is fraught enough one step at a time, but it's usually close enough that it can be course-corrected with light supervision. That calculus changes entirely when the automated version of the supervision fails catastrophically a non-trivial percent of the time.

Joker_vD 5 hours ago | parent | prev | next [-]

Not when you're talking with humans, not really. Which is one of the reasons I got into computing in the first place, dangit!

efitz 5 hours ago | parent | prev | next [-]

To an LLM, answering “no” and changing the mode of the chat window are discrete events that are not necessarily related.

Many coding agents interpret mode changes as expressions of intent; Cline, for example, does not even ask: its only approval workflow is switching from plan mode to execute mode.

So while this is definitely both humorous and annoying, and potentially hazardous based on your workflow, I don’t completely blame the agent because from its point of view, the user gave it mixed signals.

hananova 4 hours ago | parent [-]

Yeah, but why should I care? That's not how consent works. A million yeses and a single no still evaluates to a hard no.

Lerc 5 hours ago | parent | prev | next [-]

But I think if you sit down and really consider the implications of it and what yes or no actually means in reality, or even an overabundance of caution causing extraneous information to confuse the issue enough that you don't realise that this sentence is completely irrelevant to the problem at hand and could be inserted by a third party, yet the AI is the only one to see it. I agree.

wongarsu 5 hours ago | parent | prev [-]

It's meant as a "yes"/"instead, do ..." question. When it presents the multiple-choice UI at that point, it should be the version where you either confirm (with/without auto-edit, with/without context clear) or give feedback on the plan. Just telling it "no" doesn't give the model anything actionable to do.

keerthiko 5 hours ago | parent [-]

It can terminate the current plan where it's at until given a new prompt, or move to the next item on its todo list /shrug

Waterluvian 4 hours ago | parent | prev | next [-]

If we’re in a shoot-first-and-ask-questions-later kind of mood and we’re just mowing down zombies (the slow kind), and for whatever reason you point to one and ask if you should shoot it… and I say no… you don’t shoot it!

clbrmbr 4 hours ago | parent | prev | next [-]

This. The models struggle with differentiating tool responses from user messages.

The trouble is these are language models with only a veneer of RL that gives them awareness of the user turn. They have very little pretraining on this idea of being in the head of a computer with different people and systems talking to you at once; there's more that needs to go on than eliciting a pre-learned persona.
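One way to picture the point above: user turns, tool results, and system text all get serialized into a single flat token stream before the model is conditioned on it. A hypothetical illustration (the role tags and format are made up; real chat templates differ by model):

```python
# Hypothetical illustration: a multi-party conversation flattened into the
# single string a language model actually sees. Nothing structural forces
# the model to privilege the user's "no" over the surrounding text.

def serialize(messages: list[dict]) -> str:
    """Flatten system, tool, and user messages into one token stream."""
    return "\n".join(f"<|{m['role']}|> {m['content']}" for m in messages)

stream = serialize([
    {"role": "system", "content": "You are in build mode; implement the plan."},
    {"role": "tool",   "content": "mode changed: plan -> build"},
    {"role": "user",   "content": "no"},
])
# The "no" is just a few more tokens at the end of the stream, sitting next
# to a tool event and a system instruction that both say "go".
```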

reconnecting 5 hours ago | parent | prev | next [-]

There is a link to the full session below.

https://news.ycombinator.com/item?id=47357042#47357656

bensyverson 5 hours ago | parent [-]

Do we know if thinking was on high effort? I've found it sometimes overthinks on high, so I tend to run on medium.

breton 5 hours ago | parent [-]

it was on "max"

stefan_ 5 hours ago | parent | prev | next [-]

This is probably just OpenCode nonsense. After prompting in "plan mode", the models will frequently ask you if you want to implement that; then, if you don't switch into "build mode", it will waste five minutes trying but failing to "build", with equally nonsensical behavior.

Honestly OpenCode is such a disappointment. Like their bewildering choice to enable random formatters by default; you couldn't come up with a better plan to sabotage models and send them into "I need to figure out what my change is to commit" brainrot loops.

BosunoB 5 hours ago | parent | prev [-]

The whole idea of just sending "no" to an LLM without additional context is kind of silly. It's smart enough to know that if you just didn't want it to proceed, you would just not respond to it.

The fact that you responded to it tells it that it should do something, and so it looks for additional context (for the build mode change) to decide what to do.

furyofantares 4 hours ago | parent | next [-]

I agree the idea of just sending "no" to an LLM without any task for it to do is silly. It doesn't need to know that I don't want it to implement it; it's not waiting for an answer.

It's not smart enough to know you would just not respond to it, not even close. It's been trained to do tasks in response to prompts, not to just say "k, cool", which is probably the cause of this (egregious) error.

ForHackernews 5 hours ago | parent | prev [-]

> It's smart enough to know that if you just didn't want it to proceed, you would just not respond to it.

No it absolutely is not. It doesn't "know" anything when it's not responding to a prompt. It's not consciously sitting there waiting for you to reply.

BosunoB 5 hours ago | parent [-]

I didn't mean to imply that it was. But when you reply to it, if you just say "no" then it's aware that you could've just not responded, and that normally you would never respond to it unless you were asking for something more.

It just doesn't make any sense to respond "no" in this situation, so it confuses the LLM and it goes looking for more context.

alpaca128 4 hours ago | parent [-]

> it's aware that you could've just not responded

It's not aware of anything and doesn't know that a world outside the context window exists.

BosunoB 4 hours ago | parent [-]

No, it has knowledge of what it is and how it is used.

I'm guessing you and the other guy are taking issue with the words "aware of" when I'm just saying it has knowledge of these things. Awareness doesn't have to imply a continual conscious state.

saint_yossarian an hour ago | parent [-]

I think to many people awareness does imply consciousness, i.e. the thing that is aware of the knowledge.

BosunoB 28 minutes ago | parent [-]

Meh I looked up the definition:

"having knowledge or perception of a situation or fact."

They do have knowledge of the info, but they don't have perception of it.