wvenable 5 hours ago

I don't trust it completely but I still use it. Trust but verify.

I've had some funny conversations -- Me:"Why did you choose to do X to solve the problem?" ... It:"Oh I should totally not have done that, I'll do Y instead".

But it's far from being so unreliable that it's not useful.

meatmanek 5 hours ago

I find that if I ask an LLM to explain what its reasoning was, it comes up with some post-hoc justification that has nothing to do with what it was actually thinking. Most likely token predictor, etc etc.

As far as I understand, any reasoning tokens for previous answers are generally not kept in the context for follow-up questions, so the model can't even really introspect on its previous chain of thought.
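
To illustrate the point: many chat harnesses rebuild the message list on every turn and drop the reasoning ("thinking") content from earlier assistant messages, so a follow-up question only ever sees the final answers. This is a minimal sketch under assumed message shapes and field names (not any specific provider's API):

```python
# Hypothetical sketch: a harness stripping chain-of-thought from past
# turns before sending the next request. The "reasoning" field name and
# message format are assumptions for illustration only.

def build_context(history, keep_last_reasoning=False):
    """Return the messages to send, with reasoning removed from prior turns."""
    context = []
    for i, msg in enumerate(history):
        msg = dict(msg)  # copy so the stored history is untouched
        is_last = i == len(history) - 1
        if msg.get("role") == "assistant" and not (keep_last_reasoning and is_last):
            msg.pop("reasoning", None)  # drop the chain-of-thought tokens
        context.append(msg)
    return context

history = [
    {"role": "user", "content": "Why is the build failing?"},
    {"role": "assistant", "reasoning": "step-by-step thoughts...",
     "content": "Missing dependency X."},
    {"role": "user", "content": "Why did you choose X?"},
]

context = build_context(history)
# The follow-up is answered without access to the earlier chain of
# thought -- the model can only see its own final answer, so any
# "explanation" it gives is necessarily reconstructed.
assert "reasoning" not in context[1]
```

Whether this stripping happens at all varies by harness and inference engine, which is the caveat raised below.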

redman25 an hour ago

It depends on the harness and/or inference engine whether they keep the reasoning of past messages.

Not to get all philosophical, but maybe justification is post-hoc even for humans.

wvenable 4 hours ago

I mostly find it useful for learning myself or for questioning a strange result. It usually works well for either of those. As you said, I'm probably not getting its actual reasoning from any reasoning tokens, but I never thought that was happening anyway. It's just a way of interrogating the current situation in the current context.

The reason it produces a different result is exactly that it's now looking at the existing solution and generating from there.

sid_talks 5 hours ago

> Trust but verify.

I guess I should have used ‘completely trust’ instead of ‘trust’ in my original comment. I was referring to the subset of developers who call themselves vibe coders.

wvenable 5 hours ago

I think I like "blindly trust" better because vibe coders literally aren't looking.