eranation 6 hours ago

Ask them to tell the LLM it's wrong... then when it goes "You are absolutely right!" to challenge it and say that it was a test. Then when it replies, ask it if it's 100% sure. They'll lose faith pretty quick.

ericpauley 5 hours ago | parent | next [-]

This is an oft-repeated meme, but I’m convinced the people saying it are either blindly repeating it, using bad models/system prompts, or some other issue. Claude Opus will absolutely push back if you disagree. I routinely push back on Claude only to discover on further evaluation that the model was correct.

As a test I just did exactly what you said in a Claude Opus 4.6 session about another HN thread. Claude considered* the contradiction, evaluated additional sources, and responded backing up its original claim with more evidence.

I will add that I use a system prompt that explicitly discourages sycophancy, but this is a single sentence expression of preference and not an indication of fundamental model weakness.

* I’ll leave the anthropomorphism discussions to Searle; empirically this is the observed output.

odo1242 3 hours ago | parent | next [-]

Claude Opus 4.6 is the best possible model to use in this test, with the least sycophancy. OpenAI and Gemini models are bad in comparison.

mkozlows 3 hours ago | parent [-]

ChatGPT thinking models are very good; the instant model is bad. Gemini is always desperate to find an answer, and will give you one no matter what.

ahofmann 35 minutes ago | parent [-]

I have access to my boss's ChatGPT account and it is unusable sycophancy slop, horrible to read because every piece of information is buried under endless emojis and the like. And it's almost impossible to tell whether the LLM is wrong or right; every answer looks the same, often with a "my final answer" at the end. It's a mess.

I'm using Claude Opus 4.6 and it is much calmer, more "professional" in tone, with much more information and almost no fluff.

jazzyjackson 4 hours ago | parent | prev | next [-]

If you have 10,000 people flipping coins over and over, one person will be experiencing a streak of heads, another a streak of tails.

Which is to say, of a million people who just started playing with LLMs, a bunch will get hit-or-miss results, while one guy is winning the neural net lottery and has the experience of the AI nailing every request, and some poor bloke trying to see what all the hype is about cannot get one response that isn't fully hallucinated garbage.
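The streak argument above is easy to check numerically. A minimal sketch (the population size and flip count are illustrative, not from the comment): simulate 10,000 people each flipping a fair coin 20 times and compare the longest streak anyone sees against the typical person's longest streak.

```python
import random

def longest_streak(flips):
    """Length of the longest run of identical consecutive outcomes."""
    best = run = 1
    for prev, cur in zip(flips, flips[1:]):
        run = run + 1 if cur == prev else 1
        best = max(best, run)
    return best

random.seed(0)  # fixed seed so the sketch is reproducible
n_people, n_flips = 10_000, 20

streaks = [
    longest_streak([random.random() < 0.5 for _ in range(n_flips)])
    for _ in range(n_people)
]

# The typical person sees a modest streak, but across 10,000 people
# somebody almost surely hits a streak far longer than the median.
median = sorted(streaks)[n_people // 2]
print("median longest streak:", median)
print("best streak in the crowd:", max(streaks))
```

The gap between the median and the maximum is the point: individual experiences at the tails look nothing like the typical experience, even though everyone is sampling the same process.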

ericpauley 4 hours ago | parent [-]

Sure, but that doesn’t explain the volume of these complaints. I think the more likely answer is the pitiful sycophancy of some models as demonstrated in BSBench.

basilikum 5 hours ago | parent | prev | next [-]

Can you share your system prompt?

reverius42 4 hours ago | parent [-]

I'm seeing the described behavior with whatever the default system prompt is in Claude Code.

dumpsterdiver 5 hours ago | parent | prev [-]

[dead]

beeflet 4 hours ago | parent | prev [-]

I tried to fool Claude Sonnet with confidence and it failed.

https://claude.ai/share/47145af0-47d1-451b-813c-131ec48e7215

Maybe it is possible with a more complex or subjective question.

five_ 38 minutes ago | parent [-]

This was a genuine pleasure to read. Thank you.