evrimoztamur a day ago

Sounds like LLMs short-circuit without necessarily testing their context assumptions.

I also recognize this from whenever I ask a question in a field I'm semi-comfortable in: I phrase the question in a way that already includes my expected answer. As I probe further, I often find that the model took my implied answer for granted and constructed an explanation for it after the fact.

I think this also explains a common issue with LLMs where people get the answer they're looking for, regardless of whether it's true or whether chain-of-thought is in place.

BurningFrog a day ago | parent | next [-]

LLMs are trained to copy human-written text, so maybe they implement motivated reasoning just like humans do?

Or maybe they're telling people what they want to hear, just like humans do.

ben_w a day ago | parent [-]

They definitely tell people what they want to hear. Even when we'd rather they be correct, they get upvoted or downvoted by users, so this isn't avoidable (but is it fawning or sycophancy?)

I wonder how deep or shallow the mimicry of human output is — enough to be interesting, but definitely not quite like us.

andrewmcwatters a day ago | parent | prev | next [-]

This is such an annoying issue in assisted programming as well.

Say you’re referencing a specification the LLM has been trained on: you cite two or three specific values from it and mention that you need a comprehensive list.

I’ll often find that all the popular models will only use the examples I’ve mentioned and fail to enumerate even a few more.

You might as well read specifications yourself.

It’s a critical capability that should be an easy win for these models. It’s autocomplete! It’s simple. And they fail at it every single time I’ve tried a task like this.

I laugh any time people talk about these models actually replacing people.

They fail at reading prompts at a grade school reading level.

jiveturkey a day ago | parent | prev [-]

i found with the gemini answer box on google, it's quite easy to get the answer you expect. i find myself just playing with it, asking a question in the positive sense then the negative sense, to get the 2 different "confirmations" from gemini. also it's easily fooled by changing the magnitude of a numerical aspect of a question, like "are thousands of people ..." then "are millions of people ...". and then you have the now infamous black/white people phrasing of a question.
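that kind of probe can be scripted. below is a minimal sketch of the idea: expand one question into positive/negative phrasings and different magnitudes, then send each variant to whatever model you're testing and compare the answers. `make_probe_set` and the question template are made up for illustration; the actual model call is left as a placeholder since it depends on your API.

```python
# Sketch of a consistency probe: one question, several phrasing and
# magnitude variants. A model that just confirms the framing will
# agree with contradictory variants; consistent answers across the
# set are a (weak) sign it isn't simply echoing the question.

def make_probe_set(template: str, magnitudes: list[str]) -> list[str]:
    """Expand one question template into phrasing/magnitude variants."""
    prompts = []
    for m in magnitudes:
        q = template.format(count=m)
        prompts.append(q)                               # positive phrasing
        prompts.append(q.replace("Are", "Aren't", 1))   # negated phrasing
    return prompts

probes = make_probe_set("Are {count} of people affected by X?",
                        ["thousands", "millions"])
for p in probes:
    print(p)
    # answer = ask_model(p)  # placeholder: send to the model under test
```

if the model answers "yes" to both "Are thousands..." and "Aren't thousands...", or to both the thousands and millions variants, you've caught it confirming the framing rather than the facts.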

i haven't found perplexity to be so easily nudged.