GeoAtreides a day ago
>which is something OP can manually fix

What if the LLM gets something wrong that the operator (a junior dev, perhaps) doesn't even know is wrong? That's the main issue: if it fails here, it will fail with other things, in not such obvious ways.
godelski a day ago
I think that's the main problem with them: it is hard to figure out when they're wrong. As the post shows, you can't trust them when they think they've solved something, but you also can't trust them when they think they haven't [0].

These things are optimized for human preference, which ultimately results in them being optimized to hide mistakes. After all, we can't penalize mistakes in training when we don't know the mistakes are mistakes. The de facto bias is that we prefer mistakes we don't know are mistakes over mistakes we do [1].

Personally, I think a well-designed tool makes errors obvious. As a tool user, that's what I want, and it's what makes tool use effective. But LLMs flip this on its head, making errors difficult to detect, which is incredibly problematic.

[0] I frequently see this when it thinks something is a problem but it actually isn't, which makes steering more difficult.

[1] Yes, conceptually, unknown unknowns are worse. But you can't measure unknown unknowns; they are indistinguishable from knowns. So you always optimize for deception (along with other things) when you don't have clear objective truths (which is most situations).
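A minimal sketch of the "can't penalize mistakes we can't detect" point, assuming a pairwise-preference setup like the ones used for preference tuning. Everything here is invented for illustration (the noisy_judge function, detect_prob, the style scores); it is not any real training pipeline, just a toy showing how a labeler who misses subtle errors ends up rewarding the polished-but-wrong answer:

```python
# Toy preference-labeling simulation (all numbers and names are made up).
# A simulated annotator compares two responses but only notices a hidden
# flaw some of the time; when the flaw goes unnoticed, the more polished
# response wins, so the flawed-but-confident answer collects more "wins".
import random

random.seed(0)


def noisy_judge(resp_a, resp_b, detect_prob=0.3):
    """Return 'a' or 'b' for whichever response the annotator prefers.

    A hidden bug is only penalized if the annotator happens to spot it
    (probability detect_prob); otherwise the comparison is decided purely
    on surface polish ('style').
    """
    a_penalty = 1.0 if (resp_a["has_hidden_bug"] and random.random() < detect_prob) else 0.0
    b_penalty = 1.0 if (resp_b["has_hidden_bug"] and random.random() < detect_prob) else 0.0
    score_a = resp_a["style"] - a_penalty
    score_b = resp_b["style"] - b_penalty
    return "a" if score_a >= score_b else "b"


confident_wrong = {"style": 0.9, "has_hidden_bug": True}   # polished, subtly broken
hedged_honest = {"style": 0.5, "has_hidden_bug": False}    # admits uncertainty

trials = 10_000
wins = sum(noisy_judge(confident_wrong, hedged_honest) == "a" for _ in range(trials))
print(f"confident-but-wrong preferred in {wins / trials:.0%} of comparisons")
```

With these made-up numbers the subtly broken answer wins roughly 70% of comparisons, so any reward model fit to those labels learns to prefer it; the undetected mistakes never generate a training signal against themselves.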
alickz a day ago
>what if the LLM gets something wrong that the operator (a junior dev perhaps) doesn't even know it's wrong?

The same thing that always happens if a dev gets something wrong without even knowing it's wrong: either code review/QA catches it, or the user does, and a ticket is created.

>if it fails here, it will fail with other things, in not such obvious ways.

Is infallibility a realistic expectation of a software tool or its operator?