bunderbunder 12 hours ago
> The real question for me is: are they less reliable than human judges?

I've spent some time poking at this. I can't go into details, but the short answer is, "Sometimes yes, sometimes no, and it depends A LOT on how you define 'reliable'."

My sense is that the more boring, mechanical, and closed-ended the task, the more likely an LLM is to be more reliable than a human. Because an LLM is an unthinking machine. It doesn't get tired, or hangry, or stressed out about its kid's problems at school. But it's also a doofus with absolutely no common sense whatsoever.
visarga 12 hours ago
> Because an LLM is an unthinking machine.

Unthinking can be pretty powerful these days.