imiric 3 hours ago:
"Tell me the reasons why this report is stupid" is a loaded prompt. The tool will generate whatever output pattern matches it, including hallucinating it. You can get wildly different output if you prompt it "Tell me the reasons why this report is great". It's the same as if you searched the web for a specific conclusion. You will get matches for it regardless of how insane it is, leading you to believe it is correct. LLMs take this to another level, since they can generate patterns not previously found in their training data, and the output seems credible on the surface. Trusting the output of an LLM to determine the veracity of a piece of text is a baffilingly bad idea. | ||
colechristensen 3 hours ago:
>"Tell me the reasons why this report is stupid" is a loaded prompt. This is precisely the point. The LLM has to overcome its agreeableness to reject the implied premise that the report is stupid. It does do this but it takes a lot, but it will eventually tell you "no actually this report is pretty good" The point being filtering out slop, we can be perfectly find with false rejections. The process would look like "look at all the reports, generate a list of why each of them is stupid, and then give me a list of the ten most worthy of human attention" and it would do it and do a half-decent job at it. It could also pre-populate judgments to make the reviewer's life easier so they could very quickly glance at it to decide if it's worthy of a deeper look. | ||