▲ | orbital-decay a day ago
If that prompt can be easily trained against, it probably doesn't exploit a generic bias. Such prompts are not that interesting, and there's no point in hiding them.
▲ | daedrdev a day ago
Generic biases can also be fixed.
▲ | fwip a day ago
Sure there is. If you want to know whether students understand the material, you don't hand out the answers to the test ahead of time. Collecting a bunch of "Hard questions for LLMs" in one place will invariably run into Goodhart's law (when a measure becomes a target, it ceases to be a good measure). You'll have no idea whether the next round of LLMs is better because they're generally smarter or because they were trained specifically on these questions.