ACCount37 5 days ago

Yes, it's not nearly as easy as "just fix the evals".

But better evals are still helpful, because they reward LLM vendors for attempting the very-hard-to-do thing, instead of rewarding them for training an LLM that's really good at emitting 7%-confidence guesses.
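A toy sketch of that scoring logic, with a made-up penalty rubric (+1 correct, -0.25 wrong, 0 for abstaining; the numbers are purely illustrative):

    # Expected score of answering vs. abstaining under a penalty rubric.
    # With no penalty, guessing always "pays"; with one, a 7%-confidence
    # guess scores worse than saying "I don't know".
    def expected_score(p_correct, penalty):
        return p_correct * 1.0 - (1 - p_correct) * penalty

    print(expected_score(0.07, 0.0))    # 0.07   -> guessing beats abstaining
    print(expected_score(0.07, 0.25))   # -0.16  -> abstaining (score 0) wins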

throwawaymaths 5 days ago | parent

You're missing the point. SAT multiple choice penalizing random guesses with negative marking is fine: you could trivially use that sort of strategy to assign a cost function to a classifier and backpropagate. But how do you give negative weight to a wrong answer when training a transformer?
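To make the objection concrete, a toy PyTorch illustration: the standard next-token loss only pulls probability toward the observed token; there is no term in it for "this answer was wrong".

    import torch
    import torch.nn.functional as F

    # Vanilla next-token training: the cross-entropy gradient is
    # softmax(logits) - onehot(target), so descent can only *raise* the
    # target token's logit and mildly lower everything else. The loss has
    # no slot for "this completion was wrong, push it down harder".
    logits = torch.randn(1, 8, requires_grad=True)   # toy vocab of 8 tokens
    target = torch.tensor([3])
    F.cross_entropy(logits, target).backward()
    print(logits.grad[0, 3] < 0)   # True: the target logit always moves up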

ACCount37 5 days ago | parent | next

In RLVR? Quite easily.
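One possible sketch of the shape of that update: REINFORCE over a toy 4-way answer space. In a real RLVR setup the "action" would be a whole sampled completion and the reward would come from a verifier, but the sign logic is the same.

    import torch

    # Toy RLVR / REINFORCE step. A wrong answer gets reward -1, so its
    # log-prob is pushed *down* -- exactly the "negative weight" that
    # plain cross-entropy lacks.
    torch.manual_seed(0)
    logits = torch.zeros(4, requires_grad=True)   # stand-in for the policy
    opt = torch.optim.SGD([logits], lr=0.5)
    correct = 2                                   # the verifier's ground truth

    for step in range(200):
        dist = torch.distributions.Categorical(logits=logits)
        answer = dist.sample()
        reward = 1.0 if answer.item() == correct else -1.0  # verifiable reward
        loss = -reward * dist.log_prob(answer)    # gradient ascent on reward
        opt.zero_grad()
        loss.backward()
        opt.step()

    # Probability mass should end up concentrated on the correct answer.
    print(torch.softmax(logits.detach(), dim=-1))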

And OpenAI induced the hallucinations in o3 with RLVR mistakes, not with a failed pre-training run. They used o4-mini as an example: similar training to o3, and similar issues.

Conversely, they have also designed a post-training system that successfully reduced hallucinations in GPT-5.

RugnirViking 5 days ago | parent | prev

Isn't this just the question of "how do you train a transformer"? You give it wrong examples, and use optimization algorithms to move away from that kind of completion.
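What's being described sounds roughly like unlikelihood training (Welleck et al., 2019). A minimal sketch of that loss term, with a stand-in tensor for one position of a transformer's output:

    import torch

    # Unlikelihood-style term: instead of maximizing log p(token), maximize
    # log(1 - p(wrong_token)), so gradient descent pushes a known-bad
    # completion down.
    logits = torch.randn(8, requires_grad=True)   # toy vocab of 8 tokens
    wrong_token = 3

    p_wrong = torch.softmax(logits, dim=-1)[wrong_token]
    loss = -torch.log1p(-p_wrong)   # -log(1 - p); large when p(wrong) is high
    loss.backward()
    print(logits.grad[wrong_token] > 0)   # True: descent lowers the bad logit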

throwawaymaths 5 days ago | parent

That's quite hard, for the reasons I explained. It might be solvable using Q-learning techniques, but those aren't easy in the context of transformers, IIUC.
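For reference, the tabular form of the update being gestured at. The hard part is scaling "state = token prefix, action = next token" to a transformer; the states and values below are purely illustrative.

    from collections import defaultdict

    # Classic tabular Q-learning update. For text generation, the state
    # would be the token prefix and the action the next emitted token.
    Q = defaultdict(lambda: defaultdict(float))

    def q_update(s, a, r, s_next, alpha=0.1, gamma=0.99):
        best_next = max(Q[s_next].values(), default=0.0)
        Q[s][a] += alpha * (r + gamma * best_next - Q[s][a])

    # A wrong answer gets a negative reward, which (over many updates)
    # propagates back through the prefix states that led to it:
    q_update(s="2+2=", a="5", r=-1.0, s_next="2+2=5")
    print(Q["2+2="]["5"])   # -0.1: the bad token's value moves down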