Remix.run Logo
ACCount37 5 days ago

In RLVR? Quite easily.

And OpenAI has induced hallucinations in o3 with RLVR mistakes, not with a failed pre-training run. They used o4-mini as an example - similar training to o3 and similar issues.

Conversely, they have also designed a post-training system that has successfully reduced hallucinations in GPT-5.

5 days ago | parent [-]
[deleted]