m4nu3l 7 days ago

Developing a model of the real world, or even just learning a self-consistent subset of the available information, could be detrimental to the task of predicting the next token in the average text, given that most of the written information on many subjects could be contradictory or wrong in some way. I don't know how they are doing RL on top of that, or how they are using synthetic data or filtering it. But it's clear that even with GPT-5 they haven't solved the problem, as the presentation demonstrated with the very first prompt (I'm talking about the wrong explanation of how a wing produces lift).