ej88 9 hours ago

Most of the gains come from post-training RL, not pre-training (OpenAI's GPT 5.2 is using the same base model as 4o).

Also the article seems to be somewhat outdated. 'Model collapse' is not a real issue faced by frontier labs.

dkdcio 9 hours ago | parent | next [-]

> OpenAI's GPT 5.2 is using the same base model as 4o

where’s that info from?

tintor 8 hours ago | parent [-]

Not the parent, but the only other source I found for that claim was Dylan Patel's recent SemiAnalysis post.

SequoiaHope 7 hours ago | parent [-]

Was that for 5.1 or 5.2? I recall that info spreading after 5.1's release; I guess I naively assumed 5.2 was a delayed base-model update.

staticshock 7 hours ago | parent [-]

You can just ask ChatGPT what its training cut-off is, and it'll say June 2024.

SequoiaHope 7 hours ago | parent [-]

I asked: 5.2 says August 2025.
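
For anyone who wants to check programmatically rather than through the ChatGPT UI, here is a minimal sketch using the official openai Python client. The model id below is a placeholder (I'm not asserting which string the 5.x releases are exposed under), and the self-reported date is worth a grain of salt.

    # Minimal sketch: ask a model for its self-reported training cutoff.
    # Assumes OPENAI_API_KEY is set in the environment; the model id is a
    # placeholder -- swap in whichever release you actually want to query.
    from openai import OpenAI

    client = OpenAI()

    resp = client.chat.completions.create(
        model="gpt-4o",  # placeholder model id
        messages=[{"role": "user", "content": "What is your knowledge cutoff date?"}],
    )
    print(resp.choices[0].message.content)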

dang 8 hours ago | parent | prev | next [-]

("The article" referred to https://www.theregister.com/2026/01/11/industry_insiders_see... - we've since changed the URL above.)

orwin 8 hours ago | parent | prev | next [-]

A lot of the recent gains come from RL, but also from better inference during the prefill phase, and none of that is affected by data poisoning.

But if you want to keep the "base model" at the cutting edge, you need to retrain it frequently on more recent data, which is where data poisoning becomes interesting.

Model collapse is still a very real issue, but we know how to avoid it. People (non-professionals) who train their own LoRAs for image generation (at least in TTRPG circles) still run into it regularly.

In any case, it will make data curation more expensive.
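
To make the model-collapse point a bit more concrete, here is a toy sketch (pure numpy, nothing to do with any real training pipeline): each "generation" fits a Gaussian only to samples drawn from the previous generation's fit. The sample size, generation count, and 10% real-data mix are arbitrary numbers chosen for illustration.

    # Toy illustration of model collapse: each generation is a Gaussian fit
    # to samples generated by the previous generation's fit. With nothing but
    # self-generated data the fitted spread drifts toward zero; mixing a slice
    # of real data back in each generation keeps it anchored.
    import numpy as np

    rng = np.random.default_rng(0)
    real = rng.normal(loc=0.0, scale=1.0, size=100_000)  # stand-in for "real" data

    def run(generations: int, real_fraction: float, n: int = 20) -> float:
        mu, sigma = 0.0, 1.0
        for _ in range(generations):
            n_real = int(real_fraction * n)
            synthetic = rng.normal(mu, sigma, size=n - n_real)
            train = np.concatenate([synthetic, rng.choice(real, n_real)])
            mu, sigma = train.mean(), train.std()
        return sigma

    print("pure self-training:", run(200, real_fraction=0.0))    # spread collapses toward 0
    print("10% real data mixed:", run(200, real_fraction=0.1))   # stays near the true spread of 1

The point is just the mechanism: the distribution narrows when a model keeps learning from its own outputs, and a modest, well-curated slice of genuinely fresh data per retrain is what prevents it, which is also why that slice is the interesting target for poisoning.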

simianwords 8 hours ago | parent | prev | next [-]

The knowledge cutoff date is different for 4o and 5.2.
