HeavyStorm 4 hours ago

There's no "just" in RL. Fine-tuning is very important and can make a big difference.

merlindru 2 hours ago

apparently GPT-5 uses the same pretrain as 4o did, hah