HarHarVeryFunny 7 hours ago

The entire history of RL-trained "reasoning models" from o1 to DeepSeek-R1 is basically just a year old!