Remix.run Logo
simianwords 3 days ago

It worked for you because the paper does the experiment without allowing the model to use any reasoning tokens - something that is grossly misleading.