batshit_beaver 2 hours ago

Right, and then look at any number of research papers showing that CoT output has limited impact on the end result. We've trained these models to pretend to reason.

atleastoptimal 26 minutes ago

If it's only pretending to reason, then how is it that the CoT output improves performance on every single benchmark/test?