ACCount37 3 days ago

Quite a few reasoning LLMs reason in English only, because the RL setup specifically forces them to.

Why?

Because the creators want the reasoning trace to be human-readable. Without some pressure forcing the models to think in English, they tend to get weird with the reasoning trace: wild language-mixing, devolved grammar, and strange language-mixed nonsense words that the LLM itself seemingly understands just fine.
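
To make "pressure" concrete, here is a toy sketch of what a language-consistency bonus on top of the task reward could look like. This is an illustration only, not any lab's actual setup: `english_fraction`, `shaped_reward`, and the `weight` value are all hypothetical, and "English" is crudely approximated as "word made of ASCII letters".

```python
import re

# Assumption: a token counts as "English-looking" if it consists only of
# ASCII letters, optionally joined by apostrophes or hyphens.
# A real setup would likely use a proper language-ID model instead.
_ASCII_WORD = re.compile(r"^[A-Za-z]+(?:['-][A-Za-z]+)*$")
_PUNCT = ".,;:!?()[]{}\"'"


def english_fraction(trace: str) -> float:
    """Fraction of whitespace-separated tokens that look like English words."""
    tokens = [t.strip(_PUNCT) for t in trace.split()]
    tokens = [t for t in tokens if t]  # drop tokens that were pure punctuation
    if not tokens:
        return 0.0
    hits = sum(1 for t in tokens if _ASCII_WORD.match(t))
    return hits / len(tokens)


def shaped_reward(task_reward: float, trace: str, weight: float = 0.1) -> float:
    """Task reward plus a small bonus for keeping the trace in English.

    `weight` is a made-up hyperparameter: too high and the model optimizes
    for looking English rather than for solving the task.
    """
    return task_reward + weight * english_fraction(trace)


if __name__ == "__main__":
    clean = "First add the two numbers, then check the sign."
    mixed = "First 加 the two numbers, потом check the sign."
    print(shaped_reward(1.0, clean))
    print(shaped_reward(1.0, mixed))
```

With the same task reward, the mixed-language trace scores lower than the clean one, so over many RL updates the policy gets nudged toward English traces. Note that this crude check also penalizes digits and math symbols, which is one reason a real implementation would need something smarter.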