Remix.run Logo
mountainriver 5 days ago

We do with RLVR and that works, there is only one answer, it has to find it. LLMs are often also trained on factual information, and tested on that.

If the knowledge can be represented in text then they can learn it, if it can't then we need a multimodal model.