mountainriver 5 days ago:
There is knowledge of correct and incorrect, that's what loss is; there are just often many possible answers to a question. This is the same reason that RLVR works: there is just one right answer, and LLMs learn this fairly well, but not perfectly (yet). A minimal sketch of that idea is below.
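(A minimal sketch of what a "verifiable reward" in the RLVR sense looks like, assuming a task like math where a single ground-truth answer can be checked exactly. The helper names and the answer-on-the-last-line convention are illustrative assumptions, not any specific library's API.)

```python
# Sketch: RLVR-style verifiable reward -- 1.0 only when the model's final
# answer matches a checkable ground truth, otherwise 0.0. No partial credit.

def extract_final_answer(completion: str) -> str:
    """Treat the last non-empty line of the completion as the answer (assumed convention)."""
    return completion.strip().splitlines()[-1].strip()

def verifiable_reward(completion: str, ground_truth: str) -> float:
    """Binary reward: right answer or not."""
    return 1.0 if extract_final_answer(completion) == ground_truth else 0.0

# A question with exactly one right answer:
print(verifiable_reward("Some reasoning...\n12", "12"))  # 1.0
print(verifiable_reward("Some reasoning...\n13", "12"))  # 0.0
```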
Jensson 5 days ago:
> There is knowledge of correct and incorrect, that's what loss is

Loss is only correctness in terms of correct language, not correct knowledge. It correlates with correct knowledge, but that is all; that correlation is why LLMs are useful for tasks at all, but we still don't have a direct measure of correct knowledge in the models. For language tasks, loss is correctness, so for things like translation LLMs are extremely reliable. But for most other kinds of tasks, loss and correctness are only loosely correlated.
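(To make the "loss scores language, not knowledge" point concrete: pretraining loss is just the average negative log-probability of each observed next token, so it measures how predictable the text is, not whether what it says is true. The toy distributions below are a hypothetical stand-in for an LLM's per-position token probabilities, not real model output.)

```python
# Sketch: next-token cross-entropy loss = -(1/N) * sum(log p(token_i | context_i)).
# Low loss means the text was predictable under the model -- nothing more.
import math

def next_token_loss(model_probs: list[dict[str, float]], tokens: list[str]) -> float:
    """Average negative log-probability of the observed tokens."""
    nll = [-math.log(model_probs[i][tok]) for i, tok in enumerate(tokens)]
    return sum(nll) / len(nll)

# Toy distribution the "model" assigns for the continuation of "the sky is ___".
probs = [{"blue": 0.80, "green": 0.05, "falling": 0.15}]
print(next_token_loss(probs, ["blue"]))   # ~0.22: predictable, low loss
print(next_token_loss(probs, ["green"]))  # ~3.00: unlikely, high loss -- truth never enters into it
```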