nextworddev 5 days ago

Thanks. Is this mainly for verifiable tasks, or for any general task?

ag8 5 days ago | parent | next [-]

It's for any task that has an "eval", which usually means verifiable tasks or ones that can be judged by LLMs (e.g. see [0]). There's also been recent work, such as BRPO [1] and similar approaches, on giving more and more "non-verifiable" tasks verifiable rewards!

[0]: https://runrl.com/blog/funniest-joke

[1]: https://arxiv.org/abs/2506.00103

-_- 5 days ago | parent | prev [-]

There needs to be some way of automatically assessing performance on the task, though this could be a Python function, another LLM acting as a judge, or a combination of the two!
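
To make the "combination" idea concrete, here is a rough sketch of what such a scoring function might look like. Everything in it is hypothetical and made up for illustration: the names `exact_match_reward`, `combined_reward`, and `stub_judge`, the 50/50 weighting, and the judge itself, which is a stub standing in for a real LLM call. It is not the API of any particular product.

```python
from typing import Callable


def exact_match_reward(completion: str, expected: str) -> float:
    """Verifiable check: 1.0 if the normalized answer matches, else 0.0."""
    return 1.0 if completion.strip().lower() == expected.strip().lower() else 0.0


def combined_reward(
    completion: str,
    expected: str,
    judge: Callable[[str], float],
    weight: float = 0.5,  # assumed weighting, chosen only for illustration
) -> float:
    """Blend a programmatic check with a judge score, both in [0, 1]."""
    verifiable = exact_match_reward(completion, expected)
    # Clamp the judge score so a misbehaving judge cannot dominate the reward.
    judged = min(1.0, max(0.0, judge(completion)))
    return weight * verifiable + (1.0 - weight) * judged


def stub_judge(completion: str) -> float:
    """Hypothetical stand-in for an LLM judge: prefers short answers."""
    return 1.0 if len(completion) < 20 else 0.25


if __name__ == "__main__":
    print(combined_reward(" Paris ", "paris", stub_judge))
    print(combined_reward("London", "paris", stub_judge))
```

In practice the stub would be replaced by a call to whatever judge model you use, with its output parsed into a number. The clamping step is there because judge outputs are not guaranteed to stay in range.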