mocamoca 11 hours ago

Would you mind explaining your point of view? Or could you point me to the resources that make you think so?

nradov 5 hours ago

What can be asserted without evidence can also be dismissed without evidence. The benchmark creators haven't demonstrated that higher scores result in fewer humans dying or any meaningful outcome like that. If the LLM outputs some naughty words, that's not an actual safety problem.