smallmancontrov | 2 days ago
Quite the opposite: it explains that it is mathematically straightforward to achieve better alignment on uncertainty ("calibration"), but that leaderboards penalize it.

> This “epidemic” of penalizing uncertain responses can only be addressed through a socio-technical mitigation: modifying the scoring of existing benchmarks that are misaligned but dominate leaderboards

Even more embarrassing, it looks like this is something we beat into models rather than something we can't beat out of them:

> empirical studies (Fig. 2) show that base models are often found to be calibrated, in contrast to post-trained models

That said, I generally appreciate a fairly strong bias-to-action, and I find the fact that it got slightly overcooked less offensive than the alternative of an undercooked bias-to-action where the model studiously avoids doing anything useful in favor of "it depends" + three plausible reasons why.
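To make the incentive concrete, here's a minimal expected-score sketch (the function names and t=0.75 are mine, and the numbers are illustrative; the wrong-answer penalty of t/(1-t) follows the confidence-threshold scoring the paper suggests). Under plain 0/1 grading, a p-confident guess earns p in expectation, which always beats the 0 you get for abstaining, so training against such leaderboards rewards confident guessing; the threshold penalty flips that whenever p < t.

    # Expected score of answering vs. abstaining for a model that is
    # p-confident its best guess is right. Illustrative sketch only.

    def binary_score(p: float, abstain: bool) -> float:
        # 0/1 leaderboard grading: a guess earns p in expectation,
        # abstaining earns 0, so guessing always dominates.
        return 0.0 if abstain else p

    def threshold_score(p: float, abstain: bool, t: float = 0.75) -> float:
        # Confidence-threshold grading: +1 if right, -t/(1-t) if wrong,
        # 0 for "I don't know". Guessing only pays off when p > t.
        return 0.0 if abstain else p - (1.0 - p) * t / (1.0 - t)

    for p in (0.30, 0.75, 0.90):
        print(p, binary_score(p, False), round(threshold_score(p, False), 2))
    # p=0.30: binary rewards the guess (0.30 > 0), threshold punishes it (-1.80)
    # p=0.75: break-even at p == t; p=0.90: both schemes reward answering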
baq | 2 days ago | parent
> leaderboards penalize it

> socio-technical mitigation: modifying the scoring of existing benchmarks that are misaligned but dominate leaderboards

Sounds more like we need new leaderboards, and the old ones should be deprecated.