▲ | robertclaus 5 days ago | |
While I get the academic perspective of sharing these insights, this article comes across as corporate justifying/complaining that their model's score is lower than it should be on the leaderboards... by saying the leaderboards are wrong. An even darker take is that it's corporate saying they won't prioritize eliminating hallucinations until the leaderboards reward it. | ||
▲ | skybrian 5 days ago | parent [-] | |
Yes, it's self-interested because they want to improve the leaderboards, which will help GPT-5's scores, but on the other hand, the changes they suggest seem very reasonable and will hopefully help everyone in the industry do better. And I'm sure other people will complain if they notice that changing the benchmarks makes things worse. |