thomasliao 3 hours ago
It's an important question! If you are paying a lot of money to use AI models, you care that you are using the best one for your task. And it turns out that figuring out which AI model is best for your task is not trivial and requires some expertise.
liveoneggs 22 minutes ago
They all change day to day and are non-deterministic by design. Your settled answer is only good for a moment.
wseqyrku 3 hours ago
That was too nice of a reply, I apologize. I just can't understand the thought process: what exactly are we optimizing for? If you are paying a lot of money to use AI models, you already have so much overhead that a precise ranking in an eval is not gonna make much difference between equally "frontier" models, especially since models are sensitive to the input. So the eval is just gonna evaluate the eval with very high accuracy. It might be equivalent to the illusion of safety, applied to financial risk.
lupire an hour ago
But frontier models are constantly changing.