cj 15 hours ago
It's like saying "Star Wars is the best movie in the world": to some people it is, to others it's terrible. I feel like it would be advantageous to move away from a "one model fits all" mindset and toward a world where we have different genres of models that we use for different things. Benchmark scores are becoming about as useful as Tomatometer movie scores. Something can score high, but if that's not the genre you like, the high score doesn't guarantee you'll like it.
everdrive 15 hours ago
Outside of experience and experimentation, is there a good way to know what models are strong for what tasks?