llmslave 10 hours ago

The benchmarks on all these models are meaningless

alchemist1e9 10 hours ago | parent [-]

Why, and what would a good benchmark look like?

moffkalast 9 hours ago | parent [-]

30 people trying out every model on the list for their own use case for a week, and then checking which ones they're still using a month later.