m-dot-reviews 7 hours ago

For anyone who's interested, I've put together a simple site for sharing ratings/opinions on models at a task-specific granularity. https://model.reviews/

The idea is that benchmark scores are useful for broad comparisons across many models and their settings, but less useful if you're looking for the best model for <your-specific-task>. So I thought a place to review and comment on models per task could be useful to people.

I'm not sure how best to get the corpus bootstrapped (people will likely only visit or post on the site if there's already activity), so I'm posting it here for anyone who'd like to contribute.