m-dot-reviews 3 hours ago

I've been plugging this perhaps too many times now, but I am trying to bootstrap a user-sourced corpus of exactly "what model is good at task X". So, not benchmarks, but high-level tasks. There's a bit of an ordering problem in that nobody wants to bother commenting on a site that has few comments - so PTAL and contribute if you can. https://model.reviews