jwr 3 hours ago

> "run a model like Gemma 4 31b, which is almost anthropic sonnet levels of performance"

I wish people would stop deluding themselves. I regularly try local models (and benchmark them for my purposes), and they are NOWHERE near the huge models like Sonnet or Opus. Nowhere. Yes, you can sometimes get plausible-looking output for simple tasks, but for anything that requires even a little thinking there is simply no comparison.

Local models are useful. I use them for spam filtering, and soon intend to use them for image tagging and OCR. But let's stop saying they can get us "anthropic sonnet levels of performance", because that's just not true.
