lgienapp 2 hours ago

I feel like many of the “as good as Opus” crowd would achieve the same with Sonnet, tbh. Maybe 10% of tasks actually reach the ceiling of what Opus can do; the rest is compute wasted on a too-strong model they default to for whatever they are doing. Hence they see little drop in output quality when trying out smaller open models.

theptip an hour ago

The eval problem: “Alice is supposedly smarter than Bob, but they can both tie their shoes just as fast.”