nl a day ago

> I suspect it’s closer than you think!

It's not.

I've done this (although not with all these tools).

For a reasonably sized project, it's easy to tell the difference in quality between, say, Grok-4.1-Fast (30 on the AA Coding Index) and Sonnet 4.5 (37 on AA).

Sonnet 3.7 scores 27. No way I'm touching that.

Opus 4.5 scores 46, and it's easy to see that difference. Give the models something with high cyclomatic complexity or complex dependency chains and Grok-4.1-Fast falls to bits, while Opus 4.5 solves things.