Snuggly73 2 days ago

OK, if it's almighty, then why aren't the benchmarks at 100%? If you look at the individual issues, they are fairly small, trivial changes to existing codebases.

https://swe-rebench.com/

(Note that if you look at individual slices, Opus is often outperformed by Sonnet.)