aussieguy1234 3 hours ago

If SWE-Bench Verified is no longer a good measure of agentic coding ability, which benchmark is?