candiddevmike a day ago

People's interpretation of benchmarks will largely depend on whether they believe they will be better or worse off if GenAI takes over SWE jobs. I think you'd need someone outside the industry to weigh in to get a real, unbiased view.

douglasisshiny a day ago | parent [-]

Or someone who has been a developer for a decade plus trying to use these models on actual existing code bases, solving specific problems. In my experience, they waste time and money.

sandspar a day ago | parent [-]

These people are the most experienced, yes, but by the same token they also have the most incentive to disbelieve that an AI will take their job.