SWE-Bench Failures: When Coding Agents Spiral into 693 Lines of Hallucinations (surgehq.ai)
20 points by landonxi 18 hours ago | 1 comment
egillie 18 hours ago

Is this because GPT-5 hallucinates less in general?