TheNewsIsHere 8 days ago
> I've seen claude "monkey patching" a system so that it returns true to the tests.

I’ve watched GitHub Copilot do the same thing. I’ve also seen it double down on ridiculous things and just spew crash-laden messes. There seems to be a low ceiling on how “competent” it is, which makes sense.
al_borland 13 hours ago
In my own use of Copilot, I found Gemini gives me better results than ChatGPT and Claude, to the point where ChatGPT and Claude will flounder on a problem for hours of back and forth while Gemini will one-shot the same thing.