Uehreka 6 hours ago
I remember super clearly the first time an LLM told me “No.” It was in May, when I was using Copilot in VS Code and switched from Claude 3.7 Sonnet to Claude Sonnet 4. I asked Sonnet 4 to do something 3.7 Sonnet had been struggling with (something involving the FasterLivePortrait project in Python), and it told me that what I was asking for was not possible and explained why.

I get that this is different from getting an LLM to admit that it doesn’t know something, but I thought “getting a coding agent to stop spinning its wheels when set to an impossible task” was months or years away, and then suddenly it was here. I haven’t yet read a good explanation of why Claude 4 is so much better at this kind of thing, and it definitely goes against what most people say about how LLMs are supposed to work (which is a large part of why I’ve been telling people to stop leaning on mechanical explanations of LLM behavior/strengths/weaknesses). However, it was definitely a step-function improvement.
cainxinth 4 hours ago
Yet LLMs also sometimes erroneously claim they cannot do something they can.