atleastoptimal 14 hours ago
Good point. It is interesting that it fails on that question when the answer doesn't seem to require much extrapolation or interpretation. Perhaps the issue is that to reach the right answer the LLM needs to "imagine" the process of walking and the state of the person upon arriving. Maintaining a consistent mental model like that trips up LLMs, though their semantic understanding usually lets them paper over that handicap. I asked the default versions of ChatGPT and Claude the same question and got the same "Walk" answer, though Opus 4.7 with thinking determined that it was a trick question and that only driving would make sense.