tsimionescu 3 hours ago

> My first instinct was, I had underspecified the location of the car. The model seems to assume the car is already at the car wash from the wording. GPT 5.x series models behave a bit more on the spectrum so you need to tell them the specifics.

This makes little sense, even though it sounds superficially convincing. Why would a language model assume that the car is at the destination when evaluating the difference between walking or driving? Why not mention that, if it was really assuming it?

What seems to me far, far more likely to be happening here is that the phrase "walk or drive for <short distance>" is too strongly associated in the training data with the "walk" response, and the "car wash" part of the question simply can't flip enough weights to matter in the default response. This is also to be expected given that there are likely extremely few similar questions in the training set, since people just don't ask about what mode of transport is better for arriving at a car wash.

This is a clear case of a language model having language model limitations. Once you add more text in the prompt, you reduce the overall weight of the "walk or drive" part of the question, and the other relevant parts of the phrase get to matter more for the response.

jnovek an hour ago | parent | next [-]

You may be anthropomorphizing the model here. Models don’t have “assumptions”; the problem is contrived, and most likely there haven’t been many conversations on the internet about what to do when the car wash is really close to you (because it’s obvious to us). The training data for this problem is sparse.

PunchyHamster 3 hours ago | parent | prev [-]

> However, why would a language model assume that the car is at the destination when evaluating the difference between walking or driving? Why not mention that, if it was really assuming it?

Because it assumes it's a genuine question, not a trick.

spuz 2 hours ago | parent | next [-]

There's some evidence for that if you try these two different prompts with GPT 5.2 Thinking:

I want to wash my car. The car wash is 50m away. Should I walk or drive to the car wash?

Answer: walk

Try this brainteaser: I want to wash my car. The car wash is 50m away. Should I walk or drive to the car wash?

Answer: drive

tsimionescu 2 hours ago | parent [-]

That's not evidence that the model is assuming anything, and this is not a brainteaser. A brainteaser would be exactly the opposite, a question about walking or driving somewhere where the answer is that the car is already there, or maybe different car identities (e.g. "my car was already at the car wash, I was asking about driving another car to go there and wash it!").

If the LLM were really basing its answer on a model of the world where the car is already at the car wash, and you asked it about walking or driving there, it would have to answer that there is no choice: you have to walk there, since you don't have a car at your origin point.

layer8 an hour ago | parent [-]

It might assume that more than one car exists in the world.

tsimionescu 2 hours ago | parent | prev [-]

If it's a genuine question, and if I'm asking if I should drive somewhere, then the premise of the question is that my car is at my starting point, not at my destination.

layer8 44 minutes ago | parent [-]

The premise is that some car is at the starting point. ;)