ozgung 18 hours ago

As a human, in the photo of that sandwich I see 4 slices of bread and 4 slices of cheese (distributed unevenly). I have no idea about the weight of the bread, flour type or its sugar content. I don't know the type of the cheese, dimensions of the slices or total amount of cheese inside the bread. I don't know if there is butter or anything else inside. I can guess the size of the plate as a size reference but I can't be sure. Human or AI, it's an ill-posed problem. There can be widely different estimates which can be equally plausible.

bcjdjsndon 18 hours ago | parent [-]

But why would the same LLM give you wildly different answers EACH TIME you ask?

pkaye 16 hours ago | parent | next [-]

There is a parameter in LLMs called temperature that controls creativity/randomness. If you set it to 0 it makes the model deterministic. I think some LLMs expose this as a tunable parameter.
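To make the mechanics concrete, here is a toy sketch of temperature sampling (hypothetical code, not any vendor's actual API): logits are divided by the temperature before the softmax, so a low temperature sharpens the distribution and, in the limit, sampling degenerates into greedy argmax decoding.

```python
import math
import random

def sample_next_token(logits, temperature):
    """Toy sketch of temperature sampling over next-token logits.

    High temperature flattens the distribution (more randomness);
    low temperature sharpens it. At temperature 0 we fall back to
    plain argmax (greedy decoding), which is deterministic here.
    """
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    return random.choices(range(len(logits)), weights=probs)[0]
```

With a tiny temperature like the 0.01 used in the study below, the scaled logit gaps become huge, so in practice the sampler almost always picks the argmax token anyway.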

muwtyhg 16 hours ago | parent | next [-]

The study used a temperature of 0.01.

> "Thirteen food photographs were each submitted 495–561 times to four LLM vision APIs (GPT-5.4, Claude Sonnet 4.6, Gemini 2.5 Pro, Gemini 3.1 Pro Preview) using an identical structured prompt adapted from the iAPS automated insulin delivery system (26,904 total queries, temperature 0.01)"

jihadjihad 15 hours ago | parent | prev [-]

> If you set it to 0 it makes the model deterministic.

No, it doesn't. It can help make the model more deterministic, but it does not guarantee it.

azakai 14 hours ago | parent [-]

The hardware can also add nondeterminism. GPUs can reorder floating-point operations, and floating-point addition isn't associative, so the same computation can produce slightly different results from run to run.

Vendors might also be running A/B testing or who knows what, even when you ask for a temperature of 0.

But, if you run a fixed model with temperature 0 on your local CPU, it will be deterministic (unless there are bugs).
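The reordering point is easy to demonstrate in a few lines: floating-point addition is not associative, so a GPU reduction that sums the same terms in a different order can produce slightly different logits, which can flip an argmax even at temperature 0. A minimal sketch:

```python
# Floating-point addition is not associative: grouping the same
# three terms differently yields different double-precision results.
a, b, c = 0.1, 0.2, 0.3

left = (a + b) + c   # 0.6000000000000001
right = a + (b + c)  # 0.6

print(left == right)  # False
```

This is why "same model, same prompt, temperature 0" still isn't a determinism guarantee once parallel hardware decides the summation order.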

zdragnar 17 hours ago | parent | prev [-]

Because that's how they work? They aren't knowledge machines, they are random generators.

bcjdjsndon 17 hours ago | parent [-]

They're next-word predictors. They explicitly add randomness at various stages of the transformer itself; otherwise it'd be too obvious that it's not actually intelligent and is just a next-word predictor.

pertymcpert 16 hours ago | parent [-]

No, that's not why.