jaccola 19 hours ago

It’s just an impossible problem. Photons don’t provide sufficient information to determine calories (at least not in any way they could practically be captured). The inside of that sandwich could be drenched with olive oil, or it could be hollow cheese with lettuce. It’s impossible to tell.

2ndorderthought 19 hours ago | parent | next [-]

The average person has no idea this is true. And the average person cannot tell when this is the case. So we have a bunch of people making their way through school, then relying on AI when they get stuck. The future is gonna be wild.

lordleft 19 hours ago | parent | next [-]

Yep. And it doesn't help that the people selling AI products act as if they're going to build God. Saying "well, AI can't do that" isn't going to fly when you are lax about communicating its limitations!

2ndorderthought 19 hours ago | parent [-]

It also doesn't help when the messaging is linked to how "there will be no jobs where you use your brain anymore, everything will be automated". What motivation does the average 16-year-old have to try hard and learn anything beyond what they immediately need?

No jobs, AI Jesus is coming, and if you use AI it will use all of the world's compute power to try to convince you it's correct even when it's not.

renticulous 19 hours ago | parent | prev | next [-]

Here's the technical literacy of the population on display. I love these prank examples, which show the true education of the populace.

https://www.youtube.com/shorts/B7c9qJcRnVk

dredmorbius 15 hours ago | parent | next [-]

A more robust measurement might be the (former) US Department of Education's "Adult Literacy in the United States" survey, most recently conducted in 2019. The results of this are sobering enough:

<https://nces.ed.gov/pubs2019/2019179/index.asp>.

There's a related study of adult technical literacy conducted in 33 OECD nations:

<https://www.oecd-ilibrary.org/education/skills-matter_978926...>.

Both show that only a small fraction (5--10%) of adults operate at high levels of literacy (whether of text, numeracy, or technology), and that a large fraction (roughly 50%) operate at a minimal or below-minimal level.

fcarraldo 19 hours ago | parent | prev | next [-]

True education? What idiot would say yes to this?

Even if you _know_ the debit card transaction is safe, there’s no reason to risk it when a weirdo is filming you with some wild contraption.

rcxdude 18 hours ago | parent | prev | next [-]

Anything like this is going to have a very heavy selection bias, don't take any of this kind of content as a reflection of the average person.

WarmWash 18 hours ago | parent | prev [-]

Many of us witnessed the technical literacy of the general population when we ran to show them ChatGPT 3.5 and they just kind of shrugged, like "So? What are you showing me?"

engineer_22 19 hours ago | parent | prev | next [-]

I am asking a lot here, but school needs to be training people in what AI is, what its weaknesses are, and how to use it... My school taught me to use a calculator. It also taught me how to check my work when I relied on the calculator.

AI is a very complicated calculator - you give it an input, magic happens, it gives you an output. Really no different, to a layman.

jaccola 19 hours ago | parent | next [-]

To be fair, this should probably be covered by basic physics/maybe cooking classes. “You can’t determine the calories in food by looking at it” isn’t really ML specific.

2ndorderthought 19 hours ago | parent [-]

Won't help much if kids are AI'ing their way through physics, then ten years later need to go on a diet, having possibly never applied the knowledge or exercised their critical thinking skills.

garciasn 19 hours ago | parent | prev | next [-]

Considering the lack of basic math skills I encounter each and every day, I don't think schools did enough; they certainly aren't going to do enough w/LLMs.

Ekaros 19 hours ago | parent [-]

Knowing the lack of understanding of basic chemistry and physics like fundamental thermodynamics... I have little hope any population can be trained to understand LLMs sufficiently...

2ndorderthought 19 hours ago | parent | prev [-]

It's more complicated than a calculator. Even researchers who have dedicated their lives to the field don't know all of the limitations of any given model. That fact alone isn't helpful when a model is 80% correct in one area but 2% in another.

pirates 18 hours ago | parent [-]

If even experts in the field don’t know all of the limitations then it’s even more important to stress that relying on the output of an LLM is a poor choice without additional checking and verification.

Even with calculators, I was taught that you should double check by hand sometimes to make sure you got it right.

lesuorac 18 hours ago | parent | prev [-]

If you're looking for a citation about this, the 1999 Dunning-Kruger paper "Unskilled and Unaware" [1] is about exactly this.

People who are unskilled at a task are unaware of what that task performed correctly looks like. So somebody who can't count calories is also unable to tell that the AI can't perform the task correctly.

[1]: https://pubmed.ncbi.nlm.nih.gov/10626367/

hombre_fatal 18 hours ago | parent [-]

Fwiw invoking Dunning-Kruger is beyond trite at this point.

Which is a good thing because it means we can talk like normal humans ("people don't know that it's unreliable") instead of acting like we're making such a profound claim that it needs a citation and psychological dissection.

ozgung 18 hours ago | parent | prev | next [-]

As a human, in the photo of that sandwich I see 4 slices of bread and 4 slices of cheese (distributed unevenly). I have no idea about the weight of the bread, flour type or its sugar content. I don't know the type of the cheese, dimensions of the slices or total amount of cheese inside the bread. I don't know if there is butter or anything else inside. I can guess the size of the plate as a size reference but I can't be sure. Human or AI, it's an ill-posed problem. There can be widely different estimates which can be equally plausible.

bcjdjsndon 18 hours ago | parent [-]

But why would the same llm give you wildly different answers EACH TIME you ask?

pkaye 16 hours ago | parent | next [-]

There is a parameter in LLMs called temperature that controls creativity/randomness. If you set it to 0 it makes the model deterministic. I think some LLMs expose this as a tunable parameter.
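As a toy sketch (my own minimal illustration with made-up logits, not any vendor's actual implementation), temperature just rescales the logits before softmax sampling, and pushing it toward 0 collapses the distribution onto the single most likely token:

```python
import math
import random

def sample_token(logits, temperature):
    """Sample a token index from raw logits, scaled by temperature.

    temperature -> 0 approaches greedy argmax (deterministic choice);
    higher temperatures flatten the distribution, adding randomness.
    """
    if temperature <= 0:
        # Degenerate case: always pick the highest-logit token.
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]
    # Draw from the resulting categorical distribution.
    r = random.random()
    acc = 0.0
    for i, p in enumerate(probs):
        acc += p
        if r < acc:
            return i
    return len(probs) - 1
```

With logits like `[1.0, 5.0, 2.0]`, temperature 0 always returns index 1, and a very small temperature like 0.01 does so in practice too, which is why the study's choice of 0.01 was a reasonable attempt at repeatability.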

muwtyhg 16 hours ago | parent | next [-]

The study used a temperature of 0.01.

> "Thirteen food photographs were each submitted 495–561 times to four LLM vision APIs (GPT-5.4, Claude Sonnet 4.6, Gemini 2.5 Pro, Gemini 3.1 Pro Preview) using an identical structured prompt adapted from the iAPS automated insulin delivery system (26,904 total queries, temperature 0.01)"

jihadjihad 15 hours ago | parent | prev [-]

> If you set it to 0 it makes the model deterministic.

No, it doesn't. It can help make the model more deterministic, but it does not guarantee it.

azakai 14 hours ago | parent [-]

The hardware can also add nondeterminism. GPUs reorder operations, leading to different results.

Vendors might also be running A/B testing or who knows what, even when you ask for a temperature of 0.

But, if you run a fixed model with temperature 0 on your local CPU, it will be deterministic (unless there are bugs).
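The reordering issue comes down to floating-point addition not being associative, so the same sum computed in a different order can round differently. A minimal illustration (my own example, not from the thread):

```python
# Floating-point addition is not associative: the grouping the hardware
# happens to use changes the rounded result.
a, b, c = 1e16, -1e16, 1.0

left = (a + b) + c   # a and b cancel exactly first, so c survives
right = a + (b + c)  # c is absorbed into the huge b and lost to rounding

print(left, right)   # the two groupings disagree
```

In a model forward pass these tiny discrepancies can flip which token has the highest logit, which is enough to change the output even at temperature 0.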

zdragnar 17 hours ago | parent | prev [-]

Because that's how they work? They aren't knowledge machines, they are random generators.

bcjdjsndon 17 hours ago | parent [-]

They're next word predictors. They explicitly add in randomness at various stages of the transformer itself, otherwise it'd be too obvious it's not actually intelligent and just a next word predictor

pertymcpert 16 hours ago | parent [-]

No that's not why.

drrotmos 18 hours ago | parent | prev | next [-]

It is and it isn't. If you ask a human how many calories (or carbs) are in that sandwich, they can give you a qualified guess based on how a sandwich like that is typically constructed. They may not know the calories for a slice of bread or a slice of cheese by heart, but if you give them a food database, they can look it up.

They absolutely won't be 100% correct (bread sizes e.g. are going to be an estimate), but unless it's a trick sandwich drenched in olive oil or with hollow cheese, they're probably going to be in the right ballpark.

I don't think it's outside the realm of possibility for an LLM to be in the right ballpark as well, but that doesn't seem to be where we're at now.

muwtyhg 16 hours ago | parent [-]

And furthermore, once the person has "determined" how many calories the sandwich contains, they are likely to give you the same answer next time you ask instead of randomly changing their minds.

YeGoblynQueenne 17 hours ago | parent | prev | next [-]

It's not impossible to tell. Diabetics and others with dietary restrictions have to do that sort of thing every day to decide what they can and can't eat. If you pick up a loaf of bread at the baker's, the baker usually has no idea how much carbohydrate, salt, or sugar is in that loaf. Try it. Ask the baker: "how many carbs are in this loaf of bread?" They'll just stare at you. They can tell you whether the loaf has salt or sugar in it, but they can't tell you how much, because they don't calculate the amounts by loaf. So if you have dietary restrictions you have to know what you can and can't eat, and that requires the ability to judge the contents of a piece of food from the way it looks.

Photons don't carry that information? Sure. But you don't just have photons to go by. You can rely on a large database of prior knowledge about how food is usually made and with what ingredients.

Other people who have to rely on their imperfect human senses to decide what they can and can't eat: people with allergies, people with heart problems, hypertensives, kidney patients, etc. etc.

bryanlarsen 19 hours ago | parent | prev | next [-]

The question isn't about calories, it's about carbs. Drenching that sandwich in olive oil won't change its carb count. From the picture it's a thin cheese sandwich -- we can see cheese, and we can see it's thin enough that there's little else. Might be no butter, might be lots of butter, but that won't affect the carb count. If there's lettuce in the sandwich, there's likely a negligible amount. Hand it to a knowledgeable human and you're going to get a very consistent carb reading -- 30g, the value of two slices of Wonder Bread.

It could be much different -- it could be one of those breads with weird macros, or fake cheese, or it could be hollowed out and packed full of hidden vegetables. But a human is going to give you the answer for two slices of plain white bread.

beached_whale 19 hours ago | parent | prev | next [-]

From personal experience, one can get practically close when guessing, such that the error isn't going to be significant compared to the errors in insulin-to-carb ratios/sensitivity factors/...

I am pretty good at this and the cheese sandwich example threw me, I would have estimated around 10-15g of carb for each slice. So the 28g is fairly consistent with that, not 40g. The only real way would be to weigh it and use the labeling. Another thing that often gets people is the labeling often has a serving size of say 2 slices and a weight that does not reflect the actual weight of 2 slices.

Luckily with good tools the significance is reduced, people using closed loop insulin pumps will automatically correct for that. Lots more room to wiggle.

Ekaros 19 hours ago | parent | prev | next [-]

Then it should refuse to answer 100% of the time.

falcor84 19 hours ago | parent | next [-]

I don't think refusal is the right approach. I would much prefer that it respond with something like:

> There is not enough information to make an accurate estimate, but if you'd like, I can take a stab at it. If so, how much effort to put into it?

> Yes, go ahead and spend up to 5mins and $1 to analyze it.

> Done, I've had 100 subagents analyze the image and have arrived at a 95% confidence interval of the portion containing ...

muwtyhg 16 hours ago | parent [-]

I know this is just an example but my eyes kind of bugged out thinking about paying $1 every time I want to estimate the calories in my sandwich.

jaccola 19 hours ago | parent | prev [-]

Indeed, I think any reasonable human might say “A few hundred calories but without measuring the ingredients I might be way off”. I think LLMs could get there, I don’t see anything stopping that. Though they have been notoriously bad at this so far.

dredmorbius 15 hours ago | parent | prev | next [-]

If the problem is so evidently impossible then the LLM itself should recognise this, state that the problem isn't solvable, *not* provide what's certain to be an inaccurate result, and suggest better approaches to arriving at a reasonable answer.

That said, it's notable that diabetes education materials often suggest estimating glycemic loads by rough portion size / plate ratios. Which is to say that absent accurate weight measurements (themselves subject to variations in ingredients, moisture levels, etc.) current clinical recommendations are themselves pretty rough.

Aurornis 18 hours ago | parent | prev | next [-]

That’s exactly the point of this article.

Many of the comments here assume the authors are stupid and were surprised by the result, but the point of the article is to inform readers that AI carb counting apps don’t work. That’s why they did the study.

jeroenhd 19 hours ago | parent | prev | next [-]

It's not even impossible from a technical point of view.

Your cheese sandwich may contain a lot more or a lot fewer calories, even if you take the numbers from the packaging and calculate the correct ratios by weight. The calories on the label are based on an average, and individual packages may contain more or less of any listed nutrient within some margin. Of course, counting calories is meaningless if not done on a long-term scale anyway, but on a long-term scale the LLM doesn't need to guess the correct amount either.

unsupp0rted 19 hours ago | parent | prev | next [-]

And what if that guy in the surveillance video is just 2 kids in a trench coat? There's no way for AI to be sure from the photons: we should scrap it.

ge96 19 hours ago | parent | prev | next [-]

I was thinking that at least a phone with lidar, like recent iPhones, could get the volume, but yeah, the hidden/inner mass is a problem, plus the oil as mentioned.

tsimionescu 19 hours ago | parent | prev | next [-]

This is a bad take. If LLMs are supposed to work as general purpose assistants, as they are being sold as by both the companies making them and by the majority of AI believers, then it is very much a solvable problem. The LLM could give a high level estimate (a sandwich is not going to be 0 Cal, and it's not going to be 5000 Cal, so you can give some kind of range), and then ask for the type of information needed to make a more accurate estimate.

p-e-w 19 hours ago | parent | prev | next [-]

Then the correct answer is “I can’t tell.”

Not “Here’s a random guess that I just pulled out of my ass.”

LLMs have picked up, from scientists, the bad habit of trying to give an answer when no answer can be given; scientists, overall, don't say "I don't know" nearly as often as they should.

jeroenhd 19 hours ago | parent | next [-]

I tried asking LLMs about food before. They all say "I can't tell for certain, but this is an estimate based on the ingredients I can spot/infer/guess".

You need to write a specific prompt to avoid any warnings.

Of course a lot of people don't know what limitations LLMs have, so there's some value to a blog post about it, but it's not as black-and-white as the article might suggest with its graphs.

The prompt (documented here: https://www.diabettech.com/wp-content/uploads/2026/04/Supple...) lists specific instructions and a specific output format that doesn't allow the LLM any room for explanation or warning in processable data (only in notes fields). In fact, the prompt explicitly tells the LLM to ignore visual inferencing for some statistics and to rely on a nutrition authority instead.

Even in that intentionally restricted format, the English language output uses words like "roughly" and "estimated" in the LLMs I've tested.

Sure, if you take the numeric values and plot them in graphs, you get wildly inconsistent results, but that research method intentionally restricts the usefulness and reliability of the LLMs being researched.

What's much more troubling is this line from the preprint:

> The open-source iAPS automated insulin delivery (AID) system now offers food analysis through APIs from OpenAI, Anthropic and Google [8]

The linked app does seem to have a disclaimer, though:

> "AI nutritional estimates are approximations only. Always consult with your healthcare provider for medical decisions. Verify nutritional information whenever possible. Use at your own risk."

Ukv 19 hours ago | parent | prev | next [-]

> Then the correct answer is “I can’t tell.”

From the paper, they're using structured JSON schema mode as opposed to freeform answers, so it can't. Models do typically caveat their answer for questions like this, in my experience.
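For illustration (a hypothetical schema in the spirit of the study's setup, not the actual iAPS prompt): once the output contract requires a numeric carbs field, a refusal simply isn't a valid response, so "I can't tell" gets squeezed out of the answer:

```python
import json

# Hypothetical response contract: each field must be present and of the
# right type. A numeric "carbs_g" leaves no room for a textual refusal.
schema_fields = {
    "carbs_g": (int, float),
    "confidence": (int, float),
    "notes": str,
}

def validate(response_json):
    """Return True iff the JSON response satisfies the contract."""
    data = json.loads(response_json)
    for field, types in schema_fields.items():
        if field not in data or not isinstance(data[field], types):
            return False
    return True

validate('{"carbs_g": 28, "confidence": 0.9, "notes": "rough estimate"}')  # valid
validate('{"carbs_g": "I cannot tell", "confidence": 0.1, "notes": ""}')   # invalid
```

Any hedging the model wants to do has to be crammed into the free-text `notes` field, which downstream code is free to ignore.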

professoretc 18 hours ago | parent [-]

They'll qualify their answers in English but as the article mentions, if your prompt asks for a confidence score, that "uncertainty" doesn't translate into low numerical confidence.

Ukv 17 hours ago | parent [-]

Quantifying their own confidence is also something they're not good at, and which the format would prevent them from refusing to do or preceding with a caveat if that's what you'd want of them. Particularly since the response format seems backwards - giving confidence, then carbs estimate, then observations/notes, rather than being able to base carbs estimate off of observations/notes and then confidence estimate off of both of those.

> They'll qualify their answers in English but [...]

That the default user-facing chat, as a normal user would use it, gives a warning is the key part IMO. I don't think expectations of there being no "wrong way" to use the model can necessarily extend to API usage with a long custom system prompt and a restricted output format.

agentultra 19 hours ago | parent | prev [-]

LLMs have no agency to choose such a course of action.

They're algorithms, and they were designed this way.

badgersnake 17 hours ago | parent | prev | next [-]

Correct. But why doesn’t the AI just say that?

therobots927 19 hours ago | parent | prev | next [-]

Why is the AI answering questions without answers then?

pohl 19 hours ago | parent [-]

could be because, at the end of the day, it's just predicting the next likely token

SirMaster 17 hours ago | parent | next [-]

The next likely tokens for a response to a question that can't possibly be answered from the context should be "I" followed by "don't", followed by "know".

therobots927 18 hours ago | parent | prev [-]

The paper millionaires did NOT like your comment.
