dahart 18 hours ago

And what’s the variance & accuracy of their responses? Isn’t comparing the models’ variance to baseline human variance what matters here? It seems like they didn’t do that, and I agree with parent’s call for that kind of baseline.

Having counted calories for years, I don’t think I could reliably estimate the calories or carbs in the example picture of a cheese sandwich. I can make assumptions about the bread and the cheese, but I might easily be off by 2-3x. Calorie counting apps that use text descriptions also have huge variance for the same thing. The problem might be the belief that a picture or description is enough, regardless of who or what is guessing…?

Edit: Ah, I see from the sibling thread that you meant the commercial services are LLMs; I thought you meant there were human-backed services to compare to. Anyway, I totally agree there’s a problem if people rely on AI for safety, but I’m not sure LLMs are the core issue here. It seems like using vague information and guessing is the core issue.

swiftcoder 18 hours ago | parent [-]

> Isn’t comparing the models’ variance to baseline human variance what matters here?

You seem to be missing the context that this isn't just about diet apps - this is about apps claiming to track carbs accurately enough to be used in a medical context to dose insulin (a substance which can be lethal if incorrectly dosed).

dahart 17 hours ago | parent [-]

No, I understand apps are making dubious claims and implications; obviously claiming LLMs can accurately estimate carbs from a photo is just wrong. But that doesn’t necessarily change my question. Should people use photos to estimate carbs? Can people looking at photos do any better?

The presence of variance in the LLM output doesn’t actually prove anything; in fact, I would expect and hope for variance when confidence is less than 1.0. I’m more curious about the accuracy of the mean of guesses for different models, for example.
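To make the variance-vs-accuracy distinction concrete, here is a rough Python sketch. Every number in it is invented for illustration (the "true" carb value and both sets of guesses are hypothetical, not taken from the article or the paper). It only shows that spread and error of the mean are separate measurements: one estimator can be tightly clustered and still far off, while another can be noisy but centered on the right value.

```python
from statistics import mean, stdev


def summarize(guesses, true_value):
    """Return (bias, spread): error of the mean guess, and sample standard deviation."""
    return mean(guesses) - true_value, stdev(guesses)


# Hypothetical ground truth for one meal, in grams of carbs (made up).
TRUE_CARBS_G = 40.0

# Made-up repeated guesses from two hypothetical estimators.
tight_but_biased = [59.0, 60.0, 61.0, 60.0, 60.0]    # low variance, wrong on average
noisy_but_centered = [25.0, 55.0, 40.0, 30.0, 50.0]  # high variance, right on average

for name, guesses in [
    ("tight_but_biased", tight_but_biased),
    ("noisy_but_centered", noisy_but_centered),
]:
    bias, spread = summarize(guesses, TRUE_CARBS_G)
    print(f"{name}: bias={bias:+.1f} g, spread={spread:.1f} g")
```

Looking only at the spread would rank the first estimator as better, which is why I would want both numbers reported per model, next to the same two numbers for humans guessing from the same photos.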

But should any diabetic expect photos to be reliable, regardless of whether it’s an app or an LLM or a human? I know some diabetics, and the people I know do not rely on photos for their safety. They don’t even rely on food labels (which are far more accurate than photos); they measure their insulin.

It’s probably useful to raise awareness, and useful to scare app makers away from making bogus medical claims - products and scams that make bogus medical claims are of course as old as history. But we can still hold the studies and PR around this up to high standards, right? Even assuming this article & the paper behind it are right, there are reasonable questions here about how to demonstrate the problem and what the baselines are.

It’s worth keeping in mind that trying to prove the bogus apps wrong with a flawed methodology, questionable reasoning, or just an overly heavy-handed style can cause backlash and do damage to the cause. We’re already seeing that effect play out with respect to vaccinations.