freedomben 9 hours ago
> Despite having access to my weight, blood pressure and cholesterol, ChatGPT based much of its negative assessment on an Apple Watch measurement known as VO2 max, the maximum amount of oxygen your body can consume during exercise. Apple says it collects an “estimate” of VO2 max, but the real thing requires a treadmill and a mask. Apple says its cardio fitness measures have been validated, but independent researchers have found those estimates can run low — by an average of 13 percent.

There's plenty of blame to go around, but for at least some of it (such as the above) I think the blame rests more on Apple for misrepresenting the quality of their product (and TFA seems pretty clearly to be blasting OpenAI for this, not others like Apple).

What would you expect the behavior of the AI to be? Should it always assume the data is bad, or potentially bad? If so, that seems like it would defeat the point of having data at all, since you could never draw any conclusions from it. Even disregarding statistical outliers, it's not at all clear which parts of the data are "good" vs. "unreliable," especially when the company that collected the data claims it's good data.
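For a rough sense of what a 13 percent low bias means (illustrative numbers, not from the article): a true VO2 max of 40 ml/kg/min would come back as roughly

    40 * (1 - 0.13) = 34.8 ml/kg/min

which can be the difference between adjacent fitness categories on the charts these tools grade against.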
brandonb 9 hours ago
FWIW, Apple has published validation data showing the Apple Watch's estimate is within 1.2 ml/kg/min of a lab-measured VO2 max. Behind the scenes, it's using a pretty cool algorithm that combines deep learning with physiological ODEs: https://www.empirical.health/blog/how-apple-watch-cardio-fit...
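The linked post describes the general shape of that kind of approach: fit a simple physiological model to the watch's heart-rate data, then map the fitted parameters to a fitness estimate. A toy sketch in that spirit (the exponential-recovery model, the constants, and the final linear mapping below are made-up illustrations, not Apple's actual algorithm):

    import numpy as np
    from scipy.optimize import curve_fit

    # Toy physiological model: after hard exercise stops, heart rate decays
    # exponentially back toward resting heart rate (dHR/dt = -k * (HR - HR_rest)).
    def hr_recovery(t, hr_peak, hr_rest, k):
        return hr_rest + (hr_peak - hr_rest) * np.exp(-k * t)

    # Simulate two minutes of noisy post-exercise heart-rate samples (1 Hz),
    # standing in for what a watch would record.
    t = np.arange(0, 120)
    hr_observed = hr_recovery(t, 165.0, 60.0, 0.03) + np.random.normal(0, 2, t.size)

    # Fit the model's parameters (peak HR, resting HR, recovery rate) to the data.
    params, _ = curve_fit(hr_recovery, t, hr_observed, p0=[160.0, 65.0, 0.02])
    hr_peak_fit, hr_rest_fit, k_fit = params

    # Map the fitted parameters to a VO2-max-like number with a made-up linear
    # formula; a real system would learn this mapping from lab-measured VO2 max.
    vo2max_estimate = 20.0 + 400.0 * k_fit + 0.05 * (hr_peak_fit - hr_rest_fit)
    print(f"Fitted recovery rate k = {k_fit:.3f} /s")
    print(f"Estimated VO2 max: {vo2max_estimate:.1f} ml/kg/min")

In a real system the hand-written mapping at the end would be replaced by a model trained against treadmill-and-mask lab measurements, which is presumably where the deep learning part comes in.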
aeonfox 9 hours ago
> I think the blame more rests on Apple for falsely representing the quality of their product

There was plenty of other concerning stuff in that article, and from a quick read it wasn't suggested or implied that the VO2 max issue was the deciding factor in the original F score the author received. The article suggested many times over that ChatGPT is really not equipped for the task of health diagnosis:

> There was another problem I discovered over time: When I tried asking the same heart longevity-grade question again, suddenly my score went up to a C. I asked again and again, watching the score swing between an F and a B.
jayd16 8 hours ago
Well, if it doesn't know the quality of the data, and especially if it would be dangerous to guess, then it should probably say it doesn't have an answer.
AndrewKemendo 8 hours ago
> Should it always assume bad data or potentially bad data? If so, that seems like it would defeat the point of having data at all as you could never draw any conclusions from it.

Yes. You, and every other reasoning system, should always challenge the data and assume at a minimum that it's biased. In its formal form this is better described as "critical thinking"; you could also call it skepticism.

That impossibility of drawing conclusions assumes there's a correct answer, and is known as the "problem of induction." I promise you a machine is better at avoiding it than a human. Many people freeze up or fail when given too much data: put someone with no experience in front of 500 people to give a speech if you want to watch this happen live.
hmokiguess 9 hours ago
I have been sitting and waiting for the day these trackers get exposed as just another health fad, optimized to deliver shareholder value and not serious enough for medical-grade applications.
miltonlost 9 hours ago
> What would you expect the behavior of the AI to be? Should it always assume bad data or potentially bad data? If so, that seems like it would defeat the point of having data at all as you could never draw any conclusions from it.

Well, I would expect the AI to provide the same response a real doctor would from the same information, which the article showed the doctors were able to do. I would also expect the AI to give the same answer every time for the same data, unlike what it did (swinging from an F to a B over multiple attempts in the article).

OpenAI is entirely to blame here when they put out faulty products (hallucinations even on accurate data are their fault).
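That kind of consistency is also easy to test mechanically. A minimal sketch, assuming the standard OpenAI Python client, a placeholder prompt, and a hypothetical model choice, that re-asks the same question and tallies the letter grades that come back:

    import re
    from collections import Counter
    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    # Placeholder prompt; imagine the exported health data pasted in here.
    PROMPT = "Given this health data, grade my heart longevity from A to F: ..."

    grades = []
    for _ in range(10):
        resp = client.chat.completions.create(
            model="gpt-4o",      # hypothetical model choice
            temperature=0,       # ask for the most deterministic output
            messages=[{"role": "user", "content": PROMPT}],
        )
        text = resp.choices[0].message.content
        match = re.search(r"\b([A-F])\b", text)  # pull out a lone letter grade
        grades.append(match.group(1) if match else "unparsed")

    # Identical input should produce identical grades; tally what actually came back.
    print(Counter(grades))

Even with temperature set to 0, responses from these APIs aren't guaranteed to be identical across calls, so some drift on the same data is unsurprising; it's also exactly the kind of behavior a health-grading product shouldn't exhibit.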