elzbardico 8 hours ago

LLMs are not mythical universal machine-learning models that you can feed any input and have them magically do the same thing a specialized ML model could do.

You can't feed an LLM years of meteorological time-series data and expect it to work as a specialized weather model, and you can't feed it years of medical time-series and expect it to work like a model specifically trained and validated on that kind of data.

An LLM generates a stream of tokens. If you feed it a giant set of CSVs and it was not RL'd to do something useful with them, it will just try to make whatever sense of them it can and generate something that most probably has no strong numerical relationship to your data. It will simulate an analysis; it won't actually perform one.

You may have a giant context window, but attention is sparse: the attention mechanism doesn't see your whole data at the same time. It can do some simple comparisons, like figuring out that if I say my current blood pressure is 210/180 I should call an ER immediately. But once I send it a time-series of my twice-a-day blood-pressure measurements for the last 10 years, it can't make any real sense of it.

Indeed, it would have been better for the author to ask the LLM to generate a Python notebook to do some data analysis on the measurements, then run the notebook and share the result with the doctor.
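
A minimal sketch of what such generated analysis code might look like, assuming a hypothetical bp_readings.csv with date, systolic, and diastolic columns (the file name, schema, and thresholds are illustrative, not from the post):

    import pandas as pd

    # Load a hypothetical blood-pressure log: date, systolic, diastolic.
    df = pd.read_csv("bp_readings.csv", parse_dates=["date"])
    df = df.sort_values("date").set_index("date")

    # Deterministic summary statistics a doctor can actually verify.
    print(df[["systolic", "diastolic"]].describe())

    # 30-day rolling means to smooth out twice-daily noise.
    rolling = df[["systolic", "diastolic"]].rolling("30D").mean()
    print(rolling.tail(1))

    # Flag hypertensive-crisis readings (roughly 180/120 and above).
    crisis = df[(df["systolic"] >= 180) | (df["diastolic"] >= 120)]
    print(f"{len(crisis)} readings at crisis level")

    # Per-year averages to show the long-term trend over the decade.
    print(df.groupby(df.index.year)[["systolic", "diastolic"]].mean().round(1))

The point is that every number in the output comes from deterministic code the doctor can inspect, not from the model's token stream.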

rfw300 7 hours ago | parent | next [-]

This is true as a technical matter, but this isn't a technical blog post! It's a consumer review, and when companies ship consumer products, the people who use them can't be expected to understand failure modes that are not clearly communicated to them. If OpenAI wants regular people to dump their data into ChatGPT for Health, the onus is on them to make it reliable.

themafia 7 hours ago | parent [-]

> the onus is on them to make it reliable.

That is not a plausible outcome given the current technology or any of OpenAI's demonstrated capabilities.

"If Bob's Hacksaw Surgery Center wants to stay in business they have to stop killing patients!"

Perhaps we should just stop him before it goes too far?

vineyardmike 6 hours ago | parent [-]

> That is not a plausible outcome given the current technology or of any of OpenAI's demonstrated capabilities

OpenAI has said that medical advice is one of the biggest use cases they see from users. It should be assumed they're investigating how to build out this product capability.

Google has LLMs fine tuned on medical data. I have a friend who works at a top-tier US medical research university, and the university is regularly working with ML research labs to generate doctor-annotated training data. OpenAI absolutely could be involved in creating such a product using this sort of source.

You can feed an LLM text, pictures, videos, audio, etc., so why not train a model to accept medical time-series data as another modality? Obviously this could have a negative performance impact on a coding model, but it could be valuable for a consumer-oriented chatbot. Or, of course, they could create a dedicated model and tool-call that model.
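
As a rough sketch of the tool-call route, here is what a tool definition in OpenAI's function-calling format could look like; the analyze_bp_series tool and the specialized model behind it are hypothetical:

    # Hypothetical tool: the chat model extracts the series from the
    # conversation and delegates the numerics to a specialized model.
    BP_TOOL = {
        "type": "function",
        "function": {
            "name": "analyze_bp_series",
            "description": "Run a specialized time-series model over "
                           "blood-pressure readings and return trend stats.",
            "parameters": {
                "type": "object",
                "properties": {
                    "timestamps": {"type": "array", "items": {"type": "string"}},
                    "systolic": {"type": "array", "items": {"type": "number"}},
                    "diastolic": {"type": "array", "items": {"type": "number"}},
                },
                "required": ["timestamps", "systolic", "diastolic"],
            },
        },
    }

Passed as tools=[BP_TOOL] in a chat completion request, the LLM would only narrate numbers the dedicated model actually computed, instead of improvising its own.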

elzbardico 5 hours ago | parent [-]

They are going to do the same thing they do with code.

They are going to hire armies of developing-world workers to massage those models in post-training toward some acceptable behaviors, and they will create the appropriate agents with the appropriate tools to produce something that simulates the real thing as plausibly as possible.

Problem is, RLVR (reinforcement learning from verifiable rewards) is cheap with code, where outputs can be checked automatically against tests, but it can get very expensive with human physiology, where there is no cheap automatic verifier.
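
To make the asymmetry concrete, here is a toy version of the kind of verifier RLVR leans on for code; it is illustrative only, not any lab's actual pipeline:

    import os
    import subprocess
    import sys
    import tempfile

    def code_reward(generated_code: str, test_code: str) -> float:
        """Verifiable reward for code: run the tests, score 1.0 on pass.
        Cheap, automatic, and repeatable millions of times."""
        with tempfile.TemporaryDirectory() as tmp:
            path = os.path.join(tmp, "candidate.py")
            with open(path, "w") as f:
                f.write(generated_code + "\n" + test_code)
            result = subprocess.run(
                [sys.executable, path], capture_output=True, timeout=10
            )
            return 1.0 if result.returncode == 0 else 0.0

    # There is no analogous one-liner for medical advice: grading an
    # answer about a patient's physiology takes clinicians or outcome
    # data, which is exactly the expensive part.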

protocolture 6 hours ago | parent | prev [-]

This LLM is advertising itself in a medical capacity. You aren't wrong, but the customer has been fed the wrong set of expectations. It's the fault of the tool's marketing.