adastra22 a day ago

LLMs perform better than doctors in a randomized trial:

https://jamanetwork.com/journals/jamanetworkopen/fullarticle...

And here: https://arxiv.org/html/2503.10486v1

globular-toast a day ago | parent [-]

> the use of an LLM did not significantly enhance diagnostic reasoning performance compared with the availability of only conventional resources.

The other one isn't peer reviewed. Your précis doesn't appear to be warranted.

adastra22 a day ago | parent [-]

You only read the first line of the summary. This is the juicy bit:

> The LLM alone scored 16 percentage points (95% CI, 2-30 percentage points; P = .03) higher than the conventional resources group.

Basically, they set up the experiment with a control group and an LLM-assisted group. There was no difference between the two groups, and that is what was reported in the top-level finding that you quote.

Then they went back and said “wait, what if we just blindly trusted the LLM? What if we had a third group that had no doctor involved — just let the LLM do the diagnosis?” This retroactively synthesized group did significantly better than either of the actual experimental groups:

> The LLM alone scored 16 percentage points (95% CI, 2-30 percentage points; P = .03) higher than the conventional resources group … The LLM alone demonstrated higher performance than both physician groups, indicating the need for technology and workforce development to realize the potential of physician-artificial intelligence collaboration in clinical practice.