globular-toast a day ago

Honestly it just sounds like you've been sold on "AI" being a thing and don't have any idea how any of it works. I don't even know what you're referring to with "more accurate than doctors". Classifying scans or something? Do you realise how different that is to generative LLMs writing code etc? Scan classification may well have been shown to be more accurate, but generative LLMs have never been shown to be "better" than humans and in fact it's easy to demonstrate they are much, much worse in many ways.

adastra22 a day ago | parent [-]

LLMs perform better than doctors in a randomized trial:

https://jamanetwork.com/journals/jamanetworkopen/fullarticle...

And here: https://arxiv.org/html/2503.10486v1

globular-toast a day ago | parent [-]

> the use of an LLM did not significantly enhance diagnostic reasoning performance compared with the availability of only conventional resources.

The other one isn't peer reviewed. Your précis doesn't appear to be warranted.

adastra22 a day ago | parent [-]

You only read the first line of the summary. This is the juicy bit:

> The LLM alone scored 16 percentage points (95% CI, 2-30 percentage points; P = .03) higher than the conventional resources group.

Basically, they set up the experiment with a control group and an LLM-assisted group. There was no significant difference between the two groups, and that is what was reported in the top-level finding that you quote.

Then they went back and said “wait, what if we just blindly trusted the LLM? What if we had a third group that had no doctor involved — just let the LLM do the diagnosis?” This retroactively synthesized group did significantly better than either of the actual experimental groups:

> The LLM alone scored 16 percentage points (95% CI, 2-30 percentage points; P = .03) higher than the conventional resources group … The LLM alone demonstrated higher performance than both physician groups, indicating the need for technology and workforce development to realize the potential of physician-artificial intelligence collaboration in clinical practice.
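As a rough sanity check that the quoted interval and P value hang together, here is a minimal sketch that backs out a standard error from the 95% CI and recomputes a two-sided P value. It assumes a plain normal approximation (critical value 1.96) and treats the rounded figures in the quote as exact; the thread doesn't say which model the paper actually used, so this is only an illustration, and the function name is made up for the example.

```python
import math

def p_from_ci(estimate, lower, upper, z_crit=1.96):
    """Approximate two-sided P value from a point estimate and its 95% CI.

    Assumption: the interval is symmetric and based on a normal
    approximation, which may not match the paper's actual analysis.
    """
    # Half the CI width equals z_crit standard errors.
    se = (upper - lower) / (2 * z_crit)
    # Test statistic against a null difference of zero.
    z = estimate / se
    # Two-sided tail probability of the standard normal.
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p

# Figures quoted above: 16 percentage points, 95% CI 2 to 30.
se, z, p = p_from_ci(16, 2, 30)
print(f"SE ~ {se:.2f} points, z ~ {z:.2f}, two-sided p ~ {p:.3f}")
```

Under those assumptions the result lands close to the reported P = .03, and the lower bound of 2 points shows the interval only just excludes zero.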