atleastoptimal 3 days ago
A process is described here: https://arxiv.org/pdf/2506.22405

> A physician or AI begins with a short case abstract and must iteratively request additional details from a gatekeeper model that reveals findings only when explicitly queried. Performance is assessed not just by diagnostic accuracy but also by the cost of physician visits and tests performed.
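For what it's worth, here's a rough sketch of what that loop looks like. This is not the paper's actual code or API; the diagnostician/gatekeeper interfaces, method names, and dollar costs are all placeholders of my own:

    # Rough sketch of the sequential-diagnosis loop described above.
    # The `diagnostician` and `gatekeeper` objects, their method names,
    # and the dollar figures are assumptions for illustration only.

    COST_PER_VISIT = 300                                       # assumed visit cost
    TEST_COSTS = {"cbc": 50, "chest_xray": 120, "mri": 1500}   # assumed test costs

    def run_case(case_abstract, diagnostician, gatekeeper, max_turns=20):
        """Query the gatekeeper iteratively, then commit to a diagnosis."""
        transcript = [("abstract", case_abstract)]
        total_cost = COST_PER_VISIT
        for _ in range(max_turns):
            # The diagnostician asks a question, orders a test, or diagnoses.
            action = diagnostician.next_action(transcript)
            if action.kind == "diagnose":
                return action.diagnosis, total_cost
            if action.kind == "order_test":
                total_cost += TEST_COSTS.get(action.test, 100)
            # The gatekeeper reveals only findings that were explicitly requested.
            finding = gatekeeper.reveal(action)
            transcript.append((action, finding))
        return None, total_cost  # no diagnosis committed within the turn budget

The scoring the paper describes then combines diagnostic accuracy with the accumulated cost, so an agent that orders every test isn't rewarded for it.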
apical_dendrite 2 days ago
I believe that dataset was built from cases selected precisely because they were unusual enough for physicians to submit to the New England Journal of Medicine. The real-world diagnostic accuracy of physicians on these cases was 100%: the hospital figured out the diagnosis and wrote it up.

In the real world these cases are solved by a team of human doctors working together and consulting different specialists. Comparing the model's results to those of a single human physician - particularly when all the irrelevant details have been stripped away and you're left with a clean case report - isn't really reflective of how medicine works in practice. They're also not the kind of situations you as a patient are likely to experience, and your doctor probably sees them rarely, if ever.
sorcerer-mar 3 days ago
Okay, you have a point. AI probably would do really well when short case abstracts start walking into clinics.