September 5, 2025
Large language models (LLMs) are advanced AI systems trained on vast datasets to interpret and generate natural language. In this study, researchers evaluated whether LLMs could determine the final diagnosis for UDN patients using available clinical information and compared their performance with historical clinical reviews.
Vanderbilt University Medical Center (VUMC) is a founding member of the NIH-supported Undiagnosed Diseases Network (UDN), created in 2014 to advance the diagnosis and treatment of patients with unexplained conditions. In 2023, the Potocsnak family’s generous gift led to the establishment of the Potocsnak Center for Undiagnosed and Rare Disorders at VUMC.
“Patients referred to the UDN and the Potocsnak Center are often the most difficult to diagnose,” said study corresponding author Rizwan Hamid, MD, PhD, the Dorothy Overall Wells Professor of Pediatrics and director of the Potocsnak Center. “For many with rare disorders, the ‘diagnostic odyssey’ — the time from symptom onset to a confirmed diagnosis — can extend beyond 10 years.”
The study analyzed 90 VUMC UDN cases diagnosed between November 2016 and April 2024. The median age at symptom onset was 7.6 months, while the median diagnostic odyssey lasted 7.6 years.
Researchers used two LLMs — ChatGPT (version 4o) and Llama 3.1 8B — prompting each to generate a differential diagnosis, or a list of potential conditions matching a patient’s standardized UDN clinical summary (which includes clinical history, family history, and prior evaluations).
The differential diagnoses were evaluated by Thomas Cassini, MD, assistant professor of Pediatrics and associate director of the Potocsnak Center, and Rory Tinker, MD, pediatrics resident. Kevin Byram, MD, associate professor of Medicine, adjudicated any disagreements.
The results showed diagnostic rates of 13.3% for ChatGPT and 10.0% for Llama, compared to 5.6% for historical clinical review. Helpful diagnoses were provided in 23.3% of cases by ChatGPT and 16.7% by Llama. Both models also suggested next steps for evaluating possible conditions. The cost and processing time per case were $0.03 and 5 seconds for ChatGPT, and $0 and 120 seconds for Llama.
These findings highlight that LLMs can support clinicians by generating initial differentials and guiding further diagnostic workup.
“Our study is the first to evaluate LLM diagnostic performance within the UDN,” said Cathy Shyr, PhD, first author of the study. “It contributes to the growing evidence base for LLMs in clinical applications.”
The authors emphasized that prospective studies are needed to fully assess clinical impact. “These AI tools may help shorten the diagnostic odyssey for patients with undiagnosed and rare disorders,” Hamid added.