JAMA Netw Open: Syndromic Analysis of Sepsis Cohorts Using Large Language Models

Abstract

Importance: Presenting signs and symptoms affect the care of patients with possible sepsis. However, signs and symptoms are not incorporated into most large observational studies because they are difficult to extract from clinical notes at scale.

Objective: To assess the use of large language models (LLMs) to extract presenting signs and symptoms from admission notes and characterize their associations with infectious diagnoses, multidrug-resistant infections, and mortality.

Design, setting, and participants: This retrospective cohort study obtained data from 5 Massachusetts hospitals within 1 health care system between June 1, 2015, and August 1, 2022. Participants were hospitalized adult patients with possible infection (determined by blood culture drawn and intravenous antibiotics administered within 24 hours of arrival). An LLM (LLaMA 3 8B; Meta) was used to extract up to 10 presenting signs and symptoms from each patient’s history-and-physical admission notes. LLM-generated labels were validated by blinded review of 303 random admission notes. Data analyses were performed from July 2023 to August 2025.
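
The "possible infection" inclusion rule above can be written as a small predicate. This is only a sketch: the argument names are hypothetical, and it assumes that "within 24 hours of arrival" means at or after arrival and no more than 24 hours later, which the abstract does not spell out.

```python
from datetime import datetime, timedelta
from typing import Optional

# Inclusion window stated in the abstract: 24 hours from arrival.
WINDOW = timedelta(hours=24)


def within_window(arrival: datetime, event: Optional[datetime]) -> bool:
    """True if the event was recorded at or after arrival and within 24 hours of it.

    Assumption: events recorded before arrival do not count.
    """
    if event is None:
        return False
    return timedelta(0) <= event - arrival <= WINDOW


def is_possible_infection(
    arrival: datetime,
    blood_culture_at: Optional[datetime],
    iv_antibiotic_at: Optional[datetime],
) -> bool:
    """Both a blood culture draw and IV antibiotics must fall inside the window."""
    return within_window(arrival, blood_culture_at) and within_window(
        arrival, iv_antibiotic_at
    )


# Hypothetical encounter: culture at 1 hour, antibiotics at 4 hours after arrival.
arrival = datetime(2020, 1, 1, 8, 0)
print(is_possible_infection(arrival, datetime(2020, 1, 1, 9, 0), datetime(2020, 1, 1, 12, 0)))
```

A patient missing either event, or with either event outside the window, would be excluded under this reading.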

Exposures: Thirty most common signs and symptoms were retained as exposures, and unsupervised clustering was used to create syndromes, which were compared with infection sources derived from the International Statistical Classification of Diseases, Tenth Revision, Clinical Modification discharge codes.
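
To make the exposure step concrete, the sketch below keeps the most frequent labels and groups them by how often they co-occur in the same note. The abstract does not say which clustering algorithm was used, so average-linkage agglomerative clustering on Jaccard distance is an assumption chosen for illustration, and the toy notes, function names, and threshold are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def top_symptoms(notes, n):
    """Return the n most frequent labels; each note is a set of extracted labels."""
    counts = Counter(label for note in notes for label in note)
    return [label for label, _ in counts.most_common(n)]


def jaccard_distance(a, b, notes):
    """1 - (notes containing both labels) / (notes containing either label)."""
    both = sum(1 for note in notes if a in note and b in note)
    either = sum(1 for note in notes if a in note or b in note)
    return 1.0 - (both / either if either else 0.0)


def cluster(labels, notes, threshold):
    """Average-linkage agglomerative clustering.

    Repeatedly merges the two closest clusters and stops once the closest
    pair is farther apart than `threshold`.
    """
    dist = {
        frozenset(pair): jaccard_distance(pair[0], pair[1], notes)
        for pair in combinations(labels, 2)
    }

    def linkage(c1, c2):
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    clusters = [[label] for label in labels]
    while len(clusters) > 1:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        if linkage(clusters[i], clusters[j]) > threshold:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters


# Toy data (hypothetical): six notes, five distinct labels.
notes = [
    {"cough", "dyspnea", "fever"},
    {"cough", "dyspnea"},
    {"dysuria", "flank pain"},
    {"dysuria", "flank pain", "fever"},
    {"cough", "dyspnea"},
    {"dysuria", "flank pain"},
]
labels = top_symptoms(notes, 4)  # the study kept 30; 4 is enough for the toy data
print(cluster(labels, notes, threshold=0.5))
```

On the toy data the respiratory labels and the urinary labels end up in separate groups, which is the kind of syndrome-level structure the study then compared against ICD-10-CM infection sources.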

Main outcomes and measures: Outcomes included positive cultures for methicillin-resistant Staphylococcus aureus (MRSA), positive cultures for multidrug-resistant gram-negative (MDRGN) organisms, and in-hospital mortality. Multivariable logistic regression was used to adjust for demographics, comorbidities, physiologic markers of severity of illness, and time to antibiotics.
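
An adjusted odds ratio from a logistic regression is the exponentiated coefficient of the exposure, and its confidence interval is usually built on the log-odds scale. The sketch below shows that conversion in both directions. It assumes a Wald interval (coefficient plus or minus 1.96 standard errors), which the abstract does not state, and the function names are hypothetical. The example figures are the MRSA association for skin and soft tissue symptoms reported in the Results (AOR, 1.73; 95% CI, 1.49-2.00).

```python
import math


def odds_ratio_with_ci(beta: float, se: float, z: float = 1.96):
    """Convert a log-odds coefficient and its standard error to (OR, lower, upper)."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)


def coefficient_from_reported(odds_ratio: float, lower: float, upper: float, z: float = 1.96):
    """Back out the coefficient and standard error from a reported OR and CI.

    Assumption: the interval is symmetric on the log scale (Wald interval).
    """
    beta = math.log(odds_ratio)
    se = (math.log(upper) - math.log(lower)) / (2 * z)
    return beta, se


# Reported association of skin and soft tissue symptoms with MRSA positivity.
beta, se = coefficient_from_reported(1.73, 1.49, 2.00)
print(round(beta, 3), round(se, 3))
print(tuple(round(v, 2) for v in odds_ratio_with_ci(beta, se)))
```

Because the reported interval is rounded to two decimals, the recovered standard error is approximate, so the round trip reproduces the published interval only to about that precision.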

Results: Among the 104 248 patients (median [IQR] age, 66 [52-78] years; 54 137 males [51.9%]) included, 23 619 (22.7%) had sepsis without shock, 25 990 (24.9%) had septic shock, and 94 913 (91.0%) had 1 or more admission note within 24 hours. The LLM labeled the notes of 93 674 of 94 913 patients (98.7%). On manual validation, LLM labels had an accuracy of 99.3% (95% CI, 99.2%-99.3%), balanced accuracy of 84.6% (95% CI, 83.5%-85.8%), positive predictive value of 68.4% (95% CI, 66.0%-70.7%), sensitivity of 69.7% (95% CI, 67.3%-72.0%), and specificity of 99.6% (95% CI, 99.6%-99.6%) compared with the physician medical record reviewer. The 30 most common signs and symptoms were clustered into syndromes that correlated with infection sources. Presence of skin and soft tissue symptoms (adjusted odds ratio [AOR], 1.73; 95% CI, 1.49-2.00) and absence of gastrointestinal (AOR, 0.63; 95% CI, 0.54-0.73) or urinary tract symptoms (AOR, 0.34; 95% CI, 0.22-0.50) were associated with MRSA culture positivity; inverse associations were seen for MDRGN organisms. Cardiopulmonary symptoms were associated with increased mortality (AOR, 1.30; 95% CI, 1.17-1.45).
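
The validation metrics above all derive from one confusion matrix of LLM labels against the physician reviewer. The sketch below computes them from hypothetical counts (the abstract does not report the underlying counts) and checks one relationship that can be verified from the published figures: balanced accuracy is the mean of sensitivity and specificity.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Standard binary-classification metrics from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": tp / (tp + fp),
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }


# Hypothetical counts, chosen only to mimic a setting where negatives dominate.
for name, value in classification_metrics(tp=70, fp=30, fn=30, tn=9870).items():
    print(f"{name}: {value:.3f}")

# Consistency check on the reported (rounded) sensitivity and specificity.
reported_balanced = (0.697 + 0.996) / 2
print(f"mean of reported sensitivity and specificity: {reported_balanced:.4f}")
```

The mean of the reported sensitivity (69.7%) and specificity (99.6%) is about 84.65%, in line with the reported balanced accuracy of 84.6% given rounding of the inputs. The hypothetical counts also illustrate how overall accuracy can exceed 99% while sensitivity and positive predictive value stay near 70%: when most candidate labels are absent from most notes, true negatives dominate the accuracy figure. That is probably why the abstract reports balanced accuracy alongside it.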

Conclusions and relevance: This cohort study found that an LLM accurately extracted presenting signs and symptoms from admission notes that clustered into syndromes differentially correlated with infection sources, multidrug-resistant infections, and mortality. Further research is warranted to evaluate the value of large-scale sign-and-symptom data in models of antibiotic choice, effectiveness, and outcomes in patients with possible sepsis.

Original article by xujunzju. If you repost it, please credit the source: https://zyicu.cn/?p=21168
