Enhancing Coronary Revascularization Decisions: The Promising Role of Large Language Models as a Decision-Support Tool for Multidisciplinary Heart Team

DOI: 10.1161/circinterventions.124.014201 ISSN: 1941-7640

Karin Sudri, Iris Motro-Feingold, Roni Ramon-Gonen, Noam Barda, Eyal Klang, Paul Fefer, Sergei Amunts, Zachi Itzhak Attia, Mohamad Alkhouli, Amitai Segev, Michal Cohen-Shelly, Israel Moshe Barbash

BACKGROUND:

While clinical practice guidelines advocate for multidisciplinary heart team (MDHT) discussions in coronary revascularization, variability in implementation across health care settings remains a challenge. Large language models such as ChatGPT could potentially address this variability by offering decision-making support in diverse health care environments. Our study aims to critically evaluate the concordance between recommendations made by an MDHT and those generated by large language models in coronary revascularization decision-making.

METHODS:

From March 2023 to July 2023, consecutive coronary angiography cases (n=86) that were referred for revascularization (either percutaneous or surgical) were analyzed using both ChatGPT-3.5 and ChatGPT-4. Case presentations included demographics, medical background, a detailed description of angiographic findings, and SYNTAX score (Synergy Between Percutaneous Coronary Intervention With Taxus and Cardiac Surgery; I and II), and were provided to the models in 3 different formats. The recommendations of the models were compared with those of an MDHT.

RESULTS:

ChatGPT-4 showed high concordance with decisions made by the MDHT (accuracy 0.82, sensitivity 0.8, specificity 0.83, and kappa 0.59), while ChatGPT-3.5 showed lower concordance (0.67, 0.27, 0.84, and 0.12, respectively). Entropy and Fleiss' kappa of ChatGPT-4 were 0.09 and 0.9, respectively, indicating high reliability and repeatability. The best agreement between ChatGPT-4 and the MDHT was achieved when clinical cases were presented in a detailed context. ChatGPT-4 yielded high accuracy (>0.9) in specific patient subgroups, including those with left main disease, 3-vessel disease, or diabetes.
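For readers less familiar with the agreement measures reported above, the following is a minimal sketch of how accuracy, sensitivity, specificity, and a two-rater kappa can be computed from paired recommendations, along with the Shannon entropy of repeated model responses. It is illustrative only and is not the study's analysis code. The function names, the choice of "CABG" as the positive class, the toy labels, and the entropy convention (base 2, computed per case) are all assumptions; the abstract does not specify any of them.

```python
from collections import Counter
from math import log2


def concordance(reference, predicted, positive):
    """Agreement between two equal-length lists of binary recommendations.

    `reference` plays the role of the MDHT decision and `predicted` the
    model's recommendation. `positive` is whichever label is treated as
    the positive class (an assumption; the abstract does not say).
    Raises ZeroDivisionError if a class is absent or agreement by chance is 1.
    """
    pairs = list(zip(reference, predicted))
    n = len(pairs)
    tp = sum(r == positive and p == positive for r, p in pairs)
    tn = sum(r != positive and p != positive for r, p in pairs)
    fp = sum(r != positive and p == positive for r, p in pairs)
    fn = sum(r == positive and p != positive for r, p in pairs)

    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)

    # Cohen's kappa: observed agreement corrected for chance agreement,
    # where chance agreement is derived from each rater's marginal totals.
    p_chance = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / n**2
    kappa = (accuracy - p_chance) / (1 - p_chance)

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "kappa": kappa,
    }


def response_entropy(responses):
    """Shannon entropy (bits) of repeated answers to the same case.

    0 means the model gave the same answer every time; 1 is the maximum
    for two equally frequent answers.
    """
    total = len(responses)
    counts = Counter(responses)
    return sum(-(c / total) * log2(c / total) for c in counts.values())


if __name__ == "__main__":
    # Hypothetical toy labels, not data from the study.
    mdht = ["CABG", "CABG", "PCI", "PCI", "PCI", "CABG", "PCI", "PCI"]
    model = ["CABG", "PCI", "PCI", "PCI", "CABG", "CABG", "PCI", "PCI"]
    print(concordance(mdht, model, positive="CABG"))
    print(response_entropy(["PCI", "PCI", "CABG", "PCI"]))
```

Kappa is reported alongside accuracy because accuracy alone can look high when one recommendation dominates the case mix; kappa discounts the agreement expected by chance. Fleiss' kappa, which the abstract uses for repeatability, generalizes the same idea to more than two ratings per case and is not shown here.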

CONCLUSIONS:

The present study demonstrates that advanced large language models like ChatGPT-4 may be able to predict clinical recommendations for coronary artery disease revascularization with reasonable accuracy, especially in specific patient groups, underscoring their potential role as a supportive tool in clinical decision-making.
