An evaluation of ‘ChatGPT’ Compared to Dermatological Surgeons’ Choice of Reconstruction of Mohs Surgical Defects

DOI: 10.1093/ced/llae184 ISSN: 0307-6938

Adrian Cuellar-Barboza, Elizabeth Brussolo-Marroquín, Fanny C Cordero-Martinez, Patrizia E Aguilar-Calderon, Osvaldo Vazquez-Martinez, Jorge Ocampo-Candiani

Abstract

Background

ChatGPT® (OpenAI; California, USA) is an open-access chatbot developed using artificial intelligence (AI) that generates human-like responses.

Objective

To evaluate ChatGPT-4’s concordance with three dermatologic surgeons on the choice of reconstruction for dermatological surgical defects.

Methods

A total of 70 cases of non-melanoma skin cancer treated with surgery were obtained from clinical records for analysis. The main authors designed a list of 30 reconstruction options, which included primary closure, secondary skin closure, skin flaps and skin grafts. Three blinded dermatologic surgeons, along with ChatGPT-4, were asked to select two reconstruction options from the list.

Results

Seventy responses were analyzed using Cohen’s kappa (κ) to assess concordance between each dermatologic surgeon and ChatGPT-4. Agreement among the dermatologic surgeons was higher than agreement between the surgeons and ChatGPT-4, highlighting differences in decision-making. For the first-choice reconstruction technique, agreement among the dermatologic surgeons was fair, with κ values ranging from 0.268 to 0.331, whereas concordance between ChatGPT-4 and the surgeons was slight, with κ values from 0.107 to 0.121. For the second-choice options, the dermatologic surgeons showed slight agreement, whereas concordance between ChatGPT-4 and the surgeons was below chance.
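For readers unfamiliar with the statistic, the following is a minimal sketch of how Cohen's kappa can be computed for one pair of raters. It is not the authors' analysis code. The function names, the example option labels and the example choices are hypothetical, and the verbal bands assume the commonly used Landis and Koch scale, which matches the terms "slight" and "fair" used above but is not named in the abstract.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who each label the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of cases")
    n = len(rater_a)
    # Observed agreement: share of cases where both raters chose the same option.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal frequencies, summed over options.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_e = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if p_e == 1.0:
        raise ValueError("kappa is undefined when both raters always pick the same single option")
    return (p_o - p_e) / (1 - p_e)


def agreement_label(kappa):
    """Verbal band for kappa (assumed Landis and Koch cut-offs)."""
    if kappa < 0:
        return "below chance"
    for upper, label in [(0.20, "slight"), (0.40, "fair"), (0.60, "moderate"),
                         (0.80, "substantial"), (1.00, "almost perfect")]:
        if kappa <= upper:
            return label
    raise ValueError("kappa cannot exceed 1")


# Hypothetical first-choice selections for eight cases (not data from the study).
surgeon = ["flap", "flap", "graft", "primary", "graft", "flap", "primary", "primary"]
chatbot = ["flap", "graft", "graft", "primary", "flap", "primary", "primary", "flap"]

kappa = cohen_kappa(surgeon, chatbot)
print(f"kappa = {kappa:.3f} ({agreement_label(kappa)})")
```

In this made-up example the two raters agree on 4 of 8 cases, but about 34% agreement would be expected by chance alone, which is why kappa (about 0.24) is much lower than the raw agreement rate.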

Conclusions

As anticipated, this study reveals variability in medical decisions between dermatologic surgeons and ChatGPT. Although these tools offer exciting possibilities for the future, it is vital to acknowledge the risk of inadvertently relying on non-certified AI for medical advice.
