TY - JOUR
T1 - Accuracy of ChatGPT-Generated Information on Head and Neck and Oromaxillofacial Surgery: A Multicenter Collaborative Analysis
AU - Vaira, Luigi Angelo
AU - Lechien, Jerome R.
AU - Abbate, Vincenzo
AU - Allevi, Fabiana
AU - Audino, Giovanni
AU - Beltramini, Giada Anna
AU - Bergonzani, Michela
AU - Bolzoni, Alessandro
AU - Committeri, Umberto
AU - Crimi, Salvatore
AU - Gabriele, Guido
AU - Lonardi, Fabio
AU - Maglitto, Fabio
AU - Petrocelli, Marzia
AU - Pucci, Resi
AU - Saponaro, Gianmarco
AU - Tel, Alessandro
AU - Vellone, Valentino
AU - Chiesa-Estomba, Carlos Miguel
AU - Boscolo-Rizzo, Paolo
AU - Salzano, Giovanni
AU - De Riu, Giacomo
PY - 2024
Y1 - 2024
N2 - Objective: To investigate the accuracy of Chat-Based Generative Pre-trained Transformer (ChatGPT) in answering questions and solving clinical scenarios of head and neck surgery. Study Design: Observational and valuative study. Setting: Eighteen surgeons from 14 Italian head and neck surgery units. Methods: A total of 144 clinical questions encompassing different subspecialities of head and neck surgery and 15 comprehensive clinical scenarios were developed. Questions and scenarios were inputted into ChatGPT4, and the resulting answers were evaluated by the researchers using accuracy (range 1-6), completeness (range 1-3), and references' quality Likert scales. Results: The overall median score of open-ended questions was 6 (interquartile range [IQR]: 5-6) for accuracy and 3 (IQR: 2-3) for completeness. Overall, the reviewers rated the answer as entirely or nearly entirely correct in 87.2% of cases and as comprehensive and covering all aspects of the question in 73% of cases. The artificial intelligence (AI) model achieved a correct response in 84.7% of the closed-ended questions (11 wrong answers). As for the clinical scenarios, ChatGPT provided a fully or nearly fully correct diagnosis in 81.7% of cases. The proposed diagnostic or therapeutic procedure was judged to be complete in 56.7% of cases. The overall quality of the bibliographic references was poor, and sources were nonexistent in 46.4% of the cases. Conclusion: The results generally demonstrate a good level of accuracy in the AI's answers. The AI's ability to resolve complex clinical scenarios is promising, but it still falls short of being considered a reliable support for the decision-making process of specialists in head-neck surgery.
AB - Objective: To investigate the accuracy of Chat-Based Generative Pre-trained Transformer (ChatGPT) in answering questions and solving clinical scenarios of head and neck surgery. Study Design: Observational and valuative study. Setting: Eighteen surgeons from 14 Italian head and neck surgery units. Methods: A total of 144 clinical questions encompassing different subspecialities of head and neck surgery and 15 comprehensive clinical scenarios were developed. Questions and scenarios were inputted into ChatGPT4, and the resulting answers were evaluated by the researchers using accuracy (range 1-6), completeness (range 1-3), and references' quality Likert scales. Results: The overall median score of open-ended questions was 6 (interquartile range [IQR]: 5-6) for accuracy and 3 (IQR: 2-3) for completeness. Overall, the reviewers rated the answer as entirely or nearly entirely correct in 87.2% of cases and as comprehensive and covering all aspects of the question in 73% of cases. The artificial intelligence (AI) model achieved a correct response in 84.7% of the closed-ended questions (11 wrong answers). As for the clinical scenarios, ChatGPT provided a fully or nearly fully correct diagnosis in 81.7% of cases. The proposed diagnostic or therapeutic procedure was judged to be complete in 56.7% of cases. The overall quality of the bibliographic references was poor, and sources were nonexistent in 46.4% of the cases. Conclusion: The results generally demonstrate a good level of accuracy in the AI's answers. The AI's ability to resolve complex clinical scenarios is promising, but it still falls short of being considered a reliable support for the decision-making process of specialists in head-neck surgery.
KW - ChatGPT
KW - otorhinolaryngology
KW - maxillofacial surgery
KW - artificial intelligence
UR - http://hdl.handle.net/10807/303039
U2 - 10.1002/ohn.489
DO - 10.1002/ohn.489
M3 - Article
SN - 0194-5998
VL - 170
SP - 1492
EP - 1503
JO - Otolaryngology - Head and Neck Surgery
JF - Otolaryngology - Head and Neck Surgery
ER -