INTRODUCTION: Chat Generative Pre-trained Transformer (ChatGPT) is a recent artificial intelligence-powered chatbot language model that may help otolaryngologists in practice and research. We investigated the accuracy of ChatGPT-3.5 and ChatGPT-4 in providing references for manuscripts published in otolaryngology.
METHODS: ChatGPT-3.5 and ChatGPT-4 were asked to provide references for the 30 most-cited papers in otolaryngology over the past 40 years, including clinical guidelines and key practice-changing studies. The responses were regenerated three times to assess the accuracy and stability of ChatGPT. The two versions were compared for reference accuracy and potential mistakes.
RESULTS: The accuracy of ChatGPT-3.5 and ChatGPT-4 ranged from 47% to 60% and from 73% to 87%, respectively (p < 0.005). ChatGPT-3.5 provided 19 inaccurate references and invented 2 references across the regenerated responses. ChatGPT-4 provided 13 inaccurate references and invented only one. The stability of responses across regenerated answers was mild (k = 0.238) for ChatGPT-3.5 and moderate (k = 0.408) for ChatGPT-4.
CONCLUSIONS: ChatGPT-4 achieved higher accuracy than the free-access version (3.5). False references were detected in both versions. Practitioners should be cautious when using ChatGPT to retrieve key references while writing a report.
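As an illustration of the two quantities reported above (per-run accuracy and agreement across regenerated answers), the following is a minimal sketch. It assumes each reference was scored 1 (accurate) or 0 (inaccurate or invented) in each run, and it assumes Fleiss' kappa as the agreement statistic; the abstract does not state which kappa was used. The function names and the ratings shown are hypothetical, not the study data.

```python
from collections import Counter


def accuracy(ratings):
    """Fraction of references scored as accurate (1) in a single run."""
    return sum(ratings) / len(ratings)


def fleiss_kappa(runs):
    """Fleiss' kappa for several runs scoring the same items.

    runs[r][i] is the category given to item i in run r.
    Assumption: Fleiss' kappa is used here only as one plausible
    choice of agreement statistic.
    """
    n_runs = len(runs)
    n_items = len(runs[0])
    totals = Counter()   # how often each category was assigned overall
    p_items = []         # observed agreement for each item
    for i in range(n_items):
        counts = Counter(run[i] for run in runs)
        totals.update(counts)
        agree = sum(n * (n - 1) for n in counts.values())
        p_items.append(agree / (n_runs * (n_runs - 1)))
    p_bar = sum(p_items) / n_items
    p_exp = sum((t / (n_items * n_runs)) ** 2 for t in totals.values())
    if p_exp == 1.0:
        return 1.0  # all ratings identical: treated as perfect agreement
    return (p_bar - p_exp) / (1 - p_exp)


# Hypothetical scores for 4 references over 3 regenerated answers.
example_runs = [
    [1, 1, 0, 0],
    [1, 0, 0, 1],
    [1, 1, 0, 1],
]
print([accuracy(run) for run in example_runs])
print(round(fleiss_kappa(example_runs), 3))  # 0.314
```

With 30 references per run, an accuracy of 60% would correspond to 18 accurate references out of 30.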
Disciplines :
Otolaryngology
Author, co-author :
Lechien, Jerome R ; Division of Laryngology and Broncho-Esophagology, Department of Otolaryngology-Head Neck Surgery, EpiCURA Hospital, UMONS Research Institute for Health Sciences and Technology, University of Mons (UMons), Mons, Belgium. Jerome.Lechien@umons.ac.be ; Department of Otorhinolaryngology and Head and Neck Surgery, School of Medicine, Phonetics and Phonology Laboratory (UMR 7018, Foch Hospital, CNRS, Université Sorbonne Nouvelle/Paris 3), Paris, France ; Department of Otorhinolaryngology and Head and Neck Surgery, School of Medicine, CHU de Bruxelles, CHU Saint-Pierre, Université Libre de Bruxelles, Brussels, Belgium ; Polyclinique Elsan de Poitiers, Poitiers, France ; Department of Human Anatomy and Experimental Oncology, Faculty of Medicine, UMONS Research Institute for Health Sciences and Technology, Avenue du Champ de Mars, 6, 7000, Mons, Belgium
Briganti, Giovanni ; Université de Mons - UMONS > Faculté de Médecine et de Pharmacie > Service de Médecine computationnelle et Neuropsychiatrie
Vaira, Luigi A ; Maxillofacial Surgery Operative Unit, Department of Medicine, Surgery and Pharmacy, University of Sassari, Sassari, Italy ; Biomedical Sciences Department, PhD School of Biomedical Science, University of Sassari, Sassari, Italy
Language :
English
Title :
Accuracy of ChatGPT-3.5 and -4 in providing scientific references in otolaryngology-head and neck surgery.
Publication date :
April 2024
Journal title :
European Archives of Oto-Rhino-Laryngology
ISSN :
0937-4477
eISSN :
1434-4726
Publisher :
Springer Science and Business Media Deutschland GmbH, Germany