Evaluating the Potential of AI Chatbots in Treatment Decision-making for Acquired Bilateral Vocal Fold Paralysis in Adults.

Dronkers, Emilie A C; Geneid, Ahmed; Al Yaghchi, Chadwan; Lechien, Jérome

doi:10.1016/j.jvoice.2024.02.020

Article (Scientific journals)

2024 • In Journal of Voice

Peer Reviewed verified by ORBi

Permalink
https://hdl.handle.net/20.500.12907/50697

PubMed
38584026

Files

Full Text

1-s2.0-S0892199724000596-main.pdf

Author postprint (366.04 kB)

All documents in ORBi UMONS are protected by a user license.

Details

Keywords :

Artificial intelligence; Bilateral vocal fold paralysis; ChatGPT; Decision-making; Laryngology; Llama; Otorhinolaryngology; Speech and Hearing; LPN and LVN

Abstract :

[en] OBJECTIVES: Artificial intelligence-powered language models, such as Chatbot Generative Pre-trained Transformer (ChatGPT) or Large Language Model Meta AI (Llama), are emerging in medicine. Patients and practitioners have full access to chatbots that may provide medical information. The aim of this study was to explore the performance and accuracy of ChatGPT and Llama in treatment decision-making for bilateral vocal fold paralysis (BVFP).

METHODS: Data from 20 clinical cases, treated between 2018 and 2023, were retrospectively collected from four tertiary laryngology centers in Europe. The cases were defined as the most common or most challenging scenarios regarding BVFP treatment. The treatment proposals were discussed in each center's local multidisciplinary team (MDT). Each case was presented to ChatGPT-4.0 and Llama Chat-2.0, and potential treatment strategies were requested. The Artificial Intelligence Performance Instrument (AIPI) treatment subscore was used to compare both chatbots' performance with the MDT treatment proposal.

RESULTS: The most common etiology of BVFP was thyroid surgery. A form of partial arytenoidectomy with or without posterior transverse cordotomy was the MDT proposal for most cases. The accuracy of both chatbots' treatment proposals was very low, with a maximum AIPI treatment score in 5% of the cases. In most cases, even harmful assertions were made, including the suggestion of vocal fold medialisation to treat patients with stridor and dyspnea. ChatGPT-4.0 performed significantly better in suggesting the correct treatment as part of the treatment proposal (50%) compared to Llama Chat-2.0 (15%).

CONCLUSION: ChatGPT and Llama are judged as inaccurate in proposing the correct treatment for BVFP. ChatGPT significantly outperformed Llama. Treatment decision-making for a complex condition such as BVFP is clearly beyond the chatbots' knowledge and expertise. This study highlights the complexity and heterogeneity of BVFP treatment, and the need for further guidelines dedicated to the management of BVFP.

Disciplines :

Otolaryngology

Author, co-author :

Dronkers, Emilie A C; National Centre for Airway Reconstruction, Imperial College Healthcare NHS Trust, London, UK. Electronic address: emiliedronkers@gmail.com

Geneid, Ahmed; Department of Otolaryngology and Phoniatrics-Head and Neck Surgery, Helsinki University Hospital and University of Helsinki, Helsinki, Finland

Al Yaghchi, Chadwan; National Centre for Airway Reconstruction, Imperial College Healthcare NHS Trust, London, UK

Lechien, Jérome; Université de Mons - UMONS > Faculté de Psychologie et des Sciences de l'Education > Service de Métrologie et Sciences du langage; Université de Mons - UMONS > Faculté de Médecine et de Pharmacie > Service de Chirurgie

Language :

English

Title :

Evaluating the Potential of AI Chatbots in Treatment Decision-making for Acquired Bilateral Vocal Fold Paralysis in Adults.

Publication date :

06 April 2024

Journal title :

Journal of Voice

ISSN :

0892-1997

eISSN :

1873-4588

Publisher :

Elsevier Inc., United States

Peer reviewed :

Peer Reviewed verified by ORBi

Additional URL :

https://api.elsevier.com/content/article/PII:S0892199724000596?httpAccept=text/xml

Research unit :

M120 - Service de Chirurgie

Research institute :

Santé

Available on ORBi UMONS :

since 19 December 2024

Statistics

Number of views

7 (0 by UMONS)

Number of downloads

6 (1 by UMONS)

