Study Reveals AI Chatbot ChatGPT's Inaccurate Medical Advice on Vitreoretinal Disease
A cross-sectional study reveals that an AI chatbot, ChatGPT, provided inappropriate and potentially harmful medical advice about vitreoretinal disease in response to patient questions. The study was conducted by Peter Y. Zhao, MD, and colleagues from the New England Eye Center at Tufts Medical Center in Boston. Two ophthalmologists evaluated the chatbot's responses to 52 questions about retinal health submitted in late January.
ChatGPT accurately answered only eight of these questions. When the same questions were resubmitted two weeks later, all 52 responses changed, and 26 of them differed substantially from the original answers; accuracy improved in 30.8% of cases but worsened in 19.2%.

Recent advances in AI have sparked debate about how to use the technology effectively while preventing potential harm. In the medical field, clinicians have tested AI chatbots to assess how well they answer healthcare-related questions.
Some chatbots performed well in responding to questions about cardiac care and oncology, giving accurate answers to the majority of questions. However, one chatbot in the cardiac care study provided false information, known in AI terms as "hallucinating," when it wrongly stated that a specific drug was unavailable.
The study authors hypothesize that ChatGPT's high inaccuracy rate was due to the relatively limited online resources available on retinal health compared with more common medical topics. AI-based platforms are known to carry a risk of generating factually inaccurate responses, and in the medical domain such errors can potentially harm patients.
One instance where ChatGPT provided dangerous advice was recommending corticosteroids for the treatment of central serous chorioretinopathy, a condition worsened by corticosteroid use. Another error involved the inclusion of injection therapy and laser therapy as treatments for epiretinal membrane, although it correctly mentioned vitrectomy as an option.
The study used Google's "People Also Ask" feature to compile commonly asked questions about vitreoretinal conditions and procedures. The questions were posed to ChatGPT on January 31, 2023, and resubmitted on February 13 to account for the chatbot's continuous updates.
Moving forward, chatbot developers should pay attention to how clinicians and patients ask questions and ensure that chatbots provide consistent, accurate information across different users. Moreover, comparing chatbots against each other, against other online sources, and against human experts will be crucial for continuous improvement and patient safety.
Reference:
- ChatGPT vs Google for Queries Related to Dementia and Other Cognitive Decline: Comparison of Results - (https://www.jmir.org/2023/1/e48966)
Source: Medindia