Study Reveals AI Chatbot ChatGPT's Inaccurate Medical Advice on Vitreoretinal Disease
A cross-sectional study reveals that an AI chatbot, ChatGPT, provided inappropriate and potentially harmful medical advice about vitreoretinal disease in response to patient questions. The study was conducted by Peter Y. Zhao, MD, and colleagues from the New England Eye Center at Tufts Medical Center in Boston. Two ophthalmologists evaluated the chatbot's responses to 52 questions about retinal health submitted in late January.
ChatGPT accurately answered only eight of these questions. When the same questions were resubmitted two weeks later, all 52 responses changed, and 26 of them differed substantially from the original answers; accuracy improved in 30.8% of cases but worsened in 19.2%.

Recent advances in AI have sparked debate about how to use the technology effectively while preventing potential harm. In the medical field, clinicians have tested AI chatbots to assess how well they answer healthcare-related questions.
Some chatbots performed well in responding to questions about cardiac care and oncology, giving accurate answers to the majority of questions. However, one chatbot in the cardiac care study provided false information, known in AI terms as "hallucinating," when it wrongly stated that a specific drug was unavailable.
The study authors hypothesize that ChatGPT's high inaccuracy rate was due to the relatively limited online resources available on retinal health compared with more common medical topics. AI-based platforms are known to carry a risk of generating factually inaccurate responses, and in the medical domain such errors can potentially harm patients.
One instance where ChatGPT provided dangerous advice was recommending corticosteroids for the treatment of central serous chorioretinopathy, a condition worsened by corticosteroid use. Another error involved the inclusion of injection therapy and laser therapy as treatments for epiretinal membrane, although it correctly mentioned vitrectomy as an option.
The study used Google's "People Also Ask" feature to compile commonly asked questions about vitreoretinal conditions and procedures. The questions were posed to ChatGPT on January 31, 2023, and resubmitted on February 13 to account for the chatbot's continuous updates.
Moving forward, chatbot developers should pay attention to how clinicians and patients ask questions and ensure that chatbots provide consistent, accurate information across different users. Moreover, comparing chatbots against each other, against other online sources, and against human experts will be crucial for continuous improvement and patient safety.
Reference:
- ChatGPT vs Google for Queries Related to Dementia and Other Cognitive Decline: Comparison of Results - (https://www.jmir.org/2023/1/e48966)
Source: Medindia