AI News

New Study Highlights Challenges for AI in Real-World Medical Conversations

AI in healthcare: addressing communication challenges

Tools like ChatGPT have recently been hailed for their ability to help healthcare workers triage patients, gather medical histories, and suggest diagnoses. Yet despite high scores on standardized medical exams, a recent study shows that these tools run into real challenges when used in practice.

A recently published study, conducted by researchers at Harvard Medical School and Stanford University, exposes the weaknesses of AI in complex interactions between healthcare providers and patients. Highlighted in the January 2 issue of Nature Medicine, the paper introduces a new evaluation framework, CRAFT-MD (Conversational Reasoning Assessment Framework for Testing in Medicine), designed to assess AI tools in scenarios that come as close as possible to real-life doctor-patient consultations.

The researchers used CRAFT-MD to evaluate four large language models on 2,000 clinical cases spanning primary care and a range of medical specialties. The models achieved high accuracy when answering structured, multiple-choice medical questions, but their performance dropped sharply when the same cases were presented as conversations. The result illustrates how AI excels when the interaction follows a well-defined format yet struggles in real life, particularly in open-ended exchanges.

According to Dr. Pranav Rajpurkar, an assistant professor at Harvard Medical School, this points to a major bottleneck in bringing Artificial Intelligence into medical practice. Although these models score well on medical board examinations, they fail to carry that performance into actual patient encounters, which are dynamic and demand real-time clinical judgment. Instead of receiving the full picture up front, the models must piece together small fragments of information over the course of a free-flowing conversation, and their effectiveness drops considerably in such settings compared with multiple-choice formats.

Earlier AI models were evaluated with exam-style questions of the kind used to test physicians and surgeons. This method, however, assumes neatly worded, fully formed case descriptions, whereas real clinical communication is rarely so polished. Shreya Johri, a co-author of the study and a doctoral student at Harvard Medical School, noted that a more stringent approach is needed: real-world interactions are far more complicated, and evaluation tools must capture clinical practice accurately.

CRAFT-MD was developed to close this gap. The framework simulates realistic interactions: one AI agent plays the patient, giving natural, conversational responses, while a second AI agent grades the accuracy of the resulting diagnosis. Human validators then review the exchanges for accuracy, relevance, and adherence to clinical norms. This approach lets researchers measure an AI model's ability to elicit crucial patient information and synthesize scattered details into an accurate diagnosis.
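To make the setup concrete, here is a minimal sketch of what such a two-agent evaluation loop could look like in Python. The function names, the case format, and the exact-match grader are illustrative assumptions rather than the study's actual code; in a real harness each agent would be backed by a large language model, and the transcripts would still be handed to human reviewers.

# Minimal sketch of a CRAFT-MD-style conversational evaluation loop.
# All names (ask_model, patient_reply, grade_diagnosis) are hypothetical
# placeholders; in practice each call would go to a chat-model API.

def ask_model(transcript):
    """Doctor-side model: returns the next question, or a final diagnosis."""
    # Placeholder logic; a real system would prompt an LLM with the transcript.
    if len(transcript) < 6:
        return "QUESTION: Can you tell me more about your symptoms?"
    return "DIAGNOSIS: atopic dermatitis"

def patient_reply(case_vignette, question):
    """Patient agent: answers conversationally using only the hidden vignette."""
    # Placeholder; a real patient agent would paraphrase relevant vignette details.
    return "I've had an itchy rash on my elbows for two weeks."

def grade_diagnosis(predicted, ground_truth):
    """Grader-agent stand-in: simple exact match instead of an LLM judge."""
    return predicted.strip().lower() == ground_truth.strip().lower()

def run_case(case_vignette, ground_truth, max_turns=10):
    transcript = []
    for _ in range(max_turns):
        doctor_turn = ask_model(transcript)
        transcript.append(("doctor", doctor_turn))
        if doctor_turn.startswith("DIAGNOSIS:"):
            predicted = doctor_turn.removeprefix("DIAGNOSIS:")
            return grade_diagnosis(predicted, ground_truth), transcript
        transcript.append(("patient", patient_reply(case_vignette, doctor_turn)))
    return False, transcript  # ran out of turns without committing to a diagnosis

if __name__ == "__main__":
    correct, transcript = run_case(
        case_vignette="Two-week history of pruritic rash on flexural surfaces...",
        ground_truth="atopic dermatitis",
    )
    print("correct diagnosis:", correct)
    # In CRAFT-MD, transcripts like this are additionally reviewed by human
    # validators for accuracy, relevance, and clinical appropriateness.

The key design point this sketch mirrors is that the doctor-side model never sees the full vignette; it only sees what it manages to draw out of the patient agent turn by turn, which is exactly where the study found performance collapsing.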

Based on these results, the study points to several directions for improving AI models in medical settings: developing tools that can analyze free-form discussions, integrating other types of data such as images and test results into the flow of conversation, and enabling AI to register not only what is said but how it is said, including tone of voice and body language. The researchers also propose pairing AI evaluators with human specialists to make evaluations more efficient without sacrificing accuracy.

Speaking to CNN, Dr. Roxana Daneshjou, a co-senior author of the study from Stanford University, emphasized the need to align AI development with existing clinical standards. She stressed that CRAFT-MD reproduces how professionals actually communicate and helps guide the ethical and effective use of intelligent technologies in practice.

The research team plans to keep updating and improving CRAFT-MD to reflect advances in AI technology. Their work underscores the need to close the gap between AI's performance in controlled tests and its use in real healthcare practice, a step that should eventually lead to more effective diagnostic tools.

Source: https://www.news-medical.net/news/20250102/AI-models-struggle-in-real-world-medical-conversations.aspx


ToAI Team
Fueled by a shared fascination with Artificial Intelligence, the Times Of AI journalist team brings together researchers, writers, and analysts to provide comprehensive AI coverage for a broad audience. Through in-depth analysis of the latest advancements, investigation of ethical considerations around AI development, AI governance, machine learning, data science, automation, and cybersecurity, and discussions about the future impact of AI across various sectors, we aim to empower readers with the knowledge they need to navigate this rapidly evolving field.