In a new study conducted at Beth Israel Deaconess Medical Center (BIDMC), researchers compared the clinical reasoning capabilities of a large language model with those of human physicians using the revised-IDEA (r-IDEA) score. The study gave a GPT-4-powered chatbot, 21 attending physicians, and 18 resident physicians 20 clinical cases to work through. The chatbot achieved the highest r-IDEA scores, indicating strong diagnostic reasoning abilities. However, it also made more errors than the human physicians, highlighting the limits of artificial intelligence as a replacement for human clinical reasoning.
The study's lead author, Stephanie Cabral, M.D., emphasized the need for further research to determine the best ways to integrate large language models (LLMs) into clinical practice. She suggested that LLMs could serve as a checkpoint to help physicians avoid missing important diagnostic factors. The results supported the idea that AI-powered systems, such as chatbots, are better suited to enhancing a physician’s diagnostic abilities than to replacing them.
Physician leaders and technologists explain that practicing medicine requires deep reasoning and clinical intuition, which are difficult to replicate through algorithms alone. However, AI tools like chatbots can provide valuable diagnostic and clinical support to physicians, potentially saving time and improving efficiency in the diagnostic process. Organizations are already beginning to use AI to augment clinical workflows, for example through AI-powered scribing technologies and enterprise search tools that help physicians access and analyze large amounts of patient data.
Although AI systems may not be ready for clinical diagnostics, the technology could enhance clinical workflows, with human oversight to ensure accuracy and safety. Tools that suggest potential diagnoses based on data analysis are emerging in fields like radiology and dermatology. Despite the progress made in incorporating AI into healthcare, much work remains to fully realize its benefits in clinical settings while maintaining the important role of human physicians in the decision-making process.