Researchers conducted a small study to compare diagnoses made by doctors with those made by AI, specifically ChatGPT. After recruiting 50 doctors for the experiment, the researchers set up three groups: 1) Doctors not using the chatbot; 2) Doctors using the chatbot; 3) The chatbot by itself. The three groups were given six real-world case histories and told to suggest diagnoses and explain why they favored or ruled them out. The case histories had never been published, so neither the doctors nor ChatGPT could have had foreknowledge of them.
Here are the results: Doctors who did not use the chatbot had an average score of 74 percent. Those who did use the chatbot had an average score of 76 percent. The chatbot by itself scored an average of 90 percent, vastly outperforming both groups of doctors!
[The researchers, in a subsequent interview, said "The results were not what we expected....we thought the doctors who had access to the chatbot were going to do way better than the doctors who only had access to the usual internet—UpToDate, PubMed, Google, whatever."]
The doctors using the chatbot often were not persuaded when it pointed out something that was at odds with their diagnoses. Instead, they tended to be wedded to their own ideas of the correct diagnosis. In describing how they came up with a diagnosis, doctors would say, “intuition,” or “based on my experience.”
Researchers also found that few doctors knew how to take advantage of the chatbot’s ability to solve complex diagnostic problems. For example, they treated the chatbot like a search engine, asking questions such as “What are the possible diagnoses for eye pain?” Only a few of the doctors figured out that they could copy and paste the entire case history into the chatbot and ask for a comprehensive diagnosis.
Hey! We could try this at home!
P.S. This post marks the tenth anniversary of my weekly blogs.
For an introduction to this blog, see I Just Say No; for a list of blog topics, click the Topics tab.
What a thought-provoking post to celebrate your 10th, Connie! I congratulate and also appreciate you.
I recently Googled symptoms related to an eye issue and, after reading your blog, used AI to compare results. The chatbot asked me perhaps 12 or 15 questions related to symptoms and general health and then suggested three possible diagnoses. What was different from Google was that the AI assigned each possibility a probability, one being significantly more likely than the others. I will be interested in what my doctor comes up with when I see him next week.
Congratulations on ten years, Connie. Keep the columns coming! Janet
Congratulations! Great work.
Congratulations!!!