ChatGPT Health misses urgent medical crises over 50% of the time

According to new research published in Nature Medicine, ChatGPT Health (OpenAI’s dedicated AI-driven chatbot that’s “designed for health and wellness,” which launched earlier this year) repeatedly failed to identify medical emergencies that required immediate medical attention, reports The Guardian.
Lead researcher Dr. Ashwin Ramaswamy, along with his colleagues, created “60 realistic patient scenarios covering health conditions from mild illnesses to emergencies,” which were reviewed by independent doctors based on established clinical guidelines.
In 51.6% of the cases where patients should have been sent to the hospital for emergency care, ChatGPT Health instead advised them to stay home and/or book a regular doctor’s appointment.
While ChatGPT Health performed well enough in clear-cut emergency situations, such as strokes and severe allergic reactions, it didn’t fare so well when symptoms were more complex and weren’t yet emergencies but could become life-threatening very quickly.
“If you’re experiencing respiratory failure or diabetic ketoacidosis, you have a 50/50 chance of this AI telling you it’s not a big deal,” said doctoral researcher Alex Ruani. “Eight times out of 10, [ChatGPT Health] sent a suffocating woman to a future appointment she would not live to see. […] Meanwhile, 64.8% of completely safe individuals were told to seek immediate medical care.”
OpenAI told The Guardian that these results don’t reflect how the service is typically used and that the model is being continuously refined.
