Medicine, like other industries and disciplines, is in the process of adopting AI as a standard tool, and it stands to be immensely useful, provided it is properly regulated, researchers argue. Without meaningful, standardized rules, it will be difficult to quantify benefits or to prevent disasters arising from systemic bias or poor implementation.
AI tools, or to be precise, machine learning models trained to sift through medical data, are popping up in every room of the hospital, from the X-ray suite to the ICU. A well-trained model may spot an anomaly on a lung scan, or detect an arrhythmia in a resting patient, faster or more reliably than a nurse or doctor.
At least, that’s the theory. And while there’s no reason to doubt that AI could be very helpful and even save lives, these models amount to medical treatments and must be documented and tested with special rigor. So say Ravi B. Parikh, Ziad Obermeyer and Amol S. Navathe, of the University of Pennsylvania, UC Berkeley and the Crescenz VA Medical Center in Philadelphia, respectively.
“Regulatory standards for assessing algorithms’ safety and impact have not existed until recently. Furthermore, evaluations of these algorithms, which are not as readily understandable by clinicians as previous algorithms, are not held to traditional clinical trial standards,” they write in an editorial published in the journal Science.
“Unlike a drug or device, algorithms are not static products. Their inputs, often based on thousands of variables, can change with context. And their predictive performance may change over time as the algorithm is exposed to more data.”
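The drift the authors describe can be made concrete with a small sketch. Assuming a deployed classifier and a stream of labeled outcomes (both hypothetical, not from the editorial), one might track a rolling accuracy metric on recent cases and flag when performance degrades:

```python
from collections import deque

def make_drift_monitor(window=100, threshold=0.85):
    """Track a model's rolling accuracy on recent cases and flag drift.

    Illustrative only; real clinical monitoring would rely on proper
    statistical tests and pre-registered endpoints, not a raw cutoff.
    """
    recent = deque(maxlen=window)  # 1 if prediction matched outcome, else 0

    def record(prediction, outcome):
        recent.append(1 if prediction == outcome else 0)
        accuracy = sum(recent) / len(recent)
        # Only raise an alarm once the window has filled with cases.
        drifting = len(recent) == window and accuracy < threshold
        return accuracy, drifting

    return record

# Hypothetical usage: feed in (prediction, observed outcome) pairs
# as labeled results come back from the clinic.
record = make_drift_monitor(window=4, threshold=0.75)
record(1, 1)  # model correct
record(1, 0)  # model wrong
```

The point of the sketch is simply that, unlike a static drug or device, a model's measured performance is a moving target that has to be re-checked continuously.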
Nevertheless, the FDA has partially approved a system called the WAVE Clinical Platform, which monitors patients' vitals for signs of trouble. But if WAVE and others like it are truly to provide ongoing service, they need to be assessed against standards created with AI models in mind.
Naturally, the authors do not propose this without specifics. They list and describe several criteria for evaluating such systems, summarized here:

- Meaningful endpoints
- Appropriate benchmarks
- Interoperability and generalization
- Specific interventions
- Structured auditing
from TechCrunch https://my.onmedic.com/2Erltxl