Too good to be true
Artificial intelligence could make MRI scans faster, a benefit to both patients and clinicians. But a recent study by Berkeley researchers found that developers may unknowingly introduce bias into their AI algorithms through their “off label” use of medical image databases. These free, online databases often contain compressed, preprocessed images rather than the raw scanner data needed to properly train AI algorithms.

The researchers, led by Michael Lustig, professor of electrical engineering and computer sciences, showed how this mix-and-match use of processed and raw files to train algorithms, a practice they dubbed “implicit data crimes,” yields images that are “too good to be true.” In their tests, algorithms pre-trained on processed data produced inaccurate results when handling real-world, raw data. “In some extreme cases, small, clinically important details related to pathology could be completely missing,” said lead author Efrat Shimron, a postdoctoral researcher in Lustig’s lab.

The study’s authors, including Jonathan Tamir of the University of Texas at Austin and Berkeley Ph.D. student Ke Wang, hope to raise awareness of these “data crimes” among researchers developing new AI methods for medical imaging.
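The mechanism is easy to see in a toy experiment. The sketch below is not the study’s code: it is a minimal NumPy illustration in which plain zero-filled reconstruction stands in for a trained network, a low-pass filter stands in for the compression and preprocessing applied to database images, and all sizes and parameters are illustrative assumptions.

```python
# Toy illustration (not the authors' code) of the "too good to be true" effect:
# the same reconstruction scores much better when evaluated on preprocessed
# images than on raw-style data. All parameters are illustrative.
import numpy as np

rng = np.random.default_rng(0)
N = 128

def make_raw_image(n):
    """Synthetic stand-in for a raw magnitude MRI slice: smooth blobs
    plus a flat noise floor, so energy exists at all spatial frequencies."""
    img = np.zeros((n, n))
    ys, xs = np.ogrid[:n, :n]
    for _ in range(12):
        cx, cy = rng.integers(16, n - 16, size=2)
        r = rng.integers(4, 12)
        img[(xs - cx) ** 2 + (ys - cy) ** 2 < r ** 2] += rng.uniform(0.3, 1.0)
    return img + 0.02 * rng.standard_normal((n, n))  # scanner-like noise

def lowpass(img, half=16):
    """Stand-in for database preprocessing: discard high spatial
    frequencies (and most noise), roughly what lossy compression does."""
    k = np.fft.fftshift(np.fft.fft2(img))
    c = img.shape[0] // 2
    keep = np.zeros_like(k)
    keep[c - half:c + half, c - half:c + half] = 1
    return np.real(np.fft.ifft2(np.fft.ifftshift(k * keep)))

def undersampling_mask(n, center_half=12, frac=0.2):
    """Variable-density pattern: fully sample the k-space center and
    keep a random 20% of the periphery (roughly 4-5x acceleration)."""
    m = rng.random((n, n)) < frac
    c = n // 2
    m[c - center_half:c + center_half, c - center_half:c + center_half] = True
    return m

def zero_filled_recon(img, mask):
    """The 'algorithm' under test: inverse FFT of undersampled k-space."""
    k = np.fft.fftshift(np.fft.fft2(img))
    return np.real(np.fft.ifft2(np.fft.ifftshift(k * mask)))

def nrmse(ref, est):
    return np.linalg.norm(ref - est) / np.linalg.norm(ref)

raw = make_raw_image(N)
processed = lowpass(raw)   # what a public database might actually serve
mask = undersampling_mask(N)

err_raw = nrmse(raw, zero_filled_recon(raw, mask))
err_processed = nrmse(processed, zero_filled_recon(processed, mask))

# The preprocessed image has little energy left at the frequencies the
# mask discards, so the identical method scores misleadingly well on it.
print(f"NRMSE on raw-style data:    {err_raw:.3f}")
print(f"NRMSE on preprocessed data: {err_processed:.3f}")
```

On typical runs, the script reports a noticeably lower error on the preprocessed image even though the reconstruction and sampling pattern are identical; that gap is the optimistic bias the authors warn about when processed database images are treated as if they were raw scanner data.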
Learn more: ‘Off label’ use of imaging databases could lead to bias in AI algorithms, study finds