AI and Biomedicine: Revolutionary – If It Can Achieve Greater Transparency
AI is increasingly being applied to biomedicine and healthcare. It is being used for less-invasive early detection and diagnosis of disease, for long-term disease management, and as a tool to speed up the research and discovery of new drugs.
Some AI models have achieved stunning results, such as the ability to read and identify medical issues from X-rays and other medical imaging. But there have been failures too. Just because a procedure has the seal of AI on it doesn’t mean that the results are infallible. The reproducibility and verification of AI algorithms and of their effectiveness have been a recurring criticism of AI that needs to be addressed.
For example, an AI model widely deployed for detecting sepsis turned out in practice to be accurate only about one-third of the time and to generate large numbers of false positives. This kind of problem is often the result of training a model on data that is very specific to one type of population, such as older white middle-class women, and then applying the algorithm to data from different population groups. Accuracy often drops considerably when the model is run against data that is a mismatch for the population used in the original training.
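To make the mismatch concrete, here is a minimal sketch using entirely synthetic data and a hypothetical single-biomarker classifier (nothing here reflects the actual sepsis algorithm): a model is fit on one simulated cohort, then evaluated on a second cohort whose biomarker baseline differs, and its accuracy collapses while false positives pile up.

```python
# Synthetic illustration of population mismatch. All cohorts, baselines,
# and thresholds are hypothetical assumptions for the sketch.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(42)

def simulate_population(n, baseline):
    """Synthetic cohort: one biomarker whose disease threshold
    shifts with the cohort's baseline level."""
    biomarker = rng.normal(loc=baseline, scale=1.0, size=(n, 1))
    disease = (biomarker[:, 0] > baseline + 0.5).astype(int)
    return biomarker, disease

# Training cohort (one demographic group) vs. a different deployment cohort.
X_train, y_train = simulate_population(5000, baseline=0.0)
X_deploy, y_deploy = simulate_population(5000, baseline=2.0)

model = LogisticRegression().fit(X_train, y_train)

# The learned cutoff suits the training cohort; in the deployment cohort
# most patients sit above it, so the model flags many healthy people.
print("accuracy on matched cohort:    %.2f"
      % accuracy_score(y_train, model.predict(X_train)))
print("accuracy on mismatched cohort: %.2f"
      % accuracy_score(y_deploy, model.predict(X_deploy)))
```

On the matched cohort the classifier scores near-perfectly; on the mismatched cohort its accuracy falls sharply, with most of the errors being false positives – the same failure pattern reported for the deployed sepsis model.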
Stanford researchers call the problem an “AI chasm”: AI models are being released without sufficient information to determine how they were developed or the origin and characteristics of the training data used. The Stanford researchers examined twelve AI models in current commercial use and found that 90 percent of them were satisfactory in the transparency of the information they provided, but most did not fully comply with the complete range of best-practice guidelines.
Lack of documentation, or poor documentation, was cited as a major deficiency of most commercial algorithms today. Chief among the problems found was that models are often deployed into populations substantially different from the population used to train the algorithm, partly because a detailed description of the origin of the training data is often absent. The models also often fail to document whether they carry biases with respect to race, ethnicity, or sex.
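As a sketch of what such documentation might look like, the snippet below lays out a hypothetical model-card-style record covering exactly the fields the researchers found missing: training-data origin, the demographic makeup of the training population, and performance broken down by subgroup. Every field name and value is an illustrative assumption, not a published schema or real model data.

```python
# Hypothetical model-card-style metadata; field names and all
# numbers are illustrative, not a standard or real results.
from dataclasses import dataclass, field

@dataclass
class ModelCard:
    name: str
    intended_use: str
    training_data_origin: str                  # where the training data came from
    training_population: dict = field(default_factory=dict)   # demographic makeup
    subgroup_performance: dict = field(default_factory=dict)  # accuracy by subgroup
    known_limitations: list = field(default_factory=list)

card = ModelCard(
    name="sepsis-risk-v1",                     # hypothetical model name
    intended_use="Early warning of sepsis in adult inpatients",
    training_data_origin="Single academic hospital system, 2015-2018 EHR records",
    training_population={"sex": {"female": 0.71, "male": 0.29}, "median_age": 58},
    subgroup_performance={"female": 0.81, "male": 0.64},      # illustrative numbers
    known_limitations=[
        "Not validated on pediatric patients",
        "Training cohort skews toward one demographic group",
    ],
)
```

Shipping a record like this alongside an algorithm would let a hospital check, before deployment, whether its own patient population resembles the training cohort at all.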